Summary

Summary of Descriptive and Inferential Statistics (BIS)

Course
Beschrijvende en inferentiële statistiek (Descriptive and Inferential Statistics, S_PMBIS)

Institution
Vrije Universiteit Amsterdam (VU)

Book
Statistics: The Art and Science of Learning from Data

Based on the premaster course (PMBIS) of February 2023. Also suitable for the bachelor's course, though it may cover somewhat more material than is required there.

Whole book summarized? No
Which parts of the book are summarized? 1, 2, 3, 6, 7, 9, 10, 11, 12
Uploaded on 29 October 2023
Number of pages: 32
Written in 2023/2024
Type: Summary

Book title: Statistics: The Art and Science of Learning from Data

Author(s): Alan Agresti, Christine A. Franklin

Published: January 2017
ISBN: 9781292164779
Edition: 1

Study programme
Communicatiewetenschap (Communication Science)

Seller
maraoltmans1

Chapter 1
1.1 Using data to answer statistical questions
Statistics: the art and science of designing studies and analyzing the data that those studies
produce. Its ultimate goal is translating data into knowledge and understanding of the world
around us. In short: the art and science of learning from data.

Statistical methods help us investigate questions in an objective manner. Statistical problem
solving is an investigative process that involves four components:
1. Formulate a statistical question
2. Collect data
3. Analyze data
4. Interpret results

Reasons for using statistical methods
There are three main components of statistics for answering a statistical question:
1. Design: stating the goal and/or statistical question of interest and planning how to
   obtain data that will address them
2. Description: summarizing and analyzing the data that are obtained
   a. Exploring and summarizing patterns in the data
3. Inference: making decisions and predictions based on the data for answering the
   statistical question
   a. Usually, the decision or prediction refers to a larger group of people
   b. Example of description: reporting the percentages for the sample of voters
   c. Example of inference: predicting the outcome for all voters

Probability: a framework for quantifying how likely various possible outcomes are.
Variable: the characteristic being measured, such as the number of hours per day that you
watch TV

1.2 Sample versus population
Subjects: the entities that we measure in a study, such as people or countries
Population: the set of all subjects in which we are interested
Sample: the subset of the population for which we have data (we don't always have data
from every subject)

Descriptive statistics and inferential statistics
Descriptive statistics: methods for summarizing the collected data (where data constitutes
either a sample or a population). The summaries usually consist of graphs and numbers such
as averages and percentages.
Inferential statistics: methods of making decisions or predictions about a population, based
on data obtained from a sample of that population

In most surveys, we have data for a sample, not for the entire population. We use descriptive
statistics to summarize the sample data and inferential statistics to make predictions about
the population.

An important aspect of statistical inference involves reporting the likely precision of a
prediction. How close is the sample value likely to be to the true percentage of the
population?

Sample statistics and population parameters

Sample statistic: for example, the percentage observed in the sample
Parameter: numerical summary of the population.
Statistic: numerical summary of a sample taken from the population.
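
To make the distinction between a parameter and a statistic concrete, here is a minimal Python sketch. The population of 10,000 voters, the 52% support figure, and the sample size of 500 are all made up for illustration; in a real study the parameter is usually unknown.

```python
import random

# Hypothetical population: 10,000 voters, 1 = supports candidate A, 0 = does not.
random.seed(1)  # fixed seed so the sketch is reproducible
population = [1] * 5200 + [0] * 4800

# Parameter: numerical summary of the whole population (usually unknown in practice).
parameter = sum(population) / len(population)

# Statistic: the same summary computed from a random sample of n subjects.
n = 500
sample = random.sample(population, n)
statistic = sum(sample) / n

print(f"population proportion (parameter): {parameter:.3f}")
print(f"sample proportion (statistic):     {statistic:.3f}")
```

The statistic will typically be close to the parameter, but not equal to it, and it changes from one random sample to the next.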

Randomness and variability
Random sampling: designed to make the sample representative of the population.

Estimation from surveys with random sampling
Margin of error: a measure of the expected variability from one random sample to the next
random sample.

In statistics, we let n denote the number of subjects in the sample.
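
As a rough illustration of how the margin of error depends on n, the sketch below uses the common rule of thumb 1/sqrt(n) for a sample proportion (approximately a 95% margin). This formula is an assumption added here, not something stated in the summary above, and the function name is hypothetical.

```python
import math

def approx_margin_of_error(n: int) -> float:
    """Rough margin of error (as a proportion) for a sample proportion: 1 / sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be positive")
    return 1 / math.sqrt(n)

# Quadrupling the sample size halves the approximate margin of error.
for n in (100, 400, 1600):
    print(n, f"{approx_margin_of_error(n):.1%}")
```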

Testing and statistical significance
In a randomized experiment, the variation that could be expected to occur by chance alone
is roughly like the margin of error with simple random sampling. The difference expected
due to ordinary variation is smaller with larger samples. When the difference between the
results for the two treatments is so large that it would be rare to see such a difference by
ordinary random variation, we say that the results are statistically significant.
In short: the larger the sample, the smaller the chance variation, so a larger sample is better.
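
The claim that chance differences shrink with larger samples can be checked with a small simulation. This is only a sketch under assumed settings: two groups with the same true success probability of 0.5 (so the treatment has no effect), 500 repetitions, and group sizes of 25 and 400. The function name is hypothetical.

```python
import random
import statistics

def chance_differences(n_per_group: int, reps: int, rng: random.Random) -> list[float]:
    """Differences in success proportion between two groups when the treatment has no effect."""
    diffs = []
    for _ in range(reps):
        # Both groups share the same true success probability (0.5),
        # so any observed difference is due to chance alone.
        a = sum(rng.random() < 0.5 for _ in range(n_per_group)) / n_per_group
        b = sum(rng.random() < 0.5 for _ in range(n_per_group)) / n_per_group
        diffs.append(a - b)
    return diffs

rng = random.Random(42)  # fixed seed for reproducibility
spread_small = statistics.pstdev(chance_differences(25, 500, rng))
spread_large = statistics.pstdev(chance_differences(400, 500, rng))
print(f"typical chance difference, n=25 per group:  {spread_small:.3f}")
print(f"typical chance difference, n=400 per group: {spread_large:.3f}")
```

With the larger groups, the differences produced by chance alone are much smaller, so an observed difference of a given size is harder to explain by ordinary random variation.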

1.3 Using calculators and computers

Chapter 2: Exploring Data with Graphs and Numerical Summaries
2.1 Different types of data
Variable: any characteristic observed in a study
Observations: the data values that we observe for a variable
• Numbers: a variable is quantitative if its observations take numerical values that
  represent different magnitudes of the variable
• Categories: a variable is categorical if each observation belongs to one of a set
  of distinct categories

Quantitative variables
• Key features: the center and the variability (spread) of the data

Categorical variables
• Key feature: the relative number of observations in the various categories

Quantitative variables are discrete or continuous
• Discrete: its possible values form a set of separate numbers (number of
  pets in a household)
• Continuous: its possible values form an interval (height, weight)

Distribution of a variable
The first step in analyzing data collected on a variable is to look at the observed values by
using graphs and numerical summaries.
• Distribution: describes how the observations fall (are distributed) across the range of
  possible values.

Features to look for in the distribution of a categorical variable:
• The category with the largest frequency (modal category)
• How frequently each category was observed

Features to look for in the distribution of a quantitative variable:
• Shape (do observations cluster in certain intervals?)
• Center (where does a typical observation fall?)
• Variability (how tightly are the observations clustering around a center?)

Frequency table
Frequency table: a listing of possible values for a variable, together with the number of
observations for each value.

Proportion: the number of observations in that category divided by the total number of
observations.
Percentage: the proportion multiplied by 100.

These are also called relative frequencies.
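
The definitions of frequency, proportion, and percentage can be sketched in a few lines of Python. The fruit data are made up for illustration.

```python
from collections import Counter

# Hypothetical categorical observations (favourite fruit of 10 respondents).
observations = ["apple", "banana", "apple", "cherry", "apple",
                "banana", "cherry", "apple", "banana", "apple"]

counts = Counter(observations)   # frequency of each category
n = len(observations)            # total number of observations

print(f"{'category':<10}{'frequency':>10}{'proportion':>12}{'percentage':>12}")
for category, frequency in counts.most_common():
    proportion = frequency / n   # relative frequency
    print(f"{category:<10}{frequency:>10}{proportion:>12.2f}{proportion * 100:>11.0f}%")

# The modal category is the one with the largest frequency.
modal_category = counts.most_common(1)[0][0]
print("modal category:", modal_category)
```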

To show the distribution of a discrete quantitative variable, we list the distinct values
and the frequency with which each one occurs.

For a continuous quantitative variable, we divide the numerical scale into intervals and count
the number of observations falling in each interval.
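
For a continuous variable, counting observations per interval might look like the sketch below. The heights, the interval width of 5 cm, and the starting edge of 155 cm are arbitrary choices made for illustration.

```python
# Hypothetical heights in cm (a continuous variable).
heights = [158.2, 162.5, 164.0, 167.8, 169.1, 170.4, 171.9, 174.3, 176.0, 181.7]

width = 5     # interval width in cm (arbitrary choice for this sketch)
start = 155   # lower edge of the first interval

freq = {}
for h in heights:
    # Lower edge of the interval [lower, lower + width) that contains h.
    lower = start + int((h - start) // width) * width
    freq[lower] = freq.get(lower, 0) + 1

for lower in sorted(freq):
    print(f"[{lower}, {lower + width}): {freq[lower]}")
```

These interval counts are exactly what a histogram displays as bar heights.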

2.2 Graphical summaries of data
Graphs for categorical variables
• Pie chart
• Bar graph
  o A bar graph with categories ordered by their frequency: Pareto chart
    ▪ Pareto principle: a small subset of categories often contains most of
      the observations
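
A text-only sketch of the Pareto ordering: categories are sorted from most to least frequent, and a running cumulative percentage shows how quickly a few categories account for most observations. The complaint data are hypothetical.

```python
from collections import Counter

# Hypothetical complaint categories recorded by a help desk.
complaints = ["delivery"] * 12 + ["billing"] * 5 + ["packaging"] * 2 + ["other"] * 1

counts = Counter(complaints)
total = sum(counts.values())

cumulative = 0
pareto_rows = []
for category, frequency in counts.most_common():   # most frequent category first
    cumulative += frequency
    pareto_rows.append((category, frequency, round(100 * cumulative / total)))
    print(f"{category:<10} {'#' * frequency:<12} cumulative {100 * cumulative / total:.0f}%")
```

In this made-up data set, the two largest categories already cover 85% of the observations, which is the pattern the Pareto principle describes.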

Graphs for quantitative variables
• Dot plot: shows a dot for each observation
• Stem-and-leaf plot: each observation is represented by a stem and a leaf. The stem
  consists of all the digits except the final one, which is the leaf (see the example on p. 64).
• Histogram: a graph that uses bars to portray the frequencies or the relative
  frequencies of the possible outcomes for a quantitative variable (note: for
  quantitative variables the graph is called a histogram; for categorical variables it is
  a bar graph)
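
The stem-and-leaf construction described in the list above can be sketched as follows. The exam scores are hypothetical, the function name is made up, and the sketch assumes non-negative whole numbers so that the last digit is the leaf.

```python
from collections import defaultdict

def stem_and_leaf(values: list[int]) -> dict[int, list[int]]:
    """Map each stem (all digits but the last) to its sorted leaves (the last digit)."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)   # e.g. 73 -> stem 7, leaf 3
        plot[stem].append(leaf)
    return dict(plot)

# Hypothetical exam scores (non-negative whole numbers).
scores = [52, 67, 71, 58, 73, 79, 64, 85, 71, 90]
plot = stem_and_leaf(scores)

# Print every stem in the range, including stems with no leaves, so gaps stay visible.
for stem in range(min(plot), max(plot) + 1):
    leaves = "".join(str(leaf) for leaf in plot.get(stem, []))
    print(f"{stem:>2} | {leaves}")
```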

The shape of a distribution
• Unimodal: a single mound or peak
  o Most often the peak is the mode
• Bimodal: two distinct mounds
• Shape: symmetric or skewed (to the right or to the left)
• Gap: is there a gap, with one or more observations noticeably deviating from the rest?
• Tails: the parts of the curve for the lowest and for the highest values.

Time plots: displaying data over time
Time series: a data set collected over time
Time plot: a way to display a time series graphically. It charts each observation on the
vertical scale against the time at which it was measured.
Trend: tendency of the data to rise or fall

2.3 Measuring the center of quantitative data
Describing the center: the mean and median
Mean: the sum of the observations divided by the number of observations
Median: the middle value of the observations when the observations are ordered from the
smallest to the largest (or from the largest to the smallest)
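
A minimal sketch of both definitions, using made-up data. For an even number of observations it uses the usual convention of averaging the two middle values, which the definition above does not spell out.

```python
import statistics

# Hypothetical data: hours of TV watched per day by 6 people.
hours = [1, 2, 2, 3, 4, 6]

mean = sum(hours) / len(hours)   # sum of observations / number of observations

ordered = sorted(hours)
n = len(ordered)
mid = n // 2
# Odd n: the middle value; even n: the average of the two middle values.
median = ordered[mid] if n % 2 == 1 else (ordered[mid - 1] + ordered[mid]) / 2

print("mean:", mean, "median:", median)

# The standard library gives the same results.
print(statistics.mean(hours), statistics.median(hours))
```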

Sample size: n

Basic properties of the mean:
• It is the balance point of the data
• It usually does not equal any value that was observed in the sample
• For a skewed distribution, the mean is pulled in the direction of the longer tail
• The mean can be highly influenced by an outlier
  o Outlier: an observation that falls well above or well below the overall bulk of
    the data

Comparing the mean and median
If the shape is:
• Symmetric, the mean equals the median
• Skewed to the left, the mean is smaller than the median
• Skewed to the right, the mean is larger than the median.

The median is resistant to the effect of extreme observations: a numerical summary of the
observations is called resistant if extreme observations have little, if any, influence on its value.
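
The sketch below illustrates both points with made-up incomes: a single large outlier makes the data right-skewed, pulls the mean well above the median, and barely moves the median.

```python
import statistics

# Hypothetical yearly incomes in thousands of euros; the last value is an outlier.
incomes = [22, 25, 27, 30, 31, 34, 250]

with_outlier = (statistics.mean(incomes), statistics.median(incomes))
without_outlier = (statistics.mean(incomes[:-1]), statistics.median(incomes[:-1]))

print(f"with outlier:    mean={with_outlier[0]:.1f}, median={with_outlier[1]}")
print(f"without outlier: mean={without_outlier[0]:.1f}, median={without_outlier[1]}")
```

Removing the outlier changes the mean by roughly 32 (thousand euros) but the median by only 1.5, which is what resistance means in practice.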

The mode
The mode: the value that occurs most frequently. For continuous observations, it is usually
not meaningful to look for a mode because there can be multiple modes or no mode at all.
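
A short sketch of why the mode suits discrete data better than continuous data. Both data sets are hypothetical.

```python
import statistics

# Hypothetical discrete data: one value clearly occurs most often.
shoe_sizes = [38, 39, 39, 40, 41, 39, 42]
size_mode = statistics.mode(shoe_sizes)
print("mode of shoe sizes:", size_mode)

# Hypothetical continuous data: every measured value is distinct,
# so each value ties for "most frequent" and the mode tells us little.
reaction_times = [0.512, 0.498, 0.534, 0.507]
time_modes = statistics.multimode(reaction_times)
print("modes of reaction times:", time_modes)
```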

2.4 Measuring the variability of quantitative data
Measuring variability: the range
Range: the difference between the largest and the smallest observations
The range is not a resistant statistic. It shares the worst property of the mean, not being
resistant, and the worst property of the median, ignoring the numerical values of nearly all
the data.
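
A small sketch with made-up scores shows why the range is not resistant: it depends only on the two most extreme observations, so one outlier changes it drastically.

```python
# Hypothetical test scores.
scores = [61, 64, 65, 67, 70]
data_range = max(scores) - min(scores)

# Adding a single extreme observation changes the range completely.
scores_with_outlier = scores + [120]
range_with_outlier = max(scores_with_outlier) - min(scores_with_outlier)

print("range:", data_range)
print("range with outlier:", range_with_outlier)
```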

Measuring variability: the standard deviation
A much better numerical summary of variability uses all the data, and it describes a typical
distance of how far the data fall from the mean. It does this by summarizing deviations from
the mean.
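
The text breaks off before giving a formula, so the sketch below assumes the usual sample standard deviation: the square root of the sum of squared deviations from the mean, divided by n - 1. The data are hypothetical.

```python
import math
import statistics

data = [2, 4, 4, 5, 7, 8]   # hypothetical observations
n = len(data)
mean = sum(data) / n

# Deviation of each observation from the mean; these always sum to zero.
deviations = [x - mean for x in data]

# Assumed formula: sample variance divides the sum of squared deviations by n - 1.
variance = sum(d ** 2 for d in deviations) / (n - 1)
s = math.sqrt(variance)

print(f"mean={mean}, variance={variance:.2f}, standard deviation={s:.3f}")
print("statistics.stdev gives:", round(statistics.stdev(data), 3))
```

Squaring the deviations is what keeps positive and negative deviations from cancelling out.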
