SPSS SUMMARY STATISTICAL MODELLING
CHAPTER 1
INFERENTIAL STATISTICS
Inferential statistics offers techniques for making statements about a larger set of observations
from data collected for a smaller set of observations
The large set of observations about which we want to make a statement is the population
The smaller set is called a sample
Sample statistic: a value (number) describing a characteristic of a sample
e.g. the number of yellow candies in a bag
Sampling space: all possible sample statistic values
the collection of all possible outcomes
e.g. the numbers 0 to 10 are the sampling space of the sample statistic "number of yellow candies"
in a bag of 10 candies
Random variable: a variable with values that depend on chance
The sample statistic is called a random variable because different samples can have different
scores: the value of the statistic may vary from sample to sample
Sampling distribution: all possible sample statistic values and their probabilities
the distribution of the outcome scores of very many samples
Unbiased estimator: a sample statistic for which the expected value equals the population statistic
(parameter)
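The definitions above can be illustrated with a small simulation. This is a minimal sketch, not part of the course material: it assumes a bag holds 10 candies and that 20% of all candies in the population are yellow (both `bag_size` and `p_yellow` are illustrative assumptions). It draws many bags, records the sample statistic for each, and uses the relative frequencies as an approximation of the sampling distribution.

```python
import random
from collections import Counter

def simulate_sampling_distribution(p_yellow=0.2, bag_size=10, n_samples=10000, seed=1):
    """Draw many bags (samples) and record the number of yellow candies in each."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_samples):
        # Sample statistic for one bag: how many of its candies are yellow
        yellow = sum(rng.random() < p_yellow for _ in range(bag_size))
        counts[yellow] += 1
    # Relative frequencies approximate the probabilities of the sampling distribution
    return {k: counts[k] / n_samples for k in range(bag_size + 1)}

dist = simulate_sampling_distribution()
# Expected value of the sampling distribution (mean of the sample statistic)
expected_value = sum(k * p for k, p in dist.items())
print(dist)
print(expected_value)  # should lie close to bag_size * p_yellow
```

The keys of `dist` are the sampling space (0 to 10). The expected value lying close to the assumed population value (2 yellow candies per bag) is what makes this statistic an unbiased estimator.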
CHAPTER 2
BOOTSTRAPPING
In bootstrapping, we only draw one sample from the population for which we collect data
As a next step, we draw a large number of samples from our initial sample
the samples drawn in the second step are called bootstrap samples
For each bootstrap sample, we calculate the sample statistic of interest, and we collect these
as our sampling distribution
Bootstrap samples are statistically sampled with replacement from the original sample, so one
bootstrap sample may differ from another
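The two bootstrap steps can be sketched in a few lines. This is an illustrative sketch only: the candy weights in `initial_sample` are made-up values, and the helper name `bootstrap_distribution` is hypothetical. It uses the sample median as the statistic of interest.

```python
import random
import statistics

def bootstrap_distribution(sample, statistic, n_bootstrap=5000, seed=1):
    """Resample the initial sample with replacement and collect the statistic."""
    rng = random.Random(seed)
    n = len(sample)
    results = []
    for _ in range(n_bootstrap):
        # Each bootstrap sample has the same size as the initial sample;
        # sampling with replacement lets one observation appear several times
        resample = rng.choices(sample, k=n)
        results.append(statistic(resample))
    return results

# Hypothetical candy weights in grams (the single initial sample)
initial_sample = [2.6, 2.9, 3.1, 2.4, 2.8, 3.0, 2.7, 3.3, 2.5, 2.9]
boot_medians = bootstrap_distribution(initial_sample, statistics.median)
print(len(boot_medians), min(boot_medians), max(boot_medians))
```

The collected values in `boot_medians` play the role of the sampling distribution of the median.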
BOOTSTRAPPING IN SPSS
Analyze > Compare Means > Independent-Samples T Test >
Put dependent variable in Test Variable box
Put independent variable in Grouping Variable box > Define Groups
Select Bootstrap > check Perform bootstrapping > number of samples: usually 5000 > select Bias corrected accelerated (BCa)
Interpreting results:
Independent samples test
If p is above 0.05, Levene’s test on homogeneity of variances is not significant: we may assume that the
population variances of the 2 groups are equal, so interpret the top row of the bootstrap table
Bootstrap for Independent Samples Test
Check Mean Difference: a value close to zero means a very small difference between the groups
Check confidence interval: we can say that red candies can be on average 0.11 grams lighter than
yellow candies or up to 0.21 grams heavier. We cannot tell which of the 2 is heavier in the
population with sufficient confidence, because zero lies within the confidence interval
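The logic of reading the bootstrap confidence interval can be sketched as follows. Two caveats: the weights below are invented for illustration, and the sketch uses the simple percentile method rather than the bias corrected accelerated (BCa) interval selected in SPSS, so it will not reproduce SPSS output exactly.

```python
import random
import statistics

def bootstrap_mean_difference_ci(group_a, group_b, n_bootstrap=5000, level=0.95, seed=1):
    """Percentile bootstrap confidence interval for mean(group_a) - mean(group_b)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_bootstrap):
        # Resample each group separately, with replacement
        resample_a = rng.choices(group_a, k=len(group_a))
        resample_b = rng.choices(group_b, k=len(group_b))
        diffs.append(statistics.mean(resample_a) - statistics.mean(resample_b))
    diffs.sort()
    alpha = 1 - level
    # Cut off alpha/2 of the bootstrap differences in each tail
    lower = diffs[round(alpha / 2 * n_bootstrap)]
    upper = diffs[round((1 - alpha / 2) * n_bootstrap) - 1]
    return lower, upper

# Hypothetical candy weights in grams
red = [2.7, 2.9, 3.0, 2.6, 3.1, 2.8, 2.9, 3.2]
yellow = [2.8, 2.7, 3.1, 2.9, 2.6, 3.0, 2.8, 3.0]
lower, upper = bootstrap_mean_difference_ci(red, yellow)
contains_zero = lower <= 0 <= upper
print(lower, upper, contains_zero)
```

If `contains_zero` is true, a difference of zero is a plausible population value, so we cannot say which group is heavier.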
LIMITATION OF BOOTSTRAPPING
For the bootstrapped sampling distribution to resemble the true sampling distribution, we must draw
large samples
Besides being large, the sample must also be reasonably representative of the population
But we can never know whether our sample is representative of the population, and this is the
biggest limitation of the bootstrap approach.
Bootstrapping is the only method we have to retrieve a sampling distribution for the sample
median
EXACT APPROACH
The exact approach lists and counts all possible combinations, which can only be done with
categorical or discrete variables
We can use the exact approach on categorical variables because they have a LIMITED
NUMBER of values; for numerical (continuous) variables it is impossible to list all possible outcomes
However, if there are many categories, this approach can take a lot of computing power.
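For a discrete statistic such as the number of yellow candies in a bag, listing all outcomes is feasible. The sketch below is illustrative, again assuming a bag of 10 candies and a 20% share of yellow candies in the population (both are assumptions, not values from the course). It counts the combinations for every possible outcome and turns them into exact probabilities.

```python
from math import comb

def exact_sampling_distribution(p_yellow=0.2, bag_size=10):
    """Exact probability of every possible number of yellow candies in a bag."""
    return {
        # comb(bag_size, k) counts the ways to place k yellow candies in the bag
        k: comb(bag_size, k) * p_yellow**k * (1 - p_yellow) ** (bag_size - k)
        for k in range(bag_size + 1)
    }

exact_dist = exact_sampling_distribution()
for outcome, probability in exact_dist.items():
    print(outcome, round(probability, 4))
```

With 11 possible outcomes this is instant. The number of combinations to count grows quickly with more categories and larger samples, which is why a time limit is set for exact tests.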
FISHER'S EXACT TEST
Analyze > Descriptive Statistics > Crosstabs >
Click on Exact > select Exact > set a time limit (5 minutes) > press Continue
Click on Statistics > select Chi-square and Phi and Cramer’s V
Click on Cells > under Percentages select Column
Interpreting results:
Crosstabulation
If you look at the percentages in the contingency table, you see that
yellow and red candies are often less sticky than blue, green and orange
candies
Chi-square tests
Look at the p-value of Fisher's exact test: it is below 0.05, so the test is
statistically significant
Symmetric measures
There is a strong association (Cramer’s V = .52) between candy color and stickiness, and this
association is statistically significant
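Cramer's V is derived from the chi-square statistic of the contingency table. The sketch below shows the calculation. The counts in `table` are invented for illustration and are not the data behind the value of .52 reported above; the function also assumes that no row or column of the table is empty.

```python
from math import sqrt

def cramers_v(table):
    """Cramer's V for a contingency table given as a list of rows with counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    smallest_dimension = min(len(table), len(table[0]))
    return sqrt(chi2 / (n * (smallest_dimension - 1)))

# Hypothetical counts: rows = not sticky / sticky,
# columns = blue, green, orange, red, yellow
table = [
    [3, 4, 3, 8, 9],
    [7, 6, 7, 2, 1],
]
print(cramers_v(table))
```

V runs from 0 (no association) to 1 (perfect association), which is why it can be read as an effect size next to the p-value.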
Bootstrapping and exact tests can be used if the conditions for
theoretical probability distributions have not been met
CHAPTER 3
Our best guess for the population value is the sample statistic
this type of guess is referred to as a point estimate
Although it is our best guess, it is likely that this guess is wrong
For this reason, it is better not to rely on a point estimate alone but to estimate a range of
values the population value may fall into
CONFIDENCE INTERVAL
The confidence interval is the upper and lower bounds for plausible population means
This is how you phrase the confidence interval: “we are 95% confident that average
candy weight in the population is between 2.4 and 3.2 grams”
A confidence interval is likely to contain the true population mean
Usually, we automatically get the confidence interval, but for some tests (F tests) we don’t
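How the lower and upper bounds come about can be sketched as follows. This is a simplified illustration: it uses the normal approximation (critical value of about 1.96 for 95%), whereas SPSS typically uses the t distribution for means, and the weights in `candy_weights` are made up.

```python
import statistics
from math import sqrt

def mean_confidence_interval(sample, level=0.95):
    """Approximate confidence interval for the population mean (normal approximation)."""
    n = len(sample)
    mean = statistics.mean(sample)
    # Standard error: estimated standard deviation of the sampling distribution
    standard_error = statistics.stdev(sample) / sqrt(n)
    # Critical value, about 1.96 for a 95% confidence level
    z = statistics.NormalDist().inv_cdf(1 - (1 - level) / 2)
    return mean - z * standard_error, mean + z * standard_error

# Hypothetical candy weights in grams
candy_weights = [2.6, 2.9, 3.1, 2.4, 2.8, 3.0, 2.7, 3.3, 2.5, 2.9, 3.2, 2.8]
ci_95 = mean_confidence_interval(candy_weights, level=0.95)
ci_99 = mean_confidence_interval(candy_weights, level=0.99)
print(ci_95)
print(ci_99)
```

Comparing `ci_95` with `ci_99` shows the trade-off behind the confidence level: a higher level gives a wider, less precise interval.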
SETTING THE CONFIDENCE LEVEL
Analyze > Compare Means > One-way ANOVA
Post Hoc > Select Bonferroni
For correlations, open the Bootstrap dialog > the confidence level can be adjusted there
The wider our interval in the sampling distribution, the less precise our estimate is
There are several methods we can use to increase the precision of our estimate
Simplest way: decrease our confidence level, which narrows the interval, but this usually does
not work because we then become less confident that the interval contains the population value,
so we usually keep our confidence level at 95%
Another way: increase the sample size; larger samples lead to greater precision because with
more observations it is less likely that we sample only relatively high or low scores, which
leads to a more peaked sampling distribution
The standard error of our sampling distribution informs us of the precision of our interval estimate
The standard error is the sampling distribution’s standard deviation (SD = the average
distance of an observation from the mean)
The smaller the standard error, the more likely the sample statistic resembles the population value
To decrease the standard error, you should increase the sample size: this gives a more peaked
sampling distribution, which makes the standard error smaller
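For the sample mean, the relation between sample size and standard error is the standard formula SE = SD / sqrt(n). The sketch below applies it to a few sample sizes; the standard deviation of 0.5 grams is an assumed value chosen only for illustration.

```python
from math import sqrt

def standard_error(sd, n):
    """Standard error of the sample mean for standard deviation sd and sample size n."""
    return sd / sqrt(n)

assumed_sd = 0.5  # hypothetical standard deviation of candy weight in grams
for n in (25, 100, 400):
    print(n, standard_error(assumed_sd, n))
```

Because of the square root, the sample has to become four times as large to halve the standard error.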
CHAPTER 4
To conduct a statistical test for our hypothesis, we need 4 elements:
- A statement about a population
- A sample from the population
- A criterion to judge whether the statement is plausible
- A probability for the statement
BINOMIAL TEST
If your hypothesis addresses the share of a category in a population, you want to use a proportion as
your test statistic
for instance, if your hypothesis is that a television station reaches half of the people, you
want to test a proportion value, namely .50. In this case the null hypothesis is: the proportion
of people in the population that the television station reaches is .50.
We want to use a binomial test for this hypothesis
BINOMIAL TEST SINGLE PROPORTION
Analyze > Nonparametric Tests > Legacy Dialogs > Binomial
Select variable > Cut Point: enter the lowest category or, if a variable contains more than two
values (household income), enter the highest value of the lowest category
Test Proportion: enter the hypothesized proportion for the group defined by the cut point
Interpreting results:
Binomial Test
Group 1 contains 58 of the 120 households, which is 48%
The hypothesis: 35% of all households cannot receive the tv station
The P value is 0.002 so we reject the null hypothesis
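The idea behind the binomial test can be sketched with the numbers from this example (58 out of 120 households, test proportion .35). This is an illustrative sketch of a one-sided exact calculation: it adds up the probabilities of all outcomes at least as extreme as the observed one if the null hypothesis were true. How SPSS arrives at its reported p-value (one-sided or two-sided, exact or approximated) is not confirmed here, so the result need not match the output exactly.

```python
from math import comb

def binomial_upper_tail(successes, n, p0):
    """P(X >= successes) if the population proportion equals p0 (one-sided)."""
    return sum(
        comb(n, k) * p0**k * (1 - p0) ** (n - k)
        for k in range(successes, n + 1)
    )

observed_proportion = 58 / 120
p_value = binomial_upper_tail(58, 120, 0.35)
print(round(observed_proportion, 2), p_value)
```

A p-value below .05 means that an observed share this far above 35% would be unlikely if the null hypothesis were true, so the null hypothesis is rejected.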