SPSS SUMMARY STATISTICAL MODELLING
CHAPTER 1
INFERENTIAL STATISTICS
Inferential statistics offers techniques for making statements about a larger set of observations
from data collected for a smaller set of observations
The large set of observations about which we want to make a statement is the population
The smaller set is called a sample
Sample statistic: a value (number) describing a characteristic of a sample
Example: the number of yellow candies in a bag
Sampling space: all possible sample statistic values
the collection of all possible outcomes
Example: the numbers 0 to 10 are the sampling space of the sample statistic "number of yellow candies" in a bag of ten candies
Random variable: a variable with values that depend on chance
The sample statistic is called a random variable because different samples can have different
scores: the value of the variable may vary from sample to sample
Sampling distribution: all possible sample statistic values and their probabilities
the distribution of the outcome scores of very many samples
Unbiased estimator: a sample statistic for which the expected value equals the population statistic
(parameter)
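The definitions above can be illustrated with a small simulation. This is only a sketch: the bag size of 10 follows from the sampling space 0 to 10, but the share of yellow candies in the population (0.2) and the function name are assumptions made for illustration.

```python
import random
from collections import Counter

def simulate_sampling_distribution(n_samples, bag_size=10, p_yellow=0.2, seed=1):
    """Draw many bags and return the relative frequency of each yellow count."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_samples):
        # One sample: each candy is yellow with probability p_yellow (assumed).
        yellow = sum(rng.random() < p_yellow for _ in range(bag_size))
        counts[yellow] += 1
    # One probability for every value in the sampling space 0..bag_size.
    return {k: counts[k] / n_samples for k in range(bag_size + 1)}

dist = simulate_sampling_distribution(10_000)
# Mean of the sampling distribution = expected value of the sample statistic.
expected_value = sum(k * p for k, p in dist.items())
print(dist)
print(expected_value)
```

With many samples, the expected value should come close to bag_size * p_yellow, which is what we would expect from an unbiased estimator.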
CHAPTER 2
BOOTSTRAPPING
In bootstrapping, we only draw one sample from the population for which we collect data
As a next step, we draw a large number of samples from our initial sample
the samples drawn in the second step are called bootstrap samples
For each bootstrap sample, we calculate the sample statistic of interest, and we collect these
as our sampling distribution
Bootstrap samples are statistically sampled with replacement from the original sample, so one
bootstrap sample may differ from another
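A minimal sketch of these steps in Python, outside SPSS. The candy weights are hypothetical, and the helper name `bootstrap_statistics` is an assumption for illustration, not an SPSS function.

```python
import random
import statistics

def bootstrap_statistics(sample, statistic, n_bootstrap=5000, seed=1):
    """Return the statistic computed on each of n_bootstrap bootstrap samples."""
    rng = random.Random(seed)
    n = len(sample)
    # rng.choices draws WITH replacement, so a bootstrap sample can contain
    # the same observation several times and miss others.
    return [statistic(rng.choices(sample, k=n)) for _ in range(n_bootstrap)]

# Hypothetical initial sample: candy weights in grams.
weights = [2.6, 2.9, 3.1, 2.4, 3.0, 2.8, 3.3, 2.7, 2.5, 3.2]

# The collected medians approximate the sampling distribution of the median.
boot_medians = bootstrap_statistics(weights, statistics.median)
print(min(boot_medians), max(boot_medians))
```

Each bootstrap sample has the same size as the initial sample; only the composition differs.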
BOOTSTRAPPING
Analyze > Compare Means > Independent-Samples T Test
Put dependent variable in Test Variable box
Put independent variable in Grouping Variable box > Define Groups
Click Bootstrap > check Perform bootstrapping > set Number of samples (usually 5000) > select Bias corrected accelerated (BCa)
Interpreting results:
Independent Samples Test
If p is above .05, Levene’s test on homogeneity of variances is not significant, so we may assume that the
population variances of the 2 groups are equal and interpret the top row of the bootstrap table
Bootstrap for Independent Samples Test
Check Mean Difference: a value close to zero means the group means differ very little
Check the confidence interval: we can say that red candies can be on average 0.11 grams lighter than
yellow candies or up to 0.21 grams heavier. We cannot tell which of the 2 is heavier in the population
with sufficient confidence, because zero lies inside the confidence interval
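The logic of "zero inside the confidence interval" can be sketched as follows. The weights are hypothetical, and the sketch uses the simple percentile interval, not the bias corrected accelerated (BCa) interval selected in SPSS, so it will not reproduce SPSS output exactly.

```python
import random
import statistics

def bootstrap_mean_difference_ci(group_a, group_b, n_bootstrap=5000,
                                 level=0.95, seed=1):
    """Percentile bootstrap confidence interval for mean(a) - mean(b)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_bootstrap):
        # Resample each group separately, with replacement.
        a = rng.choices(group_a, k=len(group_a))
        b = rng.choices(group_b, k=len(group_b))
        diffs.append(statistics.mean(a) - statistics.mean(b))
    diffs.sort()
    alpha = 1 - level
    lower = diffs[int((alpha / 2) * n_bootstrap)]
    upper = diffs[int((1 - alpha / 2) * n_bootstrap) - 1]
    return lower, upper

# Hypothetical candy weights in grams.
red = [2.8, 2.9, 2.7, 3.0, 2.6, 2.9, 3.1, 2.8]
yellow = [2.7, 3.0, 2.8, 2.9, 2.6, 3.1, 2.8, 3.0]

lower, upper = bootstrap_mean_difference_ci(red, yellow)
# If zero lies inside the interval, we cannot say which group is heavier.
contains_zero = lower <= 0 <= upper
print(lower, upper, contains_zero)
```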
LIMITATION OF BOOTSTRAPPING
For the bootstrapped sampling distribution to resemble the true sampling distribution, the original
sample must be large
Besides being large, the sample must also be nearly representative of the population
But we can never know whether our sample is representative of the population, and this is the
biggest limitation of the bootstrap approach.
Bootstrapping is the only method we have to retrieve a sampling distribution for the sample
median
EXACT APPROACH
The exact approach lists and counts all possible combinations, which can only be done with
categorical or discrete variables
We can use the exact approach on categorical variables because they have a LIMITED
NUMBER of values; for numerical variables it is impossible to list all possible outcomes
However, if there are many categories, this approach can take a lot of computing power.
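As a small illustration of listing and counting all outcomes, the sketch below computes the exact sampling distribution of the number of yellow candies in a bag of ten. The share of yellow candies in the population (0.2) is an assumed value, and the function name is hypothetical.

```python
from math import comb

def exact_sampling_distribution(bag_size=10, p_yellow=0.2):
    """Exact probability of every possible number of yellow candies."""
    return {
        # comb(bag_size, k) counts the combinations that contain k yellow candies.
        k: comb(bag_size, k) * p_yellow**k * (1 - p_yellow)**(bag_size - k)
        for k in range(bag_size + 1)
    }

exact = exact_sampling_distribution()
for k, prob in exact.items():
    print(k, round(prob, 4))
```

This works because the sample statistic has only eleven possible values; with a continuous variable there would be no finite list to enumerate.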
FISHER EXACT-TEST
Analyze > Descriptive Statistics > Crosstabs
Click on Exact > select Exact > set a time limit (5 minutes) > press Continue
Click on Statistics > check Chi-square and Phi and Cramer’s V
Click on Cells > under Percentages check Column
Interpreting results:
Crosstabulation
If you look at the percentages in the contingency table, you see that
yellow and red candies are often less sticky than blue, green and orange
candies
Chi-square tests
Look at the p-value of the Fisher exact test: a value below .05 means the test is
statistically significant, which is the case here
Symmetric measures
There is a strong association (Cramer’s V = .52) between candy color and stickiness, and this association is
statistically significant
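For reference, Cramer’s V can be computed from a contingency table as the square root of chi-square divided by n times (the smaller table dimension minus 1). The sketch below uses hypothetical counts, not the data behind the SPSS output, so it will not reproduce V = .52.

```python
from math import sqrt

def cramers_v(table):
    """Cramer's V for a contingency table given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if color and stickiness were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(table), len(table[0]))
    return sqrt(chi2 / (n * (k - 1)))

# Hypothetical counts; columns: yellow, red, blue, green, orange.
not_sticky = [8, 9, 3, 2, 4]
sticky = [2, 1, 7, 8, 6]
v = cramers_v([not_sticky, sticky])
print(round(v, 2))
```

V runs from 0 (no association) to 1 (perfect association).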
Bootstrapping and exact tests can be used if the conditions for
theoretical probability distributions have not been met
CHAPTER 3
Our best guess for the population value is the sample statistic
this type of guess is referred to as a point estimate
Although it is our best guess, it is likely that this guess is wrong
For this reason, it is better not to rely on a point estimate alone but to estimate a range of values the
population value may fall into
CONFIDENCE INTERVAL
The confidence interval is the range between the lower and upper bounds of plausible population means
This is how you phrase the confidence interval: “we are 95% confident that average
candy weight in the population is between 2.4 and 3.2 grams”
A confidence interval is likely to contain the true population mean
Usually, we automatically get the confidence interval, but for some tests (F tests) we don’t
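A minimal sketch of how such an interval can be computed by hand, assuming the sampling distribution of the mean is approximately normal. The sample mean (2.8) and standard error (0.2) are hypothetical values chosen for illustration.

```python
from statistics import NormalDist

def confidence_interval(sample_mean, standard_error, level=0.95):
    """Normal-approximation confidence interval around a sample mean."""
    # Critical value: about 1.96 for a 95% confidence level.
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin

lower, upper = confidence_interval(2.8, 0.2)
print(f"{lower:.2f} to {upper:.2f}")

# A lower confidence level gives a narrower interval.
lower90, upper90 = confidence_interval(2.8, 0.2, level=0.90)
print(f"{lower90:.2f} to {upper90:.2f}")
```

SPSS typically bases intervals for means on the t distribution, so its bounds will be slightly wider for small samples.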
SETTING THE CONFIDENCE LEVEL
Analyze > Compare Means > One-way ANOVA
Post Hoc > Select Bonferroni
For correlations, open the Bootstrap dialog; the confidence level can be adjusted there
The wider our interval in the sampling distribution, the less precise our estimate is
There are several methods we can use to increase the precision of our estimate
Simplest way: lower our confidence level, which narrows the interval. This is usually not a good option,
because we then have less confidence that the interval contains the population value, so we usually keep
our confidence level at 95%
Another way: increase the sample size. Larger samples lead to greater precision simply because
there are more observations: it is less likely that a sample contains mainly relatively high or relatively
low scores, which leads to a more peaked sampling distribution
The standard error of our sampling distribution informs us of the precision of our interval estimate
The standard error is the sampling distribution’s standard deviation (SD = roughly the average
distance of an observation from the mean)
The smaller the standard error, the more likely the sample statistic resembles the population value
To decrease the standard error, you should increase the sample size: a more peaked sampling
distribution has a smaller standard error
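For the sample mean, the standard error equals the standard deviation divided by the square root of the sample size. The sketch below shows how it shrinks as the sample grows; the SD of 0.5 grams is a hypothetical value.

```python
from math import sqrt

def standard_error_of_mean(sd, n):
    """Standard error of the sample mean: SD divided by the square root of n."""
    return sd / sqrt(n)

# Hypothetical SD of candy weight: 0.5 grams.
for n in (25, 100, 400):
    print(n, standard_error_of_mean(0.5, n))
```

Quadrupling the sample size halves the standard error, so precision improves more slowly than the sample grows.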
CHAPTER 4
To conduct a statistical test for our hypothesis, we need 4 elements:
- Statement about a population
- A sample from the population
- A criterion to judge whether the statement is plausible
- A probability for the statement
BINOMIAL TEST
If your hypothesis addresses the share of a category in a population, you want to use a proportion as
your test statistic
For instance, if your hypothesis is that a television station reaches half of the people, you
want to test a proportion value, namely .50. In this case the null hypothesis is: the proportion
of people in the population that the television station reaches is .50.
We want to use a binomial test for this hypothesis
BINOMIAL TEST SINGLE PROPORTION
Analyze > Nonparametric Tests > Legacy Dialogs > Binomial
Select the variable > Cut Point: enter the value of the lowest category or, if the variable contains more
than two values (e.g. household income), the highest value that still belongs to the lowest group
Test Proportion: enter the hypothesized proportion for the same group that the cut point defines
Interpreting results:
Binomial Test
Group 1 contains 58 of the 120 households, which is 48%
The null hypothesis: 35% of all households cannot receive the TV station
The p-value is 0.002, which is below .05, so we reject the null hypothesis
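The numbers above can be checked with an exact binomial calculation. This is a sketch assuming a one-sided test (the probability of finding 58 or more households out of 120 if the true proportion is .35). SPSS may compute its p-value differently, so the result need not match the reported 0.002 exactly.

```python
from math import comb

def binomial_upper_tail(k, n, p0):
    """Exact probability of k or more successes in n trials under proportion p0."""
    return sum(
        comb(n, i) * p0**i * (1 - p0)**(n - i)
        for i in range(k, n + 1)
    )

# 58 of 120 households in Group 1; hypothesized proportion .35.
p_value = binomial_upper_tail(58, 120, 0.35)
print(p_value)
# Reject the null hypothesis if the p-value is below the .05 significance level.
print(p_value < 0.05)
```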