Summary: Applied Multivariate Data Analysis (lectures, exercises, Field, and the Simmons et al., 2011 article)

Definitions: underlined
Listing: in italics
Analysis or assumption: in bold
Summary Statistics

Null-hypothesis testing, statistical estimation, research ethics
Lecture 1 + tutorial 1 + Field 2, 3, 6 + Simmons et al. (2011)

• Statistical models:
o Variables: measured constructs that vary across people in the sample.
o Parameters: constant relationships that we infer from our data. They are the building blocks of a model that represent the invariant relationships in your data, and they act on variables. We compute the model parameters in the sample to estimate their values in the population.
• Normal distribution: two parameters → the mean and SD.
• Line: two parameters → the slope and intercept (see the sketch below).
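A minimal sketch of "parameters acting on variables", with made-up data (numbers and variable names are illustrative only): the mean and SD parameterize a normal model of one variable, the slope and intercept parameterize a line relating two variables, and both sets are computed in the sample to estimate population values.

import numpy as np

rng = np.random.default_rng(42)
x = rng.normal(loc=10, scale=2, size=100)          # a measured variable
y = 3 + 0.5 * x + rng.normal(scale=1, size=100)    # a related variable

mean_hat, sd_hat = x.mean(), x.std(ddof=1)   # normal model: two parameters
slope, intercept = np.polyfit(x, y, deg=1)   # line: two parameters
print(mean_hat, sd_hat, slope, intercept)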
• Several definitions:
o SS, s², and SD all represent how well the mean fits the observed sample data. Large values (relative to the scale of measurement) suggest the mean is a poor fit of the observed scores, and small values suggest a good fit.
o Sums of squares (SS) (computed in the sketch after this list):
- Residual sum of squares (SSR) = the degree of inaccuracy that remains when the best model is fitted to the data. A sum of squared errors is a 'total' and is therefore affected by the number of data points; the differences are squared because residuals can be negative. Df = N − k − 1 (observations minus estimated parameters).
- Total sum of squares (SST) = the sum of squared differences between the observed values and the grand mean. Df = N − 1.
- Model sum of squares (SSM) = the improvement in accuracy when the model, rather than the mean, is fitted to the data (SSM = SST − SSR). If large, the linear model is very different from using the mean to predict the outcome variable. Df = the number of predictors (k).
- Used in R² and F.
o Variance (s²):
- The 'average' variability, but in squared units.
o Standard deviation (SD/s):
- The average variability, converted back to the original units of measurement.
- Tells us how much observations in our sample differ from the mean value within our sample.
- The square root of the variance (s²).
o Standard error (SE) (for the mean: SEx̅):
- For a given statistic (e.g. the mean), it tells us how much variability there is in that statistic across samples from the same population.
- The width of the sampling distribution.
- The standard error of b tells us how different b would be across samples.
o Standard error of the mean (SEx̅):
- The SD of sample means.
- How well the sample mean represents the population mean.
- Central limit theorem: for samples of at least 30, the sampling distribution of sample means is approximately normal with mean μ and standard deviation σ/√N (the standard error).
- The more variability in the population, the larger the SE.
- The smaller your sample size, the larger the SE.
o Mean sum of squares (MS) = SS / df:
- Makes SSR and SSM comparable.
o P-value:
- The probability of finding the current sample result, or a more extreme one, when the null hypothesis is true.
- P-values depend on sample size.
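A sketch of the SS decomposition above on simulated data (all numbers and variable names are invented for illustration): SST = SSM + SSR, R² = SSM / SST, and F = MSM / MSR, with MS = SS / df making the two sums of squares comparable.

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.normal(size=50)
y = 2 + 0.6 * x + rng.normal(size=50)

slope, intercept = np.polyfit(x, y, deg=1)
y_hat = intercept + slope * x

sst = np.sum((y - y.mean()) ** 2)      # total: observed vs. grand mean
ssr = np.sum((y - y_hat) ** 2)         # residual: observed vs. model
ssm = sst - ssr                        # model: improvement over the mean

k, n = 1, len(y)                       # one predictor, 50 observations
f = (ssm / k) / (ssr / (n - k - 1))    # F = MS_M / MS_R
r_squared = ssm / sst
p = stats.f.sf(f, k, n - k - 1)        # P(F >= observed | H0 true)
print(r_squared, f, p)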
• Error
o Different ways to quantify this (see the sketch below):
- Sum of squared errors (SSE) = the errors added up over each person/data point.
- Mean squared error (MSE) = SS / df. When the model is the mean, the MSE is called the variance. The more cases, the larger your summed squared error, so you divide it by df to get an average.
- SD/s.
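A quick numeric check of the point above, with made-up scores: when the model is the mean, MSE = SSE / df reproduces the sample variance, and its square root the SD.

import numpy as np

scores = np.array([4.0, 6.0, 5.0, 7.0, 3.0])
sse = np.sum((scores - scores.mean()) ** 2)    # sum of squared errors
mse = sse / (len(scores) - 1)                  # df = N - 1
print(mse, np.var(scores, ddof=1))             # identical: the variance
print(np.sqrt(mse), np.std(scores, ddof=1))    # identical: the SD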
• From sample to population
o Sample:
- Mean (x̅)
- SD (s)
o Population:
- Mean (μ)
- SD (σ)
o Using the SD and the sample size, we can determine how accurately our x̅ estimates μ.
o Sampling distribution: the frequency distribution of sample means drawn from the same population (simulated in the sketch below).
- One sample provides one estimate of the true population parameter.
- If the range is large, we have less confidence.
- It depends on the sample size and on the variability (SD) of the trait in the population.
• Standard error and CI
o 95% CI: for 95% of all possible samples, the interval will contain the population mean (μ); 5% of such intervals will not contain the population mean.
o Each time we conduct a significance test, we take a 5% risk of falsely rejecting H0.
o The 95% CI is calculated by taking the t-distribution as representative of the sampling distribution.
- The t-distribution looks like the standard normal distribution, but with fatter tails depending on the df (here df = N − 1).
o Calculation of the CI (see the sketch below):
- First calculate the SE.
- Calculate the df (N − 1) and look up the appropriate t-value.
- LL: CI = x̅ − (t × SE)
- UL: CI = x̅ + (t × SE)
- 95% of z-scores fall between −1.96 and 1.96, so if sample means are normally distributed with a mean of 0 and an SD of 1, the limits of the CI would be −1.96 and 1.96. For small samples, however, the sampling distribution is not normal but follows a t-distribution.
- So to construct a CI for small samples, we use t instead of z.
o The 'margin of error' (t(df) × SE) of the mean is smaller in larger samples.
o When 0 is not in the CI, we can conclude that our result differs significantly from 0: H0 is rejected.
o The logic of the CI is a way of explaining the sampling distribution, which is exactly what we use to test things. You can also reframe this in terms of the null and alternative hypotheses, which follow the same logic.
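The CI recipe above in code, using a small made-up sample: compute the SE, look up the t-value for df = N − 1, then take x̅ ± t × SE.

import numpy as np
from scipy import stats

sample = np.array([5.1, 4.8, 6.0, 5.5, 4.9, 5.7, 5.2, 5.9])
n = len(sample)
se = sample.std(ddof=1) / np.sqrt(n)       # standard error of the mean
t_crit = stats.t.ppf(0.975, df=n - 1)      # replaces 1.96 for small samples
print(sample.mean() - t_crit * se,         # lower limit (LL)
      sample.mean() + t_crit * se)         # upper limit (UL)
# The same interval straight from scipy:
print(stats.t.interval(0.95, df=n - 1, loc=sample.mean(), scale=se))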
• Null hypothesis significance testing (NHST)
o NHST evaluates the probability of getting a statistic at least as large as the one you have, given H0.
o H0: there is no effect.
- H0: μ = 0
o H1/Ha: the experimental hypothesis.
- H1: μ ≠ 0
o We cannot talk about H0 or Ha being true; we can only speak in terms of the probability of obtaining a particular result or statistic if, hypothetically speaking, H0 were true.
o The test statistic is significant (p < .05): we reject our null hypothesis when we find our sample result unlikely if H0 were true.
o Any value outside the 95% CI has p < .05; any value inside the 95% CI has p > .05.
o Different types of hypotheses require different test statistics:
- Hypotheses concerning one or two means are tested with the t-test.
- Hypotheses concerning several means are tested with the F-test.
o Look at the critical values of the t-distribution: calculate the t-statistic, then compare the observed t-value with the critical t-value; if the observed value is larger than the critical value, the result is significant (see the sketch below).
o When testing one-sided, the p-value must be divided by 2. This is, however, advised against because of the increased risk of a Type I error.
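A sketch of NHST with a one-sample t-test (hypothetical data; H0: μ = 0): compute the t-statistic, compare it with the critical t-value, and read off the p-value.

import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
x = rng.normal(loc=0.5, scale=1, size=25)    # true mean is not 0

t_obs, p = stats.ttest_1samp(x, popmean=0)   # two-sided by default
t_crit = stats.t.ppf(0.975, df=len(x) - 1)   # critical value at alpha = .05
print(t_obs, t_crit, p)                      # reject H0 if |t_obs| > t_crit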
• CI and NHST
o If you want to be more confident about the location of the population mean, you have to cover a larger area (so a wider CI, e.g. 99%).
o When our H0 concerns one population mean (e.g. H0: μ = 0):
- NHST = one-sample t-test.
o When our H0 concerns the difference between two independent population means (e.g. H0: μ1 − μ2 = 0):
- NHST = independent-samples t-test.
o Three guidelines (Cumming & Finch) for the relationship between CIs and NHST (simulated in the sketch below):
1. 95% CIs that just about touch end-to-end represent a p-value for testing H0: µ1 = µ2 of approximately .01.
- If two 95% CIs of group means do not overlap, H0: µ1 = µ2 can be rejected with p < .01. We say that it is highly unlikely that the two means come from the same population.
- When an experimental manipulation is successful, we expect to find that our samples come from different populations. If the manipulation is unsuccessful, we expect to find that the samples come from the same population.
2. If there is a gap between the upper limit of one 95% CI and the lower limit of another, then p < .01.
3. A p-value of .05 is represented by moderate overlap between the bars (approximately half the margin of error).
- When the two CIs overlap by more than half the (average) margin of error (i.e. the distance from the mean to the upper or lower limit), we do not reject H0: µ1 = µ2.
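A simulation sketch of the Cumming & Finch guidelines (the group parameters are invented): two group means with their 95% CIs next to an independent-samples t-test, so CI overlap can be compared with the p-value.

import numpy as np
from scipy import stats

rng = np.random.default_rng(11)
g1 = rng.normal(loc=10.0, scale=2, size=40)
g2 = rng.normal(loc=11.5, scale=2, size=40)

def ci95(g):
    # 95% CI of a group mean: mean +/- t * SE
    se = g.std(ddof=1) / np.sqrt(len(g))
    t = stats.t.ppf(0.975, df=len(g) - 1)
    return g.mean() - t * se, g.mean() + t * se

t_obs, p = stats.ttest_ind(g1, g2)
print(ci95(g1), ci95(g2), p)   # non-overlapping CIs go with p < .01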
• Effect size
o P-values just tell you something about probability.
o Significance ≠ importance.
- To quantify importance: effect size.
o Effect size = the magnitude of an observed effect.
o An effect size is a standardized measure of the size of an effect:
- Standardized = comparable across studies.
- Not (as) reliant on the sample size.
o Cohen's d: if we're looking at differences between groups (and sometimes within groups).
o Pearson's r or R²: if we're looking at continuous variables, correlations (or at one continuous variable and a categorical variable containing 2 categories).
o (Partial) eta-squared: multiple variables in our analysis; it is interpreted much like R².
o Odds ratio: 2 or more categorical variables.
o Rules of thumb for interpreting effect sizes:
- r = .1, d = .2 (small effect): the effect explains 1% of the total variance.
- r = .3, d = .5 (medium effect): the effect accounts for 9% of the total variance.
- r = .5, d = .8 (large effect): the effect accounts for 25% of the variance.
o Effect sizes are standardized based on the standard deviation, whereas test statistics divide the raw effect by the standard error.
- Thus, small effects can be statistically significant as long as the sample is large. As a consequence, statistically significant effects are not always practically relevant (see the sketch below).
- It is recommended to report p-values, CIs, and effect sizes, because the three measures provide complementary information.
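A sketch of effect size versus significance (hypothetical groups): Cohen's d standardizes the raw difference by the pooled SD, r can be derived from t, and with a deliberately large N even a trivial d can come out statistically significant.

import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
n = 5_000                                    # deliberately large sample
g1 = rng.normal(loc=0.00, scale=1, size=n)
g2 = rng.normal(loc=0.05, scale=1, size=n)   # tiny true effect

pooled_sd = np.sqrt(((n - 1) * g1.var(ddof=1) + (n - 1) * g2.var(ddof=1))
                    / (2 * n - 2))
d = (g2.mean() - g1.mean()) / pooled_sd      # Cohen's d

t_obs, p = stats.ttest_ind(g1, g2)
r = np.sqrt(t_obs**2 / (t_obs**2 + 2 * n - 2))   # r derived from t and df
print(d, r, p)    # d is trivial, yet p can still be < .05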
• Type I and type II errors
