Total summary Applied multivariate data analysis - Lectures, exercises, Field, article


This summary contains ALL important information from the lectures, the book chapters from Field, the exercises and the article of Simmons et al. (2011). I used the lectures as a format and added information from all the sources named above, to cover everything. This summary gives you a clear overview of all...

Chapters covered: 2, 3, 6, 8, 9, 11, 12, 13, 14, 15, 16

By: dpchardeman
Definitions: underlined
Listing: in italics
Analysis or assumption: in bold
Summary Statistics

Null-hypothesis testing, statistical estimation, research ethics
Lecture 1 + tutorial 1 + Field 2, 3, 6 + Simmons et al. (2011)

• Statistical models:
  o Variables: measured constructs that vary across people in the sample.
  o Parameters: constant relationships that we infer from our data. They are the bits and pieces of your model that allow you to represent the invariant relationships in your data. They act on variables. We compute the model parameters in the sample to estimate their values in the population.
• Normal distribution: two parameters → mean and SD.
• Line: two parameters → slope and intercept (see the sketch below).
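A minimal sketch (my own illustration, not from the lecture) of estimating a line's two parameters from an invented sample with NumPy:

```python
# Hypothetical example: fit a line and read off its two parameters.
import numpy as np

rng = np.random.default_rng(42)
x = rng.normal(size=100)                    # predictor, varies across the sample
y = 2.0 + 0.5 * x + rng.normal(size=100)    # outcome; true intercept 2.0, slope 0.5

slope, intercept = np.polyfit(x, y, deg=1)  # sample estimates of the parameters
print(f"intercept = {intercept:.2f}, slope = {slope:.2f}")
```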
• Several definitions:
  o SS, s², and SD all represent how well the mean fits the observed sample data. Large values (relative to the scale of measurement) suggest the mean is a poor fit of the observed scores, and small values suggest a good fit.
  o Sums of squares (SS)
    ▪ Residual sum of squares (SSR) = the degree of inaccuracy that remains when the best model is fitted to the data. The sum of squared errors is a 'total' and is therefore affected by the number of data points. The residuals are squared because they can be negative. Its df is the number of observations minus the number of estimated parameters.
    ▪ Total sum of squares (SST) = the sum of squared differences between the observed values and the mean of the outcome.
    ▪ Model sum of squares (SSM) = the improvement in accuracy when the model, rather than the mean, is fitted to the data. If it is large, the linear model is very different from using the mean to predict the outcome variable. Its df is the number of predictors.
    ▪ Used in R² and F (see the sketch below).
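The decomposition above (SST = SSM + SSR) can be made concrete in a few lines; a hedged sketch with invented data, including the MS, R² and F quantities these notes mention:

```python
# Sums-of-squares decomposition for a simple linear model (invented data).
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=50)
y = 1.0 + 0.8 * x + rng.normal(size=50)

slope, intercept = np.polyfit(x, y, deg=1)
y_hat = intercept + slope * x

sst = np.sum((y - y.mean()) ** 2)   # total SS: observed vs. the mean
ssr = np.sum((y - y_hat) ** 2)      # residual SS: observed vs. the model
ssm = sst - ssr                     # model SS: improvement over the mean

df_m, df_r = 1, len(y) - 2          # one predictor; N minus two parameters
f = (ssm / df_m) / (ssr / df_r)     # MS makes SSM and SSR comparable
r_squared = ssm / sst
print(f"R^2 = {r_squared:.3f}, F = {f:.2f}")
```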
  o Variance (s²)
    ▪ The 'average' variability, but in squared units.
  o Standard deviation (SD/s)
    ▪ The average variation, converted back to the original units of measurement.
    ▪ Tells us how much observations in our sample differ from the mean value within our sample.
    ▪ The square root of the variance (s²).
  o Standard error (SE) (for the mean: SEx̅)
    ▪ For a given statistic (e.g. the mean) it tells us how much variability there is in this statistic across samples from the same population.
    ▪ The width of the sampling distribution.
    ▪ The standard error of b tells us how different b would be across samples.

  o Standard error of the mean (SEx̅)
    ▪ The SD of sample means.
    ▪ How well the sample mean represents the population mean.
    ▪ Central limit theorem: for samples of at least 30, the sampling distribution of sample means is a normal distribution with mean μ and standard deviation σ/√N (the standard error).
    ▪ The more variability in the population, the larger the SE.
    ▪ The smaller your sample size, the larger the SE (see the simulation below).
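A small simulation (my own, with an invented population) showing the central limit theorem at work: means of repeated samples are roughly normal with SD ≈ σ/√N, even when the population itself is skewed:

```python
# Simulate the sampling distribution of the mean from a skewed population.
import numpy as np

rng = np.random.default_rng(7)
population = rng.exponential(scale=2.0, size=100_000)  # skewed, not normal

n = 30
sample_means = [rng.choice(population, size=n).mean() for _ in range(5_000)]

print(f"SD of the sample means:   {np.std(sample_means):.3f}")
print(f"sigma / sqrt(N) predicts: {population.std() / np.sqrt(n):.3f}")
```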
  o Mean sum of squares (MS) = SS / df
    ▪ Makes SSR and SSM comparable.
  o P-value:
    ▪ Indicates the probability of finding the current sample result, or a more extreme one, when the null hypothesis is true.
    ▪ P-values depend on sample size.
• Error
  o Different ways to quantify this:
    ▪ Sum of squared errors (SSE) = adding up the squared error for each person/data point.
    ▪ Mean squared error (MSE): when the model is the mean, the MSE is called the variance. The more cases, the larger your squared error, so you divide it by df to get an average.
    ▪ MSE = SS / df.
    ▪ SD/s.
• From sample to population
  o Sample:
    ▪ Mean (x̅)
    ▪ SD (s)
  o Population:
    ▪ Mean (μ)
    ▪ SD (σ)
  o Using the SD and the sample size, we can determine how accurate our x̅ is as an estimate of μ.
  o Sampling distribution: the frequency distribution of sample means from the same population.
    ▪ One sample provides an estimate of the true population parameter.
    ▪ If the range is large, we have less confidence.
    ▪ It depends on the sample size and on the variability (SD) of the trait in the population.
• Standard error and CI
  o 95% CI: for 95% of all possible samples, the interval will contain the population mean (μ); 5% of such intervals will not contain it.
  o Each time we conduct a significance test we take a 5% risk of rejecting H0 falsely.
  o The 95% CI is calculated by assuming that the t-distribution is representative of the sampling distribution.
    ▪ The t-distribution looks like the standard normal distribution, but with fatter tails depending on the df (here df = N − 1).
  o Calculation of the CI (see the sketch below):
    ▪ First calculate the SE.
    ▪ Calculate the df (N − 1) and look up the appropriate t-value.
    ▪ LL: CI = x̅ − (t × SE)
    ▪ UL: CI = x̅ + (t × SE)
    ▪ 95% of z-scores fall between −1.96 and 1.96, so if sample means are normally distributed with a mean of 0 and an SD of 1, the limits of the CI would be −1.96 and 1.96. But for small samples the distribution is not normal; it follows a t-distribution.
    ▪ So to construct a CI for small samples we use t instead of z.
  o The "margin of error" (t(df) × SE) of the mean is smaller in larger samples.
  o When 0 is not in the CI, we can conclude that our result differs significantly from 0: H0 is rejected.
  o The logic of the CI is a way of explaining the sampling distribution, which is exactly what we use in order to test things. You can also reframe this in terms of the null hypothesis and alternative hypothesis, which follows the same logic.
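The CI recipe above, written out with SciPy; the sample values are invented for illustration:

```python
# 95% CI for a mean, using the t-distribution (invented sample).
import numpy as np
from scipy import stats

sample = np.array([4.2, 5.1, 3.8, 6.0, 4.9, 5.5, 4.4, 5.2])

n = len(sample)
se = sample.std(ddof=1) / np.sqrt(n)   # step 1: the standard error
t_crit = stats.t.ppf(0.975, df=n - 1)  # step 2: t-value for df = N - 1
ll = sample.mean() - t_crit * se       # LL: mean - (t x SE)
ul = sample.mean() + t_crit * se       # UL: mean + (t x SE)
print(f"95% CI: [{ll:.2f}, {ul:.2f}]")

# The same interval straight from SciPy:
print(stats.t.interval(0.95, df=n - 1, loc=sample.mean(), scale=se))
```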
• Null hypothesis significance testing (NHST)
  o NHST evaluates the probability of getting a statistic at least as large as the one you have, given H0.
  o H0: there is no effect.
    ▪ H0: μ = 0
  o H1/Ha: the experimental hypothesis.
    ▪ Ha: μ ≠ 0
  o We cannot talk about H0 or Ha being true; we can only speak in terms of the probability of obtaining a particular result or statistic if, hypothetically speaking, H0 were true.
  o The test statistic is significant (p < .05): we reject our null hypothesis when we would find our sample result unlikely if H0 were true.
  o Any value outside the 95% CI has p < .05; any value inside the 95% CI has p > .05.
  o Different types of hypotheses require different test statistics:
    ▪ Hypotheses concerning one or two means are tested with the t-test.
    ▪ Hypotheses concerning several means are tested with the F-test.
  o Look at the critical values of the t-distribution: calculate the t-statistic → compare that observed t-value to the critical t-value. If the observed value is much larger than the critical value, the result is clearly significant (see the sketch below).
  o When testing one-sided, the p-value is divided by 2. This is, however, advised against because of the increased risk of a type I error.
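A minimal NHST sketch with invented data: a one-sample t-test of H0: μ = 0, comparing the observed t to the two-sided critical value at α = .05:

```python
# One-sample t-test against mu = 0 (invented data).
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
sample = rng.normal(loc=0.4, scale=1.0, size=40)

t_obs, p = stats.ttest_1samp(sample, popmean=0)
t_crit = stats.t.ppf(0.975, df=len(sample) - 1)  # two-sided critical t

# Reject H0 when |t| exceeds the critical value, i.e. when p < .05.
print(f"t = {t_obs:.2f}, critical t = {t_crit:.2f}, p = {p:.4f}")
```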
• CI and NHST
  o If you want to be more confident about the location of the population mean, you have to cover a larger area (so a larger CI, like 99%).
  o When our H0 concerns one population mean (e.g. H0: μ = 0):
    ▪ NHST = one-sample t-test.
  o When our H0 concerns the difference between two independent population means (e.g. H0: μ1 − μ2 = 0):
    ▪ NHST = independent-samples t-test.
  o Three guidelines (Cumming & Finch) for the relationship between CIs and NHST (see the sketch after this list):
    1. 95% CIs that just about touch end-to-end represent a p-value for testing H0: µ1 = µ2 of approximately .01.
      ▪ If two 95% CIs of group means do not overlap, H0: µ1 = µ2 can be rejected with p < .01. We say that it is highly unlikely that the two means come from the same population.
      ▪ When an experimental manipulation is successful, we expect to find that our samples have come from different populations. If the manipulation is unsuccessful, then we expect to find that the samples came from the same population.
    2. If there is a gap between the upper limit of one 95% CI and the lower limit of another, then p < .01.
    3. A p-value of .05 is represented by moderate overlap between the bars (approximately half the value of the margin of error).
      ▪ When the two CIs overlap by more than half the (average) margin of error (i.e. the distance from the mean to the upper or lower limit), we would not reject H0: µ1 = µ2.
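A rough illustration (invented samples, my own code) of how the two group CIs relate to the independent-samples t-test used in the guidelines:

```python
# Compare two group 95% CIs with the independent-samples t-test p-value.
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)
a = rng.normal(loc=0.0, scale=1.0, size=50)
b = rng.normal(loc=0.6, scale=1.0, size=50)

def ci95(x):
    se = x.std(ddof=1) / np.sqrt(len(x))
    t = stats.t.ppf(0.975, df=len(x) - 1)
    return x.mean() - t * se, x.mean() + t * se

print(f"CI of a: [{ci95(a)[0]:.2f}, {ci95(a)[1]:.2f}]")
print(f"CI of b: [{ci95(b)[0]:.2f}, {ci95(b)[1]:.2f}]")

t_stat, p = stats.ttest_ind(a, b)  # H0: mu1 = mu2
print(f"independent-samples t-test: p = {p:.4f}")
```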
• Effect size
  o P-values just tell you something about probability.
  o Significance ≠ importance.
    ▪ To quantify importance: effect size.
  o Effect size = the magnitude of an observed effect.
  o An effect size is a standardized measure of the size of an effect:
    ▪ Standardized = comparable across studies.
    ▪ Not (as) reliant on the sample size.
  o Cohen's d: if we're looking at differences between groups (and sometimes within groups).
  o Pearson's r or R-squared: if we're looking at continuous variables, correlations (or at one continuous and one categorical variable containing 2 categories).
  o (Partial) eta-squared: multiple variables in our analysis. It looks a lot like R-squared.
  o Odds ratio: 2 or more categorical variables.
  o Rules of thumb for interpreting effect sizes:
    ▪ r = .1, d = .2 (small effect) → the effect explains 1% of the total variance.
    ▪ r = .3, d = .5 (medium effect) → the effect accounts for 9% of the total variance.
    ▪ r = .5, d = .8 (large effect) → the effect accounts for 25% of the variance.
  o Effect sizes are standardized based on the standard deviation, whereas test statistics divide the raw effect by the standard error (see the sketch below).
    ▪ Thus, small effects can be statistically significant as long as the sample is large. As a consequence, statistically significant effects are not always practically relevant.
    ▪ It is recommended to report p-values, CIs and effect sizes, because the three measures provide complementary information.
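Two of the effect sizes above, computed by hand on invented data; the pooled-SD formula for Cohen's d is the standard one, not necessarily the lecture's exact variant:

```python
# Cohen's d (group difference) and Pearson's r (continuous association).
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
g1 = rng.normal(loc=0.0, size=60)
g2 = rng.normal(loc=0.5, size=60)

# d divides the raw difference by the pooled SD (not the SE, as t does).
n1, n2 = len(g1), len(g2)
pooled_var = ((n1 - 1) * g1.var(ddof=1) + (n2 - 1) * g2.var(ddof=1)) / (n1 + n2 - 2)
d = (g2.mean() - g1.mean()) / np.sqrt(pooled_var)

x = rng.normal(size=100)
y = 0.3 * x + rng.normal(size=100)
r, p = stats.pearsonr(x, y)

print(f"Cohen's d = {d:.2f}, Pearson r = {r:.2f} (r^2 = {r**2:.2f})")
```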
• Type I and type II errors
