4.4 Multivariate Data Analysis - Literature Summary
Field Summary (2016)
Chapter 2 – Everything you never wanted to know about statistics
Introduction
A statistical model is essentially a model of how you think the world looks. If you have a sample of data, you fit a model to it to see whether your sample matches reality. Which model you use depends on the parameters in the model and on what exactly you want to know. The degree to which a statistical model represents the data collected is known as the fit of the model. A distinction is made between a good, moderate and poor fit.
Populations and samples
A population comprises all the individuals you want to measure. If, for example, you want to look at the height of men, the population contains all men in the world. Because this is impossible to measure, samples are used. A sample is a smaller group of people that represents the population. The larger a sample, the better it represents reality.
Statistical models
As said in the introduction, there are various statistical models for testing a sample. It can sometimes seem very hard to tell them apart, but they all boil down to the same thing:
outcome_i = (model) + error_i
This equation just means that the data we observe can be predicted from the model we choose to fit to the data plus some amount of error. The ‘model’ in the equation will vary depending on the design of your study, the type of data you have and what it is you’re trying to achieve with your model. Consequently, the model can also vary in its complexity. A distinction is made between:
- Variables: variables are measured constructs that vary across entities in the sample.
- Parameters: parameters are estimated from the data (rather than being measured) and are
(usually) constants believed to represent some fundamental truth about the relations between
variables in the model. Some examples are: the mean and median (which estimate the centre of
the distribution) and the correlation and regression coefficients (which estimate the relationship
between two variables). Statisticians try to confuse you by giving different parameters different
symbols and letters (X̄ for the mean, r for the correlation, b for regression coefficients) but it’s
much less confusing if we just use the letter b.
- Predictors: often you want to predict an outcome from a variable. A predictor is usually denoted
by the letter X.
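To make the equation concrete, here is a minimal Python sketch of a model with one predictor; the data and variable names are made up for illustration, and b0 and b1 are the parameters estimated from the data:

```python
# Minimal sketch of outcome_i = (model) + error_i with one predictor X.
# The data are made up for illustration.
import numpy as np

X = np.array([1, 2, 3, 4, 5], dtype=float)      # predictor
outcome = np.array([2.1, 3.9, 6.2, 8.1, 9.8])   # observed scores

b1, b0 = np.polyfit(X, outcome, deg=1)  # estimate the parameters: b0 + b1 * X
model = b0 + b1 * X                     # what the model predicts
error = outcome - model                 # what the model misses

print(b0, b1)
print(error)  # the error_i term for each observation
```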
Mean as a statistical model
outcome_i = (b) + error_i
There are no predictors in this model. The mean of a sample is denoted by X̄; the mean of the population is denoted by μ. There are several ways to check whether the mean is a fitting model. Examples are the deviation (another word for error) and the sum of squares (the squared deviations).
Ways to check whether the mean is a good fit:
- Deviation (error): the difference between the observed score and the score according to the model; in this case, the actual score minus the mean.
- Sum of squared errors (SS): square all the deviations and add them up. You do this to get rid of negative values and to stop them cancelling each other out. The drawback is that the more scores you have (the larger n is), the larger SS becomes. You can solve this by taking the average of the SS.
- Mean squared error (MSE) or variance (s²): calculated as SS/df. Variance is the name for the MSE when the model is the mean.
- Standard deviation (s): the standard deviation of a sample is denoted by s; the standard deviation of the population is denoted by σ. It tells us how well the mean represents the sample data. The smaller it is, the more representative the mean is.
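As a minimal sketch (with made-up scores), the fit measures above can be computed directly in Python:

```python
# Sketch: the mean as a model, and how well it fits. Scores are made up.
import numpy as np

scores = np.array([22.0, 40.0, 53.0, 57.0, 93.0])  # hypothetical sample
n = len(scores)

mean = scores.mean()           # the model: outcome_i = b + error_i, with b = mean
deviations = scores - mean     # deviation (error): observed score - mean
ss = np.sum(deviations ** 2)   # sum of squared errors
variance = ss / (n - 1)        # MSE = SS / df, with df = n - 1
sd = np.sqrt(variance)         # standard deviation

print(mean, ss, variance, sd)
```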
Sampling variation
When you take different samples, you cannot expect the same parameters to come out of each sample every time. Taking the mean as an example: each sample will not yield exactly the same mean. If you have the means of all the samples, you can build a sampling distribution. A sampling distribution is the frequency distribution of sample means (or whatever parameter you’re trying to estimate) from the same population. We can use the sampling distribution to tell us how representative a sample is of the population. So: the standard deviation is used to check how well the mean represents the sample; the measure used to check how representative a sample is of the population is the:
- Standard error of the mean (SE): calculated as s/√n (for n > 30; for smaller samples you typically use the t-distribution). A small standard error indicates that most sample means are similar to the population mean and so our sample is likely to be an accurate reflection of the population.
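A minimal simulation sketch (the population values are made up) showing that the spread of the sampling distribution of the mean matches the analytic standard error s/√n:

```python
# Sketch: simulate a sampling distribution of the mean and compare its
# spread with s / sqrt(n). Population mean 100 and SD 15 are made up.
import numpy as np

rng = np.random.default_rng(42)
n = 50
sample_means = [rng.normal(loc=100, scale=15, size=n).mean()
                for _ in range(10_000)]

one_sample = rng.normal(loc=100, scale=15, size=n)
se_analytic = one_sample.std(ddof=1) / np.sqrt(n)

print(np.std(sample_means))  # empirical SE: spread of the sampling distribution
print(se_analytic)           # estimate from a single sample: s / sqrt(n)
```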
Confidence intervals
We can use parameters to calculate limits within which we believe the population parameter will fall. These intervals are called confidence intervals (CIs). The crucial
thing is to construct them in such a way that they tell us something useful. Therefore, we calculate
them so that they have certain properties: in particular they tell us the likelihood that they contain the
true value of the parameter we’re trying to estimate (in this case, the mean). Typically we look at 95%
CI’s, and sometimes 99% CI’s, but they all have a similar interpretation: they are limits constructed
such that for a certain percentage of samples (be that 95% or 99%) the true value of the population
parameter will fall within these limits. So, when you see a 95% CI for a mean, think of it like this: if
we’d collected 100 samples, calculated the mean and then calculated a CI for that mean then for 95 of
these samples, the CI’s we constructed would contain the true value of the mean in the population.
- Lower bound: X̄ − (t × SE) or X̄ − (z × SE)
- Upper bound: X̄ + (t × SE) or X̄ + (z × SE)
For z-scores you first look at the percentage. At 95%, α = .05. You then look in the table at .05/2 = .025, which gives 1.96. For t-scores you look at the df.
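A minimal sketch of these bounds in Python, using the t-distribution (the sample values are made up):

```python
# Sketch: a 95% CI for the mean via the t-distribution. Data are made up.
import numpy as np
from scipy import stats

sample = np.array([12, 15, 9, 14, 11, 13, 16, 10, 12, 14], dtype=float)
n = len(sample)

mean = sample.mean()
se = sample.std(ddof=1) / np.sqrt(n)
t_crit = stats.t.ppf(1 - 0.05 / 2, df=n - 1)  # two-tailed critical t, alpha = .05

lower, upper = mean - t_crit * se, mean + t_crit * se
print(lower, upper)
# scipy can do the same in one call:
# stats.t.interval(0.95, df=n - 1, loc=mean, scale=se)
```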
Tip:
Field says to use the t-distribution for samples smaller than n = 30 and z-scores for larger samples. The tutor and the lectures say: below n = 100, ALWAYS use the t-distribution; from 100 onwards t is also 1.96, so it no longer matters. If you have n = 56, choose 50 in the table, so you always err on the side of caution. If the CIs of two samples (e.g., men and women) overlap, this means there is no significant difference between the means. If there is no overlap, there is a significant difference between the means (or whatever other parameter you are measuring).
Using statistical models to test research questions
So you can use a model to test whether a prediction (hypothesis) you make actually holds true. This is called null hypothesis significance testing (NHST). NHST is
the most commonly taught approach to testing research questions with statistical models. It arose out
of two different approaches to the problem of how to use data to test theories: (1) Ronald Fisher’s idea
of computing probabilities to evaluate evidence, and (2) Jerzy Neyman and Egon Pearson’s idea of
competing hypotheses.
Null hypothesis significance testing (NHST)
- Fisher’s p-value: Fisher developed the p-value to rule out lucky guessing. Fisher’s basic point was that you should calculate the probability of an event and evaluate this probability within the research context. Although Fisher felt a p = .01 would be strong evidence to back up a hypothesis, and perhaps a p = .20 would be weak evidence, he never said p = .05 was in any way a special number.
- Types of hypothesis: in contrast to Fisher, Neyman and Pearson believed that scientific statements
should be split into testable hypotheses. The hypothesis or prediction from your theory would
normally be that an effect will be present. This hypothesis is called the alternative hypothesis and
is denoted by H1 . (It is sometimes also called the experimental hypothesis, but because this term
relates to a specific type of methodology it’s probably best to use ‘alternative hypothesis’.) There
is another type of hypothesis called the null hypothesis, which is denoted by H0 . This hypothesis
is the opposite of the alternative hypothesis and so usually states that an effect is absent.
- The basic principles of NHST: crudely put, this is the logic:
o We assume that the null hypothesis is true (i.e., there is no effect).
o We fit a statistical model to our data that represents the alternative hypothesis and see how
well it fits (in terms of the variance it explains).
o To determine how well the model fits the data, we calculate the probability (called the p-
value) of getting that ‘model’ if the null hypothesis were true.
o If that probability is very small (the usual criterion is .05 or less) then we conclude that the
model fits the data well (i.e., explains a lot of the variation in scores) and we assume our
initial prediction is true: we gain confidence in the alternative hypothesis.
- Test statistics: a test statistic (t, F, χ²) is the ratio of systematic variation (variation that can be explained by the model that we’ve fitted to the data and, therefore, due to the hypothesis that we’re testing; also called the effect) to unsystematic variation (variation that cannot be explained by the model that we’ve fitted; in other words, it is error, or variation not attributable to the effect we’re investigating).
Example: test statistics are the same as kittens in this respect: small ones are quite common and large
ones are rare. So, if we do some research (i.e., give birth to a kitten) and calculate a test statistic
(weigh the kitten) we can calculate the probability of obtaining a value (weight) at least that large. The
more variation our model explains compared to the variance it can’t explain, the bigger the test
statistic will be (i.e., the more the kitten weighs), and the more unlikely it is to occur by chance (like
our 150 g kitten). Like kittens, as test statistics get bigger the probability of them occurring becomes
smaller. If we use conventional NHST then when this probability falls below a certain value (usually p
< .05), we accept this as giving us enough confidence to assume that the test statistic is as large as it is
because our model explains a sufficient amount of variation to reflect what’s genuinely happening in
the real world (the population). The test statistic is said to be significant.
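As a hedged illustration of this logic, here is a one-sample t-test in Python (the data and the null value of 100 are simulated for the example): the t-statistic is the ratio of effect to error, and p is the probability of a value at least that extreme if H0 were true.

```python
# Sketch of NHST with a test statistic: a one-sample t-test on made-up data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
scores = rng.normal(loc=105, scale=15, size=25)  # sample drawn with a real effect

# t = signal / noise = (sample mean - null value) / SE
result = stats.ttest_1samp(scores, popmean=100)
print(result.statistic)  # the test statistic t
print(result.pvalue)     # p: probability of a t at least this extreme if H0 were true
```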
- One- and two-tailed tests: hypotheses can be directional (e.g., ‘the more someone reads this book,
the more they want to kill its author’) or non-directional (i.e., ‘reading more of this book could
increase or decrease the reader’s desire to kill its author’). A statistical model that tests a
directional hypothesis is called a one-tailed test, whereas one testing a non-directional hypothesis
is known as a two-tailed test.
- Type I and Type II errors: a Type I error occurs when we believe that there is a genuine effect in our population, when in fact there isn’t. If we use the conventional criterion then the probability of this error is .05 (or 5%) when there is no effect in the population – this value is known as the α-level. Assuming there is no effect in our population, if we replicated our data collection 100 times we could expect that on five occasions we would obtain a test statistic large enough to make us think that there was a genuine effect in the population even though there isn’t. The opposite is a Type II error, which occurs when we believe that there is no effect in the population when, in reality, there is. This would occur when we obtain a small test statistic (perhaps because there is a lot of natural variation between our samples). In an ideal world, we want the probability of this error to be very small (if there is an effect in the population then it’s important that we can detect it). Cohen (1992) suggests that the maximum acceptable probability of a Type II error would be .2 (or 20%) – this is called the β-level. That would mean that if we took 100 samples of data from a population in which an effect exists, we would fail to detect that effect in 20 of those samples (so we’d miss 1 in 5 genuine effects).
- Inflated error rates: this shows the effect of repeating a statistical test over and over. At a significance level of .05, the probability of making no Type I error is 95%. If you then conduct a statistical test 3 times, this becomes (.95)³ = .857, i.e., 85.7%. The probability of at least one Type I error is now 14.3% instead of 5%. So: be careful with running test after test. This error rate across statistical tests conducted on the same data is known as the familywise or experimentwise error rate. Solution: the most popular (and easiest) way is to divide α by the number of comparisons, k, so the criterion becomes α/k. Therefore, if we conduct 10 tests, we use .005 as our criterion for significance. In doing so, we ensure that the cumulative Type I error remains below .05. This method is known as the Bonferroni correction (see the sketch below).
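The sketch referred to above: it uses the fact that under H0 a p-value is uniformly distributed between 0 and 1, so repeated testing can be simulated directly (the numbers are chosen for illustration):

```python
# Sketch: familywise Type I error rate and the Bonferroni correction.
# Under H0, p-values are uniform on [0, 1], so we simulate them directly.
import numpy as np

rng = np.random.default_rng(7)
alpha = 0.05
k = 3                                    # number of tests on the same data
pvals = rng.uniform(size=(100_000, k))   # simulated p-values with H0 true

print(1 - (1 - alpha) ** k)                    # analytic familywise rate: ~.143
print(np.mean(pvals.min(axis=1) < alpha))      # simulated: at least one Type I error
print(np.mean(pvals.min(axis=1) < alpha / k))  # Bonferroni: back below .05
```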