Summary AMDA Spring Chapter 4: Missing Data

  • 21 March 2023
  • 9 pages
  • 2020/2021
  • Summary
  • fionabrosig

4. Missing Data
Note: if you care about the exact equations, look them up on the slides, because they are hard to
typeset accurately in Word. They should also not be very important, since the exams are open book
and about understanding rather than quoting exact formulas for calculating anything (that's what
we have computers for, right?)

1. Everyone will have missing data problems
2. Missing data problems are the heart of statistics

Causes of missing data
● There can be all kinds of reasons why you have missing data. E.g.:
● Respondent skipped the item
● Data transmission/coding error
● Drop out in longitudinal research
● Refusal to cooperate
● .. and so on

Consequences of missing data
● If you have less data than planned, statistical power problems might arise
● There might be biases in the data analysis, such as:
○ Biased effect estimates
○ A sample that is no longer representative
○ Confidence intervals and p-values that may no longer be appropriate

Response indicator
● Y: random variable with missing data (e.g. body weight)
● X: random variable containing complete covariates (e.g. age)
● R: response indicator
○ R = 1 if Y is observed
○ R = 0 if Y is missing
● R is always complete!
● Using the response indicator, we might be able to identify the missing data mechanism (see next)
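
As a small illustration of the response indicator, here is a minimal sketch in plain Python. The data values and the helper name response_indicator are made up for this example and do not come from the slides; missing values are assumed to be coded as NaN or None.

```python
import math

# Hypothetical example data: body weight (Y, with gaps) and age (X, complete).
weight = [72.5, float("nan"), 65.0, 80.1, float("nan")]
age = [34, 51, 28, 45, 60]

def response_indicator(values):
    """Return R: 1 where the value is observed, 0 where it is missing (NaN or None)."""
    return [
        0 if v is None or (isinstance(v, float) and math.isnan(v)) else 1
        for v in values
    ]

R = response_indicator(weight)
print(R)                # [1, 0, 1, 1, 0]
print(sum(R) / len(R))  # proportion observed: 0.6
```

Note that R has no gaps even though Y does, which is why R can be analysed like any other complete variable.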

Missing data mechanisms
Missing data mechanisms fall into three categories: MCAR, MAR, and NMAR (also written MNAR).
Each has its own consequences. They are explained in more detail below.

MCAR
● Missing Completely at Random
● The probability of being missing is not related to any factor, i.e. it is completely random
● P(R=0|Y,X) = P(R=0) → the chance of being missing does not depend on anything specific
● Example: respondent accidentally skipped question.

MAR
● Missing at Random
● The probability of being missing depends on known (observed) factors
● P(R=0|Y,X) = P(R=0|X) → the chance of being missing depends on a variable that we also
measure in our data (therefore, we can account for it)
● Example: Gender always observed, and men have more missing data than women

MNAR
● Missing Not at Random (also written NMAR, Not Missing at Random)
● The probability of being missing depends on unobserved information: the missing value itself
or a factor that is not included in our data. We therefore cannot account for it, and we do not
know how the data are missing
● P(R|Y,X) does not simplify
● Example: People with high incomes have more missing data on a variable measuring
income than people with lower incomes.
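
To make the three mechanisms concrete, the following sketch simulates each one and compares the mean of the observed Y values with the mean of the full data. Everything in it is an assumption chosen for illustration (the population model, the logistic missingness functions, the 30% MCAR rate, the seed); it is not taken from the slides.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible
n = 20000

# Hypothetical population: X = age (always observed), Y = weight, rising with age.
age = [random.uniform(20, 70) for _ in range(n)]
weight = [50 + 0.5 * a + random.gauss(0, 8) for a in age]

def logistic(z):
    return 1 / (1 + math.exp(-z))

def observed_mean(y, p_missing):
    """Mean of y over cases with R = 1; case i is missing with probability p_missing[i]."""
    kept = [yi for yi, p in zip(y, p_missing) if random.random() >= p]
    return statistics.fmean(kept)

p_missing = {
    "MCAR": [0.3] * n,                                 # P(R=0) is constant
    "MAR": [logistic((a - 45) / 10) for a in age],     # P(R=0|X): depends on age only
    "MNAR": [logistic((w - 72) / 8) for w in weight],  # P(R=0|Y): depends on weight itself
}

full = statistics.fmean(weight)
means = {name: observed_mean(weight, p) for name, p in p_missing.items()}

print(f"full data: {full:.1f}")
for name, m in means.items():
    print(f"{name}: {m:.1f}")
```

Under these assumptions the MCAR mean should land close to the full-data mean, while the MAR and MNAR means should come out clearly lower, because older (MAR) or heavier (MNAR) people are missing more often. The difference between the two is that under MAR the bias can be corrected by using age, whereas under MNAR the observed data alone do not tell us how to correct it.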

Ignorable vs not ignorable missing data
● MAR (and MCAR as a special case of it) is ignorable, but NMAR is not ignorable.
● MCAR test: tests H0 that the data are MCAR. However, if it is significant, it remains unknown
whether the data are NMAR or MAR
○ Usually you treat missing data as MAR, because it requires the fewest assumptions
and its dependence on observed variables can still be checked.
○ You can check whether missingness is related to other variables in the data (by seeing
whether they are dependent on each other). But it can still be that those data points are
missing because of other variables that are not in the data set, or that the relations are
confounded by other variables.
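
Checking whether missingness is related to an observed variable can be sketched as follows: split a complete covariate by the response indicator R and compare the two groups. This is a simple per-covariate comparison with a Welch t statistic, not the formal MCAR test mentioned above; the function name and the toy data are hypothetical.

```python
import math
import statistics

def missingness_vs_covariate(x, r):
    """Compare a complete covariate x between cases with R = 0 and R = 1.

    Returns (mean difference, Welch t statistic). A large |t| suggests that
    missingness is related to x, which speaks against MCAR.
    """
    obs = [xi for xi, ri in zip(x, r) if ri == 1]
    mis = [xi for xi, ri in zip(x, r) if ri == 0]
    diff = statistics.fmean(mis) - statistics.fmean(obs)
    se = math.sqrt(
        statistics.variance(mis) / len(mis) + statistics.variance(obs) / len(obs)
    )
    return diff, diff / se

# Hypothetical toy data: age is complete, R marks whether weight was observed.
age = [23, 31, 38, 44, 52, 57, 63, 68]
R = [1, 1, 1, 1, 0, 1, 0, 0]

diff, t = missingness_vs_covariate(age, R)
print(f"mean age difference (missing - observed): {diff:.1f}, t = {t:.2f}")
```

A clear difference like this only shows that the data are not MCAR. It cannot distinguish MAR from NMAR, because that would require the values that are missing.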

Strategies to deal with missing data
There are different ways to deal with missing data: prevention, simple methods, likelihood
methods (EM), and multiple imputation. Each is discussed below.

Prevention
● Prevention is always the best option. For example, in Qualtrics you can make a question a
forced response, so people HAVE to answer before they are able to continue. That way you avoid
missing data and do not have to deal with it later on.

Simple methods
● Listwise deletion (complete-case analysis): as soon as someone is missing one data point,
they are excluded from the whole analysis
○ Advantages: simple (default in SPSS); correct standard errors and significance levels;
works in some special NMAR cases (Little, 1993; Vach, 1994)
○ Disadvantages: wasteful; different analyses of the same data end up with a different n;
only OK under MCAR, biased under MAR and partly under NMAR
● Pairwise deletion (available-case analysis): you only leave out a case where information is
actually missing for that calculation, and you still use the rest of its data
○ Advantages: uses all available information
○ Disadvantages: only works under MCAR; computational problems such as negative
variances and rank problems
○ AVOID!
● Mean substitution: you replace the missing data points with the mean of the observed values
○ AVOID!
○ Biased under MAR, underestimates the variance, disturbs the distribution
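
The three simple methods above can be sketched on a tiny data set. The rows, variable names, and helper functions are hypothetical and only meant to show the mechanics; None marks a missing value.

```python
import statistics

# Hypothetical toy data set; None marks a missing value.
rows = [
    {"age": 25, "weight": 60.0, "height": 170.0},
    {"age": 32, "weight": None, "height": 165.0},
    {"age": 41, "weight": 75.0, "height": None},
    {"age": 48, "weight": 82.0, "height": 180.0},
    {"age": 55, "weight": None, "height": 175.0},
    {"age": 63, "weight": 70.0, "height": 168.0},
]

def listwise(rows):
    """Complete-case analysis: keep only rows without any missing value."""
    return [r for r in rows if all(v is not None for v in r.values())]

def pairwise_n(rows, a, b):
    """Available-case analysis: number of rows usable for the variable pair (a, b)."""
    return sum(1 for r in rows if r[a] is not None and r[b] is not None)

def mean_substitute(values):
    """Replace each missing value with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    m = statistics.fmean(observed)
    return [m if v is None else v for v in values]

complete = listwise(rows)
print(len(complete))                         # 3 of 6 rows survive listwise deletion
print(pairwise_n(rows, "age", "weight"))     # 4
print(pairwise_n(rows, "age", "height"))     # 5
print(pairwise_n(rows, "weight", "height"))  # 3

weights = [r["weight"] for r in rows]
observed = [w for w in weights if w is not None]
imputed = mean_substitute(weights)
print(statistics.variance(observed) > statistics.variance(imputed))  # True
```

Listwise deletion throws away half of this data set, pairwise deletion gives a different n for every pair of variables (which is where the computational problems come from), and mean substitution shrinks the variance because the filled-in values add no spread.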
