#3 Clustered bootstrap: Accounts for data dependency among observations within clusters. How: 1) Draw a sample from the original data, but this time sample clusters with replacement (instead of individual observations), so you get the same number of clusters (level-2 units) as in the original sample. The dependency among measurements within a cluster is preserved and reflected in the computations. 2) Fit a linear regression to each bootstrap sample and estimate the regression coefficients. Advantages: handles dependency (accurate inference for hierarchical data); can be used with other outcome types (binary, multinomial, categorical). Due to the long data format: can handle missing data, unbalanced data, and time-varying covariates.
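The resampling steps above can be sketched in pure Python; the ward data, cluster labels, and helper names below are made up for illustration, and the regression is a one-predictor OLS fit:

```python
import random
import statistics

# Hypothetical toy data: 4 clusters (wards), each with repeated (x, y) measurements.
clusters = {
    1: [(1.0, 2.1), (2.0, 2.9), (3.0, 4.2)],
    2: [(1.5, 2.0), (2.5, 3.1), (3.5, 3.9)],
    3: [(1.0, 1.8), (2.0, 3.3), (3.0, 4.0)],
    4: [(1.2, 2.4), (2.2, 3.0), (3.2, 4.5)],
}

def ols_slope(points):
    """Simple-regression slope b1 = cov(x, y) / var(x)."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def cluster_bootstrap(clusters, n_boot=1000, seed=1):
    """Resample whole clusters with replacement, refit the regression each time."""
    rng = random.Random(seed)
    ids = list(clusters)
    slopes = []
    for _ in range(n_boot):
        sample = []
        for _ in ids:  # same number of level-2 units as in the original sample
            sample.extend(clusters[rng.choice(ids)])
        slopes.append(ols_slope(sample))
    return slopes

slopes = cluster_bootstrap(clusters)
est = ols_slope([p for pts in clusters.values() for p in pts])
se = statistics.stdev(slopes)  # bootstrap SE that respects the clustering
print(round(est, 3), round(se, 3))
```

Because whole clusters are drawn, the within-cluster dependency travels into every bootstrap sample, which is exactly why the resulting SE is valid for hierarchical data.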
Modeling of longitudinal clustered data: Same approach as before, but instead of looking between persons, you look between groups/wards. As in the example: 1. Exploration: plot the different predictors against Y (for all wards together, per ward, per ward type). Perform a linear regression per ward type predicting Y from X. Overall: check for linearity, patterns, etc. 2. Build regression models for each level-2 unit: you now do this per group instead of per person (as in longitudinal data). 3. Multi-level modelling: a) Start with the unconditional means model (intercepts only): gives the ICC and the within- (residual) and between- (intercept) variance. b) Add the predictor to the random-intercept model: compare the variances and estimates > compare models using the LRT > fit a model with both random slopes and intercepts. Look at the correlation/covariance of the random effects: values close to 1.0 or -1.0 are suspicious (they would mean perfect correlation) and indicate the model is probably overfitted: too many random effects or overly complex relationships. Look at the estimates and their t-values. Interpreting the correlation columns: 1st column = correlation with the first term (intercept), 2nd column = correlation with the second term (first predictor), 3rd column = correlation with the next predictor. Solution: build the model with a stepwise approach, where you add one random effect (slope) at a time and check whether the model improves > compare these models with the random-intercept-only model using the LRT. Finalizing: once you have found the best model, improve it further by removing insignificant predictors.
Adjusting the p-value when models differ only in random effects: Problem with the LRT: a variance can never become negative, but the chi-squared distribution assumes the parameter can vary on both sides of its null value. Because the null value (variance = 0) lies on the boundary, the standard test is too conservative (p-values too high), making it harder to detect significant effects. This needs correction! Note the difference between the number of random effects and the number of parameters: #random effects = number of random intercepts and slopes; #parameters = variance of the random intercepts, variance of the random slopes, and their covariance (the estimates the model gives about variation or relations between variables). Correction for X vs. X+1 random-effect parameters: take the mean of 1 - chisq(value, df = X+1) and 1 - chisq(value, df = X), i.e., a 50:50 mixture of the two chi-squared distributions. It is not a good idea to add many random effects simultaneously: then you no longer know how to correct the p-value.
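The 50:50 mixture correction above can be computed directly; this stdlib-only sketch assumes the common case of testing 1 vs. 2 variance parameters, using closed-form chi-squared survival functions for df = 1 and df = 2 (the LRT value 5.0 is an arbitrary illustration):

```python
import math

def chi2_sf(x, df):
    """Survival function 1 - CDF of chi-squared; closed forms for df = 1 or 2."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("only df 1 and 2 implemented in this sketch")

def mixture_p(lrt_value, df_small, df_large):
    """Corrected p-value: mean of the p-values from the two chi-squared dfs."""
    return 0.5 * (chi2_sf(lrt_value, df_small) + chi2_sf(lrt_value, df_large))

naive = chi2_sf(5.0, 2)           # testing against chi-squared(2) only
corrected = mixture_p(5.0, 1, 2)  # 50:50 mixture of chi-squared(1) and (2)
print(naive, corrected)
```

The corrected p-value is smaller than the naive one, which is the point of the correction: the naive boundary test is too conservative.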
Missing data: Can have significant consequences for the statistical analysis and its conclusions. Consequences: 1) Fewer data than planned > less power, more type-II errors (failing to reject a false H0). 2) Biases: effect bias (relations inaccurately estimated), representativeness (of the population), wrong CIs and p-values.
Response indicator (R): denotes whether a value is observed (R=1) or missing (R=0). Missing-data mechanisms: 1) MCAR (missing completely at random): P(R=0 | Y, X) = P(R=0). Missingness is unrelated to both the observed and the unobserved data (a strict assumption). 2) MAR (missing at random): P(R=0 | Y, X) = P(R=0 | X). Missingness is related to the observed data, not to the unobserved data. E.g., gender is observed and men have more missing data. 3) NMAR (not missing at random): missingness is related to the unobserved data itself. E.g., people with high incomes have more missing values on income variables than people with lower incomes. NMAR missingness is non-ignorable and introduces bias if not properly addressed. How to know: look for patterns in the missing data. Strategies to deal with missing data: #Simple methods: 1. Listwise deletion (exclude cases with missing values from the analysis): simple, correct SEs and p-values under MCAR, but wasteful, non-representative, and biased under MAR and NMAR. 2. Pairwise deletion: instead of excluding entire cases, include observations as long as the required variables are present. Less wasteful, but only valid under MCAR and prone to computational problems: avoid! 3. Mean substitution: replacing missing values with the mean of the observed values. Avoid: biased under MAR; the mean does not represent the distribution and distorts it. Predictive mean matching (PMM) also works with non-linear data (or skewed data, as in the practical). How: fit a regression model on the observed data > predict values > for each missing data point, find a set of observed values (donors) whose predicted values are closest to the predicted value of the missing data point > impute by drawing from the donors > pool the results. This preserves the distribution of the variables.
Extra: Practice exam:
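The PMM steps described above (fit > predict > find donors > draw an observed value) can be sketched in pure Python for a single predictor; the data, donor count, and helper names are made up for illustration, and real software repeats this per imputed dataset:

```python
import random
import statistics

# Hypothetical observed (x, y) pairs and cases whose y is missing.
observed = [(1, 2.0), (2, 2.9), (3, 4.1), (4, 5.2), (5, 5.8), (6, 7.1)]
missing_x = [2.5, 4.5]  # x values of cases with missing y

# 1) Fit the imputation model (simple linear regression) on the observed data.
xs = [x for x, _ in observed]
ys = [y for _, y in observed]
mx, my = statistics.fmean(xs), statistics.fmean(ys)
b1 = sum((x - mx) * (y - my) for x, y in observed) / sum((x - mx) ** 2 for x in xs)
b0 = my - b1 * mx
predict = lambda x: b0 + b1 * x

def pmm_impute(x_missing, n_donors=3, rng=random.Random(7)):
    """2) Find donors whose predicted values are closest to the target,
    3) impute by drawing one of their actually observed y values."""
    target = predict(x_missing)
    donors = sorted(observed, key=lambda p: abs(predict(p[0]) - target))[:n_donors]
    return rng.choice(donors)[1]

imputed = [pmm_impute(x) for x in missing_x]
print(imputed)
```

Because every imputed value is an actually observed y, the distribution (and any skew) of the variable is preserved, which is PMM's main advantage over imputing raw predictions.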
Multiple imputation is flexible (can handle different data types: categorical/continuous), has small bias, and repairs problems even under extreme MAR. The fully conditional specification (FCS) method works as follows: 1) Define an imputation model for each incomplete variable, using the relevant variables. 2) Fill in the missing values with initial guesses. 3) Iterate over the variables one at a time: use the updated model to predict values for the current variable and replace the guesses. Repeat for every incomplete variable until the imputations stabilize. M = number of imputed datasets. Run the analysis on all M datasets and pool: pooled point estimate Q̄ = average of the M estimates. Within-imputation variance Ū: square each SE of the estimates and average them. Between-imputation variance B = [(estimate_1 - Q̄)² + (estimate_2 - Q̄)² + ...] / (M - 1). Total variance T = Ū + (1 + 1/M)·B; pooled SE = √T. 95% CI = Q̄ ± 1.96·√T. Advantages: correct analysis (uncertainty is taken into account), general method. Disadvantages: more work, extra pooling step. From PRAC: Interpreting imputation plots: plot the mean and SD of the imputed variables against the iterations to see how the algorithm converges on its estimates. Every colored line is a different imputed dataset; the lines should mix and converge to one stable level. Also look at scatterplots (Y vs. X) to check whether the imputed data fall in the same range/pattern as the observed data. Determine: 1) the effect of separate predictors (pooled linear regression), 2) the overall model significance (F/Wald test), 3) specific effects of variables (and interactions). meth = method used for imputation (pmm, polyreg, norm). pred = predictor matrix: indicates which variable is predicted from which other variables. Limitation of these pooling techniques: not applicable to multivariate tests (MANOVA, effect sizes, χ²-tests), only to single-parameter tests. Solution: multiple-parameter pooling: pool a set of parameter estimates. There is no longer one SE but a covariance matrix from which SEs can be derived > pool the covariance matrices from the M imputed datasets > pooled F-value. Definitions: M = number of imputed datasets, Qm = parameter estimate in dataset m, Q̄ = pooled parameter estimate, Um = variance of the parameter estimate Qm, Ū = within-imputation variance (average of the variances over all imputed datasets), B = between-imputation variance. Applications: #1: F-test for the significance of R² in linear regression: set the number of imputations > impute the datasets > fit the regression on each dataset > pool the results > combine the estimates into Q̄ and the covariance matrices into the total covariance matrix T > fill in the formula that gives the overall pooled F-test (the degrees of freedom depend on the number of tested parameters and on M). Interpreting the F-test for regression models: H0: all regression coefficients = 0 (no explanatory power), Ha: at least one coefficient ≠ 0 (explanatory power).
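The single-parameter pooling rules above (Q̄, Ū, B, T) can be sketched as follows; the M = 5 coefficient estimates and SEs below are made-up numbers, and multi-parameter pooling would replace the scalar variances with covariance matrices:

```python
import math

def pool(estimates, ses):
    """Rubin's rules for one parameter across M imputed datasets."""
    m = len(estimates)
    q_bar = sum(estimates) / m                              # pooled point estimate
    u_bar = sum(se ** 2 for se in ses) / m                  # within-imputation variance
    b = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)  # between-imputation variance
    t = u_bar + (1 + 1 / m) * b                             # total variance
    se_pool = math.sqrt(t)
    ci = (q_bar - 1.96 * se_pool, q_bar + 1.96 * se_pool)
    return q_bar, se_pool, ci

# Hypothetical estimates of one regression coefficient from M = 5 imputed datasets.
q_bar, se_pool, ci = pool([0.42, 0.38, 0.45, 0.40, 0.44],
                          [0.10, 0.11, 0.10, 0.12, 0.10])
print(q_bar, se_pool, ci)
```

Note how the pooled SE is larger than the average per-dataset SE: the (1 + 1/M)·B term adds the extra uncertainty caused by the missing data itself.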
#2: F-test for the significance of the change in R² between a larger and a smaller model: same as before, but now testing the 2 newly added regression weights: df = 2. When insignificant: the larger model does not improve on the previous one. #3: F-test for an ANOVA: compare Y between groups (with multiple imputed datasets): formulate the ANOVA as a regression model. How: encode the categorical variable as numerical variables with dummy coding. Classical dummy coding: variable = 1 when the condition is true (0 if not). We use k-1 dummy variables for k categories (to avoid multicollinearity); the last category is the reference category and does not get its own variable. Example: X1 = 1 for Greenham, 0 otherwise; X2 = 1 for Newbay, 0 otherwise; reference = X1 = X2 = 0. Problem: when pooling ANOVA results based on regression dummy coding, the F-test has a different meaning: in dummy-coded regression the constant B0 equals the mean of the reference category (effects = differences from the reference mean), whereas in classical ANOVA the constant equals the overall mean of the dataset (effects = differences from the overall mean). Solution: effect coding: the reference group is coded -1. Example (reference = Newbay): X1 = 1 for Greenham, -1 for Newbay, 0 otherwise (Wheaton); X2 = 1 for Wheaton, -1 for Newbay, 0 otherwise (Greenham). Now the regression constant represents the overall mean, just as in ANOVA!
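The difference in what the constant means can be shown numerically. This sketch uses made-up, balanced outcome data for the three ward types and writes the saturated one-way model's coefficients down directly from the group means (dummy reference = Wheaton, as in the dummy-coding example above):

```python
import statistics

# Hypothetical balanced data: outcomes per ward type (values are made up).
y = {"Greenham": [4.0, 5.0, 6.0], "Newbay": [7.0, 8.0, 9.0], "Wheaton": [1.0, 2.0, 3.0]}
means = {g: statistics.fmean(v) for g, v in y.items()}
grand_mean = statistics.fmean(v for vals in y.values() for v in vals)

# In a saturated one-way model the fitted values are the group means, so:
# Dummy coding (reference = Wheaton): constant = mean of the reference category.
b0_dummy = means["Wheaton"]
b_greenham = means["Greenham"] - b0_dummy      # effect = difference from reference mean

# Effect coding: constant = unweighted mean of the group means,
# which equals the overall mean when the groups are balanced (as in ANOVA).
b0_effect = statistics.fmean(means.values())
eff_greenham = means["Greenham"] - b0_effect   # effect = difference from overall mean

print(b0_dummy, b0_effect, grand_mean)
```

With balanced groups the effect-coded constant and the overall mean coincide exactly, which is why effect coding makes the pooled regression F-test line up with the classical ANOVA F-test.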