This document is a short summary and overview of the most important topics covered in Statistics 2, including information from the web lectures and Q&As.
Since this is a summary, it may not include ALL the necessary information but this document is helpful for a revision of the most-discussed t...
WEEK 1 ~ Bivariate Linear Regression
1. Describe what a linear regression is.
- Describes the relationship between variables by fitting a straight line: Y = B0 + B1*X
+ B0 => constant (expected value of Y when X=0)
+ B1 => coefficient / slope (expected change in Y when X increases by 1)
- Interpretation: DV will change by approximately *slope* scale points for every one-unit increase in IV (it rises if the slope is positive, declines if it is negative).
- Assumptions: IV can be binary / nominal but DV should be interval / ratio.
2. Describe what influences the size of standard errors.
- Bigger sample size => smaller error
- More variation in X => smaller error
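Both points can be checked with a small sketch. It assumes the standard textbook formula for the standard error of a bivariate slope, sqrt(s^2 / sum((x - mean_x)^2)), which these notes do not state explicitly; the data and the fixed residual variance of 1.0 are made up for illustration.

```python
import math

def slope_standard_error(residual_variance, x):
    """Standard error of a bivariate slope: sqrt(s^2 / sum((x - mean_x)^2))."""
    mean_x = sum(x) / len(x)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return math.sqrt(residual_variance / sxx)

# Hypothetical IV values; residual variance is held fixed at 1.0 for comparison.
narrow_x = [4, 5, 6, 4, 5, 6]   # little variation in X
wide_x = [1, 5, 9, 1, 5, 9]     # same sample size, more variation in X
large_x = narrow_x * 4          # same spread, four times the sample size

se_narrow = slope_standard_error(1.0, narrow_x)
se_wide = slope_standard_error(1.0, wide_x)
se_large = slope_standard_error(1.0, large_x)

print(se_narrow, se_wide, se_large)
```

The same formula also shows that a noisier model (larger residual variance) gives a larger standard error.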
3. What is the “least squares” line?
- The regression line that minimizes the sum of the squared residuals.
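A minimal sketch of how the least squares line can be computed by hand for one IV, using the usual closed-form formulas. The function names and the data are hypothetical and only meant as an illustration.

```python
def least_squares(x, y):
    """Return (constant, slope) of the line minimizing the sum of squared residuals."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    constant = mean_y - slope * mean_x
    return constant, slope

def sum_squared_residuals(constant, slope, x, y):
    """Sum of squared differences between observed and predicted Y."""
    return sum((yi - (constant + slope * xi)) ** 2 for xi, yi in zip(x, y))

# Hypothetical data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b0, b1 = least_squares(x, y)
best = sum_squared_residuals(b0, b1, x, y)
# Any other line, e.g. one with a slightly steeper slope, fits worse.
worse = sum_squared_residuals(b0, b1 + 0.1, x, y)
print(b0, b1, best, worse)
```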
4. Calculating predicted values.
- Expected value of DV when IV = # : Constant + (Slope * #)
- The difference between the expected values for two IV values that are one unit apart always equals the slope.
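The calculation can be sketched as follows. The constant and slope are made-up numbers, not results from the course data.

```python
def predict(constant, slope, iv_value):
    """Expected value of the DV for a given IV value."""
    return constant + slope * iv_value

# Hypothetical coefficients
constant = 2.2
slope = 0.6

expected_at_3 = predict(constant, slope, 3)
expected_at_4 = predict(constant, slope, 4)

# Expected values one IV unit apart differ by exactly the slope.
difference = expected_at_4 - expected_at_3
print(expected_at_3, expected_at_4, difference)
```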
5. Assess the statistical significance of model coefficients
- Coefficient / Standard Error = t-statistic (the test statistic used to judge statistical significance)
- If the CI includes 0 => not significant
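A small sketch of both checks. It assumes a 95% confidence interval with the large-sample critical value 1.96 (SPSS uses the exact t distribution, so small samples will differ slightly); the coefficients and standard errors are hypothetical.

```python
def t_statistic(coefficient, standard_error):
    """Test statistic for H0: coefficient = 0."""
    return coefficient / standard_error

def confidence_interval(coefficient, standard_error, critical_value=1.96):
    """Approximate 95% CI, assuming the large-sample critical value 1.96."""
    margin = critical_value * standard_error
    return coefficient - margin, coefficient + margin

def is_significant(coefficient, standard_error, critical_value=1.96):
    """Significant when the confidence interval does not include 0."""
    lower, upper = confidence_interval(coefficient, standard_error, critical_value)
    return not (lower <= 0 <= upper)

# Hypothetical estimates
t_precise = t_statistic(0.6, 0.2)          # about 3.0
ci_precise = confidence_interval(0.6, 0.2)  # excludes 0
ci_noisy = confidence_interval(0.6, 0.4)    # includes 0
print(t_precise, ci_precise, is_significant(0.6, 0.2))
print(ci_noisy, is_significant(0.6, 0.4))
```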
WEEK 2 ~ Multiple Linear Regression
1. Assessing “model fit” and its practical considerations
a. R squared
- Shows the proportion of variance in Y explained by the model.
- 1 means it explains everything, 0 means it explains nothing
- Adjusted R Squared corrects for the inflation that occurs when we add additional variables
- Regression sum of squares / Total sum of squares = R squared
b. F statistics
- The F statistic shows the ratio of variance explained by the model to unexplained variance.
- Regression mean square / Residual mean square
- Only shows that at least one coefficient is significant, but not which one.
- Higher F means better fit.
- When running a simple or bivariate linear regression F = t^2
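The fit statistics above can be reproduced from the sums of squares in an ANOVA table. This is a sketch under the usual textbook definitions, with n cases and k IVs; the sums of squares, slope, and X deviations are hypothetical values from a made-up bivariate data set.

```python
import math

def model_fit(ss_regression, ss_residual, n, k):
    """Return (R squared, adjusted R squared, F) for n cases and k IVs."""
    ss_total = ss_regression + ss_residual
    r_squared = ss_regression / ss_total
    adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)
    ms_regression = ss_regression / k            # regression mean square
    ms_residual = ss_residual / (n - k - 1)      # residual mean square
    f_statistic = ms_regression / ms_residual
    return r_squared, adjusted_r_squared, f_statistic

# Hypothetical bivariate model: 5 cases, 1 IV
r2, adj_r2, f = model_fit(ss_regression=3.6, ss_residual=2.4, n=5, k=1)

# Check F = t^2 for the same made-up model: slope 0.6, and the squared
# deviations of X from its mean sum to 10.
slope = 0.6
sxx = 10
se_slope = math.sqrt((2.4 / (5 - 1 - 1)) / sxx)
t = slope / se_slope
print(r2, adj_r2, f, t ** 2)
```

Adjusted R squared is lower than R squared here because the correction penalizes each added IV relative to the sample size.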
2. Describing the concept of “multicollinearity”
- Multicollinearity is the degree of correlation between your IVs
- VIF = 1 / (1 - R^2), where R^2 comes from regressing one IV on the other IVs
- should ideally be smaller than 5
- Tolerance: 1 / VIF => should be above 0.2
- Potential solutions: combining into a single variable, collecting more data
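A short sketch of the two diagnostics, using the standard definitions VIF = 1 / (1 - R^2) and Tolerance = 1 / VIF = 1 - R^2. Here R^2 is assumed to come from an auxiliary regression of one IV on all other IVs; the R^2 values are hypothetical.

```python
def vif(aux_r_squared):
    """Variance inflation factor from the auxiliary R squared of one IV."""
    return 1 / (1 - aux_r_squared)

def tolerance(aux_r_squared):
    """Tolerance is the reciprocal of VIF, i.e. 1 - R squared."""
    return 1 - aux_r_squared

def is_problematic(aux_r_squared, vif_cutoff=5, tolerance_cutoff=0.2):
    """Flag an IV using the rules of thumb from these notes."""
    return vif(aux_r_squared) >= vif_cutoff or tolerance(aux_r_squared) <= tolerance_cutoff

# Hypothetical auxiliary R squared values for two IVs
for name, r2 in [("iv_a", 0.50), ("iv_b", 0.90)]:
    print(name, vif(r2), tolerance(r2), is_problematic(r2))
```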
3. Ordinal variables
- With ordinal variables, the relationship between X and Y isn’t necessarily linear, because the categories may not be equally spaced.
- Treating as categorical => pick a reference category
- Treating as continuous => we assume they’re equally spaced
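The two coding choices can be sketched as follows. The category labels, variable names, and helper functions are hypothetical; in SPSS the same result would come from recoding the variable.

```python
def dummy_code(values, categories, reference):
    """Treat as categorical: one 0/1 dummy per category except the reference."""
    dummy_names = [c for c in categories if c != reference]
    return [{f"is_{c}": int(v == c) for c in dummy_names} for v in values]

def as_continuous(values, categories):
    """Treat as continuous: equally spaced scores 1, 2, 3, ... in category order."""
    return [categories.index(v) + 1 for v in values]

# Hypothetical ordinal variable with three ordered categories
categories = ["low", "mid", "high"]
education = ["low", "high", "mid"]

dummies = dummy_code(education, categories, reference="low")
scores = as_continuous(education, categories)
print(dummies)
print(scores)
```

With dummies, each coefficient is a difference from the reference category; with scores, a single slope is estimated and every step between categories is assumed to have the same effect.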
4. Run and interpret a multiple linear regression in SPSS
- The value of the constant term is $. This represents the expected or mean value of the DV when the IVs in the model equal 0. In this case, this represents the mean value of the DV among those who, say… (the reference group). The values for the other coefficients are … We can see that ^ and * are more supportive of …
1. Understand the differences between moderator and mediator variables
- Moderator: a variable that affects the direction / strength of the relationship between the IV and DV.
- Confounder: a variable that causes both the IV and the DV.
- Post-treatment variables: (general term) variables that are a consequence of the IV we care about and also have an influence on the DV (they are endogenous). We care about X => Y, Z is a nuisance!
+ Mediator: explains the process through which two variables are related. If we ask how much of X => Y goes through Z, then Z is a mediator!
- Exogenous: variables that are potential causes of the IV but are not caused by it. E.g. your level of education doesn’t cause your parents' level of education, but your parents' level of education may affect your level of education, thus affecting your views on migration.
- Heterogeneous effect: effect of X on Y varies by Z
- Homogeneous effect: effect of X on Y doesn't vary by Z
2. Assess influential cases and potential outliers using SPSS
I. Outliers
- Standardized residuals (in absolute value)
+ No cases above 3.29
+ < 1% of cases above 2.58
+ < 5% of cases above 1.96
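II. Influential cases
- Cook’s Distance below 1
+ Overall influence of a case on the model
- Standardized DFBeta: absolute value below 1
+ Coefficient with all cases - coefficient without that one case (then standardized) ~ influence on 1 particular coefficient
- Adjusted PV (good if its difference from the original PV is close to 0)
+ PV for that case from a model in which the case is excluded ~ overall influence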
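SPSS saves these diagnostics for you, but the rules of thumb can be sketched by hand. The block below assumes a bivariate regression, where leverage is 1/n + (x - mean_x)^2 / sum((x - mean_x)^2), and the usual Cook's Distance formula e^2 * h / (p * MSE * (1 - h)^2) with p = 2 estimated parameters. The data and function names are hypothetical.

```python
def residual_check(std_residuals):
    """Apply the 1.96 / 2.58 / 3.29 rules of thumb to standardized residuals."""
    n = len(std_residuals)
    absolute = [abs(r) for r in std_residuals]
    share_above_196 = sum(r > 1.96 for r in absolute) / n
    share_above_258 = sum(r > 2.58 for r in absolute) / n
    any_above_329 = any(r > 3.29 for r in absolute)
    return share_above_196 < 0.05 and share_above_258 < 0.01 and not any_above_329

def cooks_distances(x, y):
    """Cook's Distance per case for a regression of y on a single IV x."""
    n = len(x)
    p = 2  # estimated parameters: constant and slope
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    constant = mean_y - slope * mean_x
    residuals = [yi - (constant + slope * xi) for xi, yi in zip(x, y)]
    mse = sum(e ** 2 for e in residuals) / (n - p)
    leverages = [1 / n + (xi - mean_x) ** 2 / sxx for xi in x]
    return [(e ** 2 / (p * mse)) * h / (1 - h) ** 2
            for e, h in zip(residuals, leverages)]

# Hypothetical data: the first case sits far from the mean of X with a large residual.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
distances = cooks_distances(x, y)
influential = [i for i, d in enumerate(distances) if d > 1]
print(distances, influential)

# Hypothetical standardized residuals for 100 cases: 3 of them exceed 1.96.
std_residuals = [0.5] * 97 + [2.0, -2.1, 2.2]
print(residual_check(std_residuals))
```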
3. Create interaction terms out of existing variables
- When the coefficient of a variable is positive, and its interaction with another variable is negative: the effect of the first variable starts out positive and then grows smaller with each one-unit increase in the other variable.
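A minimal sketch of building an interaction term and reading its coefficients. The interaction is the product of the two variables, and the effect of X at a given value of Z is b_x + b_xz * Z. All numbers and names are hypothetical.

```python
def interaction_term(x, z):
    """Create the interaction variable as the case-by-case product of x and z."""
    return [xi * zi for xi, zi in zip(x, z)]

def effect_of_x(b_x, b_xz, z_value):
    """Effect of a one-unit increase in X on the DV at a given value of Z."""
    return b_x + b_xz * z_value

# Hypothetical variables
x = [1, 2, 3]
z = [0, 1, 4]
x_times_z = interaction_term(x, z)

# Hypothetical coefficients: positive main effect, negative interaction
b_x = 2.0
b_xz = -0.5
effects = [effect_of_x(b_x, b_xz, z_value) for z_value in range(6)]
print(x_times_z)
print(effects)
```

In this made-up example the effect of X shrinks by 0.5 for every unit of Z, reaches zero at Z = 4, and turns negative after that.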