Summary: Advanced Statistics
Index
Agresti Chapter 9: Linear Regression and Correlation
9.1 Linear Relationships
9.2 Least Squares Prediction Equation
Allison Chapter 1: What Is Multiple Regression?
Allison Chapter 2: How Do I Interpret Multiple Regression Results?
Allison Chapter 3: What Can Go Wrong With Multiple Regression?
Allison Chapter 5: How Does Bivariate Regression Work?
Allison Chapter 6: What Are The Assumptions Of Multiple Regression?
Allison Chapter 7: What Can Be Done About Multicollinearity?
Allison Chapter 8: How Can Multiple Regression Handle Nonlinear Relationships?
AGRESTI CHAPTER 9: LINEAR REGRESSION AND CORRELATION
9.1 LINEAR RELATIONSHIPS
x → explanatory variable
y → response variable
Linear function → y = α + βx → expresses observations on y as a linear function of observations on x. The
formula has a straight-line graph with slope β (beta) and y-intercept α (alpha). In the context of
regression analysis, α and β are called regression coefficients.
9.2 LEAST SQUARES PREDICTION EQUATION
When a scatterplot suggests that the model y = α + βx may be appropriate, we use the data to estimate this
line. The notation ŷ = a + bx represents a sample equation that estimates the linear model. The sample
equation is called the prediction equation, because it provides a prediction for the response variable at
every value of x.
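The formulas to calculate a and b are:
$$b = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} \quad \text{and} \quad a = \bar{y} - b\bar{x}$$
where $\bar{x}$ and $\bar{y}$ are the sample means of x and y.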
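The formulas for a and b can be sketched in a few lines of Python. This is a minimal illustration, not part of the source text: the function name least_squares and the data values are made up for the example.

```python
def least_squares(x, y):
    """Return (a, b) for the prediction equation y_hat = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Numerator: sum of cross-products of deviations from the means
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    # Denominator: sum of squared deviations of x from its mean
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


# Made-up illustrative data (assumption, not from the summary)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, b = least_squares(x, y)
print(f"y_hat = {a:.2f} + {b:.2f}x")  # y_hat = 2.20 + 0.60x
```

For these data the means are 3 and 4, the numerator is 6 and the denominator is 10, so b = 0.6 and a = 4 − 0.6 · 3 = 2.2.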
When is an observation a regression outlier?
An observation is a regression outlier when it falls quite far from the trend that the rest of the data follow.
An observation is influential when removing it results in a large change in the prediction equation.
Unless the sample size is large, an observation can have a strong influence on the slope if its x-value
is low or high compared to the rest of the data.
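One way to see what "influential" means is to fit the prediction equation with and without a suspect observation and compare the slopes. The sketch below does that on made-up data; the helper name least_squares and all values are assumptions for illustration only.

```python
def least_squares(x, y):
    """Return (a, b) for the prediction equation y_hat = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx
    return y_bar - b * x_bar, b


# Made-up data: the last point has an x-value far above the others
# and does not follow the upward trend of the first five points.
x = [1, 2, 3, 4, 5, 20]
y = [2, 4, 5, 4, 5, 3]

a_all, b_all = least_squares(x, y)
a_without, b_without = least_squares(x[:-1], y[:-1])

print(f"slope with the extreme point:    {b_all:.3f}")
print(f"slope without the extreme point: {b_without:.3f}")
```

With only six observations, the single point at x = 20 pulls the slope from clearly positive to roughly zero, which is the small-sample situation the text warns about.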
The prediction errors are called residuals. For an observation, the residual is the difference between the
observed value and the predicted value of the response variable, y − ŷ.
We summarize the size of the residuals by the sum of their squared values. This quantity, denoted by SSE
(sum of squared errors), is SSE = ∑(y − ŷ)².
The least squares estimates a and b are the values that provide the prediction equation for which the
residual sum of squares (SSE) is a minimum.
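The minimizing property can be checked numerically: compute SSE for the least squares line and for a few nearby lines, and the least squares line should give the smallest value. This is a small sketch on made-up data; the function names and the alternative lines are chosen only for illustration.

```python
def least_squares(x, y):
    """Return (a, b) for the prediction equation y_hat = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx
    return y_bar - b * x_bar, b


def sse(a, b, x, y):
    """Sum of squared residuals (y - y_hat) for the line y_hat = a + b*x."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))


# Made-up illustrative data (assumption, not from the summary)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, b = least_squares(x, y)
sse_ls = sse(a, b, x, y)

# A few arbitrary alternative lines near the least squares line
alternatives = [(a + 0.5, b), (a - 0.5, b), (a, b + 0.1), (a, b - 0.1)]
sse_alt = [sse(a2, b2, x, y) for a2, b2 in alternatives]

print(f"SSE of least squares line: {sse_ls:.2f}")
print("SSE of alternative lines: ", [round(s, 2) for s in sse_alt])
```

For these data the residuals of the least squares line are −0.8, 0.6, 1.0, −0.6 and −0.2, giving SSE = 2.4; every alternative line has a larger SSE.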
ALLISON CHAPTER 1: WHAT IS MULTIPLE REGRESSION?
Chapter highlights:
1. Multiple regression is used both for predicting outcomes and for investigating the causes of
outcomes.
2. The most popular kind of regression is ordinary least squares but there are other, more
complicated regression methods
3. Ordinary multiple regression is called linear because it can be represented graphically by a
straight line
4. A linear relationship between two variables is usually described by two numbers, the slope and
the intercept.
5. Researchers typically assume that relationships are linear because it’s the simplest kind of
relationship and there’s usually no good reason to consider something more difficult.
6. To do a regression, you need more cases than variables, ideally lots more.
7. Ordinal variables are not well represented by linear regression equations. (An ordinal variable is
similar to a categorical variable, except that its categories have a clear ordering. Example:
socio-economic status with the categories low, middle, high.)
8. Ordinary least squares chooses the regression coefficients (slopes and intercept) to minimize the
sum of squared prediction errors.
9. The R² is the statistic most often used to measure how well the dependent variable can be
predicted from knowledge of the independent variables.
10. To evaluate the least squares estimates of the regression coefficients, we usually rely on
confidence intervals and hypothesis tests.
11. Multiple regression allows us to statistically control for measured variables, but this control is
never as good as a randomized experiment.
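Highlights 8 and 9 can be illustrated with a small sketch of ordinary least squares with two independent variables, followed by R² computed as 1 − SSE/TSS (TSS being the total sum of squares around the mean of y). Everything here is an assumption for illustration: the data are made up, the helper names (solve, ols, predict, r_squared) are hypothetical, and the coefficients are obtained by solving the normal equations with plain Gaussian elimination rather than with a statistics package.

```python
def solve(A, b):
    """Solve the linear system A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z


def ols(rows, y):
    """Least squares coefficients [intercept, b1, b2, ...] from the normal equations."""
    X = [[1.0] + list(r) for r in rows]  # add a column of ones for the intercept
    k = len(X[0])
    XtX = [[sum(xr[i] * xr[j] for xr in X) for j in range(k)] for i in range(k)]
    Xty = [sum(xr[i] * yi for xr, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)


def predict(coefs, row):
    """Predicted value for one case."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))


def r_squared(rows, y, coefs):
    """R-squared = 1 - SSE/TSS."""
    y_bar = sum(y) / len(y)
    sse = sum((yi - predict(coefs, r)) ** 2 for r, yi in zip(rows, y))
    tss = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - sse / tss


# Made-up data: six cases, two independent variables (more cases than variables)
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6), (6, 5)]
y = [9.3, 7.8, 19.4, 17.5, 29.2, 27.9]

coefs = ols(rows, y)
r2 = r_squared(rows, y, coefs)
print("intercept and slopes:", [round(c, 3) for c in coefs])
print("R-squared:", round(r2, 4))
```

The y values were generated as roughly 1 + 2·x1 + 3·x2 plus small disturbances, so the estimated coefficients should land near those values and R² should be close to 1.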
ALLISON CHAPTER 2: HOW DO I INTERPRET MULTIPLE REGRESSION RESULTS?
Chapter highlights:
1. Asterisks after a regression coefficient usually indicate that the coefficient is significantly
different from 0. The most common convention is one star for a p-value below 0.05 and two stars
for a p-value below 0.01. (This convention is not universal.)
2. To interpret the numeric value of a regression coefficient, it’s essential to understand the metrics
of the dependent and independent variables
3. Coefficients for dummy (0,1) variables usually can be interpreted as differences in means on the
dependent variable for the two categories of the independent variable, controlling for other
variables in the regression model.
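The dummy-variable interpretation is easiest to see in the simplest case, a regression with the dummy as the only independent variable. There the slope equals the raw difference between the two group means and the intercept equals the mean of the group coded 0. With other variables in the model, the coefficient becomes a difference that is adjusted for those variables. The sketch below uses made-up data and a hypothetical helper named least_squares.

```python
def least_squares(x, y):
    """Return (a, b) for the prediction equation y_hat = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx
    return y_bar - b * x_bar, b


# Made-up data: d is a dummy (0 or 1), y is the dependent variable
d = [0, 0, 0, 1, 1, 1]
y = [10, 12, 14, 15, 17, 19]

a, b = least_squares(d, y)

group0 = [yi for di, yi in zip(d, y) if di == 0]
group1 = [yi for di, yi in zip(d, y) if di == 1]
mean0 = sum(group0) / len(group0)
mean1 = sum(group1) / len(group1)

print(f"intercept a = {a:.1f}, mean of group 0 = {mean0:.1f}")
print(f"slope b = {b:.1f}, difference in means = {mean1 - mean0:.1f}")
```

Here the group means are 12 and 17, so the slope is 5 and the intercept is 12.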