EDUSPRED.COM
EC226 ECONOMETRICS
(OFFERED BY UNIVERSITY OF WARWICK)
SUMMARY OF HANDOUT 1:
TWO VARIABLES LINEAR REGRESSION ANALYSIS
NOTE: Eduspred is not sponsored or endorsed by any college or university
WWW.EDUSPRED.COM MAIL: ADMIN@EDUSPRED.COM, WHATSAPP: +91-9560560080
LIST OF TOPICS COVERED (QUIZ INCLUDED):
• CORRELATION VERSUS REGRESSION ANALYSIS
• REGRESSION
• CLASSICAL LINEAR REGRESSION MODEL (CLRM) ASSUMPTIONS
• ESTIMATING THE POPULATION PARAMETERS
• PROPERTIES OF OLS ESTIMATORS
• HYPOTHESIS TESTING (5-Step Procedure)
• MEASURE OF GOODNESS OF FIT
• INTERPRETING COEFFICIENTS
ADDITIONAL RESOURCES AND SUPPORT:
• TEST YOUR KNOWLEDGE - ACCESS THE ONLINE QUIZ
• STRUGGLING WITH ECONOMETRICS? SCHEDULE A FREE DISCUSSION CALL
HANDOUT 1: TWO VARIABLES LINEAR REGRESSION ANALYSIS
CORRELATION VERSUS REGRESSION ANALYSIS
Correlation and Covariance: Measures of LINEAR ASSOCIATION
$$Cov(x, y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$$
If 𝐶𝑜𝑣(𝑥, 𝑦) = 0, then there is no linear relationship between x and y.
If 𝐶𝑜𝑣(𝑥, 𝑦) > 0, then there is a positive linear relationship between x and y.
If 𝐶𝑜𝑣(𝑥, 𝑦) < 0, then there is a negative linear relationship between x and y.
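The covariance formula and its sign rules can be checked numerically. The following is a minimal sketch in plain Python; the function name `sample_covariance` and the three small data sets are made up for illustration and are not part of the handout.

```python
def sample_covariance(x, y):
    """Sample covariance with the n - 1 divisor."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-products of deviations from the means, divided by n - 1
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)


# Hypothetical data, chosen only to illustrate the three sign cases
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 6, 8, 10]    # rises with x
y_down = [10, 8, 6, 4, 2]  # falls as x rises
y_none = [1, 3, 2, 3, 1]   # no linear trend in x

print(sample_covariance(x, y_up))    # 5.0  (positive)
print(sample_covariance(x, y_down))  # -5.0 (negative)
print(sample_covariance(x, y_none))  # 0.0  (no linear relationship)
```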
Drawbacks of Covariance:
1) It’s not a scale-free measure.
Example: Let x be height (in inches) and y be weight (in kilograms)
Assume Cov (Height, Weight) = +2000
If you decide to measure y (i.e. weight) in grams instead of kilograms, every weight value is multiplied by
1,000, so the covariance between height and weight becomes 1,000 times larger (+2,000,000) even though the
underlying relationship has not changed.
2) It tells us the direction of the linear relationship, not the strength.
Continuing with the previous example, Cov (Height, Weight) = +2000 indicates a positive linear relationship
(as covariance > 0). However, the number 2000 doesn’t tell us anything about the strength of that
relationship: we can’t say whether it is strong, moderate or weak.
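The scale problem is easy to see with numbers. This is a small sketch, assuming five made-up height and weight observations (they do not reproduce the +2000 figure used in the example); `sample_covariance` is a hypothetical helper implementing the covariance formula with the n - 1 divisor.

```python
def sample_covariance(x, y):
    """Sample covariance with the n - 1 divisor."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)


# Hypothetical observations (not taken from the handout)
height_in = [60, 62, 65, 68, 70]  # inches
weight_kg = [50, 55, 61, 70, 74]  # kilograms
weight_g = [w * 1000 for w in weight_kg]  # same weights in grams

cov_kg = sample_covariance(height_in, weight_kg)
cov_g = sample_covariance(height_in, weight_g)

print(cov_kg)  # 41.25
print(cov_g)   # 41250.0, i.e. 1000 times larger for the same people
```

The data describe exactly the same individuals in both cases, so the size of the covariance on its own cannot be read as the strength of the relationship.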
Solution?
Switch to Correlation. Think of correlation as a modified version of covariance.
$$Corr(x, y) = \rho(x, y) = \frac{Cov(x, y)}{\sqrt{V(x) \cdot V(y)}}$$
Correlation takes care of the drawbacks of the covariance.
1) Correlation is a scale-free measure
Example: Let x be height (in inches) and y be weight (in kilograms)
Assume Corr (Height, Weight) = + 0.75
If you decide to measure y (i.e. weight) in grams instead of kilograms, then the correlation between height
and weight won’t change: the factor of 1,000 multiplies both the covariance and the standard deviation of
weight, so it cancels out.
2) Tells the direction as well as the strength of the linear relationship
Correlation takes values between -1 and +1 (inclusive).
If 𝐶𝑜𝑟𝑟(𝑥, 𝑦) = 0, then there is no linear relationship between x and y.
If 𝐶𝑜𝑟𝑟(𝑥, 𝑦) > 0, then there is a positive linear relationship between x and y.
If 𝐶𝑜𝑟𝑟(𝑥, 𝑦) < 0, then there is a negative linear relationship between x and y.
If 𝐶𝑜𝑟𝑟(𝑥, 𝑦) = + 1, then there is a perfect positive linear relationship between x and y.
If 𝐶𝑜𝑟𝑟(𝑥, 𝑦) = − 1, then there is a perfect negative linear relationship between x and y.
The closer the correlation to + 1, the stronger the positive linear relationship.
The closer the correlation to - 1, the stronger the negative linear relationship.
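Both properties of correlation can be checked with a short sketch. The helper names `sample_cov` and `sample_corr` and all data values below are assumptions made for illustration; the variance V(x) is computed as the sample covariance of x with itself.

```python
import math


def sample_cov(x, y):
    """Sample covariance with the n - 1 divisor."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)


def sample_corr(x, y):
    """Correlation = Cov(x, y) / sqrt(V(x) * V(y))."""
    return sample_cov(x, y) / math.sqrt(sample_cov(x, x) * sample_cov(y, y))


# Hypothetical observations (not taken from the handout)
height_in = [60, 62, 65, 68, 70]
weight_kg = [50, 55, 61, 70, 74]
weight_g = [w * 1000 for w in weight_kg]

corr_kg = sample_corr(height_in, weight_kg)
corr_g = sample_corr(height_in, weight_g)
print(corr_kg, corr_g)  # the two values agree: changing units has no effect

# Points lying exactly on a line give a correlation of +1 or -1
x = [1, 2, 3, 4, 5]
corr_line_up = sample_corr(x, [3 + 2 * xi for xi in x])
corr_line_down = sample_corr(x, [3 - 2 * xi for xi in x])
print(corr_line_up, corr_line_down)
```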
Note: Covariance and correlation measure linear association only. A covariance (and hence correlation) of 0
between two variables does not imply that the variables are independent; they may still have a non-linear
relationship.
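A standard illustration of this note, sketched under the assumption that x takes values symmetric around zero and y = x²: y is completely determined by x, yet the covariance and correlation are both zero. The helper names and data are hypothetical.

```python
import math


def sample_cov(x, y):
    """Sample covariance with the n - 1 divisor."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)


def sample_corr(x, y):
    """Correlation = Cov(x, y) / sqrt(V(x) * V(y))."""
    return sample_cov(x, y) / math.sqrt(sample_cov(x, x) * sample_cov(y, y))


# y depends on x exactly, but the dependence is quadratic, not linear
x = [-2, -1, 0, 1, 2]
y = [xi ** 2 for xi in x]  # [4, 1, 0, 1, 4]

cov_xy = sample_cov(x, y)
corr_xy = sample_corr(x, y)
print(cov_xy)   # 0.0
print(corr_xy)  # 0.0
```

The positive and negative cross-products cancel exactly, so the linear measures report nothing even though knowing x tells you y with certainty.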
Why switch to regression from correlation (covariance)?
Correlation doesn’t tell us anything about cause and effect. If Corr(x, y) = + 0.9, then x and y have a strong
positive linear relationship, i.e. they tend to move together in the same direction. But we can’t tell whether
x causes y, whether y causes x, or whether the two are just moving together without any direct relationship.
REGRESSION
Regression looks at the linear causal association between random variables (causal is the key word here).
To keep it simple, think of 2 variables for now (we can also have more than 2 variables)
y – Dependent Variable or Endogenous Variable or Regressand
x – Independent Variable or Exogenous Variable or Explanatory Variable or Regressor
Population Equations:                                                       Sample Equations:
$E(y \mid x_i) = \alpha + \beta x_i$                                        $\hat{y}_i = a + b x_i$
$y_i = E(y \mid x_i) + \varepsilon_i$                                       $y_i = \hat{y}_i + e_i$
$y_i = \alpha + \beta x_i + \varepsilon_i$                                  $y_i = a + b x_i + e_i$
$\alpha$: Population Intercept Parameter (unknown but constant)             $a$: Estimator of $\alpha$
$\beta$: Population Slope Parameter (unknown but constant)                  $b$: Estimator of $\beta$
$\varepsilon_i$: Random error term (for population), or Disturbance Term    $e_i$: Random error term (for sample), or Residuals
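The difference between the population and sample equations can be made concrete with a simulation. This is only a sketch under stated assumptions: the population values α = 2 and β = 0.5, the error distribution, and the sample size are invented, and the estimators a and b are computed with the usual least-squares formulas (b = Σ(x_i − x̄)(y_i − ȳ) / Σ(x_i − x̄)², a = ȳ − b x̄), which this part of the handout has not yet introduced.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population parameters (unknown in practice)
ALPHA, BETA = 2.0, 0.5

# Draw a sample from the population equation y_i = alpha + beta * x_i + eps_i
n = 200
x = [random.uniform(0, 10) for _ in range(n)]
eps = [random.gauss(0, 1) for _ in range(n)]  # disturbance terms
y = [ALPHA + BETA * xi + ei for xi, ei in zip(x, eps)]

# Least-squares estimators a and b (assumed formulas, see the note above)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b = s_xy / s_xx
a = y_bar - b * x_bar

# Sample equations: fitted values and residuals
y_hat = [a + b * xi for xi in x]
e = [yi - yhi for yi, yhi in zip(y, y_hat)]

print("a =", round(a, 3), " b =", round(b, 3))
print("sum of residuals =", round(sum(e), 10))
```

The estimates a and b land near α and β but not exactly on them, and the residuals e_i are not the same numbers as the disturbances ε_i: the former are measured from the estimated line, the latter from the true one.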