Summary of Methods of Empirical Analysis (Modules 1, 2, 6 and 7)

Summary of lectures including additional background explanations and graphs of Module 1: Regression analysis and using STATA (both lectures) Module 2: Time series analysis (both lectures) Module 6: Experimental methods (lecture 2; first lecture covered very basic information on experiments that has...

Last document update: 5 years ago

  • January 16, 2019
  • 2018/2019

Module 1 – Lecture 1
In ordinary least squares (OLS) regression, it is assumed that the relationship
between variables is linear. A regression formula consists of a deterministic
component and a random error; the estimated error of an observation is called
the residual. Least squares refers to how the model is estimated: the regression
line is chosen in such a way that the sum of the squared vertical distances of
the points to the line is as small as possible. The least squares principle
minimizes

$$\sum_{i=1}^{n} \hat{\varepsilon}_i^2 = \hat{\varepsilon}_1^2 + \hat{\varepsilon}_2^2 + \dots + \hat{\varepsilon}_n^2.$$
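The least squares principle can be illustrated with a minimal sketch. The data, the true coefficients (2.0 and 0.5) and the helper names `fit_ols` and `ssr` are made up for illustration and do not come from the lectures.

```python
import numpy as np

def fit_ols(x, y):
    """Return (intercept, slope) that minimize the sum of squared residuals."""
    X = np.column_stack([np.ones_like(x), x])  # design matrix with a constant
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def ssr(beta, x, y):
    """Sum of squared vertical distances from the points to the line."""
    residuals = y - (beta[0] + beta[1] * x)
    return float(np.sum(residuals ** 2))

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=50)
y = 2.0 + 0.5 * x + rng.normal(0, 1, size=50)  # hypothetical data

beta_hat = fit_ols(x, y)
best = ssr(beta_hat, x, y)
# Any other line, such as this slightly shifted one, has a larger sum.
shifted = ssr(beta_hat + np.array([0.1, -0.05]), x, y)
print(beta_hat, best, shifted)
```

Moving the fitted line in any direction increases the sum of squared residuals, which is what "least squares" refers to.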


Gauss-Markov theorem: if the first five OLS assumptions are satisfied, then the
least squares estimator is the Best Linear Unbiased Estimator (BLUE) of the
regression coefficients. "Linear" means that the estimator is a linear
combination of the observations of Y; "best" means that it is the parameter
estimate with the smallest variance among such estimators; "unbiased" means that
the expected value of the parameter estimated by the model is equal to its
population value. An unbiased estimator with minimal variance is called
efficient, and an estimator that approaches the true population value as sample
size increases is called consistent. The OLS assumptions are:

1. all variables are measured at the interval level and without error,
   • An error in Y is not problematic, because it is absorbed by the error
     term. An error in X is difficult to correct, and leads to an
     underestimation of the true population value of β in a bivariate model,
     and often in multivariate models as well.
2. for each value of the independent variables, the mean value of the error
term = 0,
   • For example: the left-hand figure is good; the right-hand figure is
     non-linear or needs another predictor.




3. homoskedasticity: the error term has the same variance for each value of the
independent variables,
   • For example: if variation in income is higher among older people, the
     errors are heteroskedastic.
   • Under heteroskedasticity, the OLS estimates are still linear and unbiased,
     but not "best". The standard errors of the parameters are biased, and
     statistical tests are thus not reliable.
4. no autocorrelation: $\operatorname{cov}(e_i, e_j) = 0$ for $i \neq j$, that is, the error terms should be uncorrelated,
5. each independent variable is uncorrelated with the error term,
6. no multicollinearity: no independent variable should be perfectly (or
approximately) linearly related to one or more of the other independent
variables,
7. the conditional errors are normally distributed.
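
The remark under assumption 1, that measurement error in X leads to underestimation of β while measurement error in Y does not, can be checked with a small simulation. This is a sketch under made-up assumptions: a true slope of 2, standard normal X, and measurement errors with variance 1; the helper `slope` is hypothetical.

```python
import numpy as np

def slope(x, y):
    """Bivariate OLS slope: cov(x, y) / var(x)."""
    return float(np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1))

rng = np.random.default_rng(1)
n = 20_000
x_true = rng.normal(0, 1, size=n)
y = 1.0 + 2.0 * x_true + rng.normal(0, 1, size=n)  # true beta = 2 (assumed)

x_noisy = x_true + rng.normal(0, 1, size=n)  # measurement error in X
y_noisy = y + rng.normal(0, 1, size=n)       # measurement error in Y

slope_clean = slope(x_true, y)        # close to 2
slope_y_err = slope(x_true, y_noisy)  # still close to 2, only noisier
slope_x_err = slope(x_noisy, y)       # pulled toward 0
print(slope_clean, slope_y_err, slope_x_err)
```

With these variances the slope estimated from the noisy X shrinks toward roughly half the true value, because the noise inflates the variance of X without adding covariance with Y.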

Note that the last two are not part of the BLUE criteria. The error terms should
graphically look like this, with a mean of zero, constant variance and a normal
distribution.




Heteroskedasticity, for example, looks like this. Note that the variance is not the
same for each value of X.
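
A numerical counterpart to such a plot is to compare the spread of the residuals for low and high values of X. The sketch below simulates errors whose standard deviation grows with X; the data and coefficients are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2_000
x = rng.uniform(1, 10, size=n)
errors = rng.normal(0, 0.5 * x)   # error sd grows with x: heteroskedastic
y = 1.0 + 2.0 * x + errors        # hypothetical model

X = np.column_stack([np.ones(n), x])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_hat

# Compare residual variance below and above the median of x.
low = resid[x < np.median(x)]
high = resid[x >= np.median(x)]
var_low, var_high = float(np.var(low)), float(np.var(high))
print(beta_hat[1], var_low, var_high)
```

The slope estimate stays close to the value used in the simulation, but the residual variance is clearly larger in the upper half of X.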

These conditions are tested by residual analysis. A residual is the vertical
distance of an observation to the regression line. Aims of residual analysis are:

1. global evaluation of the model: are important variables lacking? Is the
relationship between X and Y linear? Are predictors too strongly correlated?
2. examining individual cases, especially if N < 500. Do specific observations
fit badly? Do they influence the estimation of the betas too much?
3. checking the trustworthiness of statistical test outcomes.

A scatter plot shows the association between two variables (in a single-predictor
OLS model, between X and Y). When there are several predictors, you could make
a scatter plot for each of them with Y, but then the effects of all other
predictors are not taken into account. A partial plot looks exactly like a
scatter plot, but plots two sets of residuals against each other: the residuals
of Y regressed on the remaining predictors against the residuals of the
predictor of interest regressed on the remaining predictors.
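
A minimal sketch of the two residual series behind a partial plot, assuming two correlated predictors `x1` and `x2` with made-up coefficients; the helper `residuals` is hypothetical. The slope through the partial plot equals the coefficient of `x1` in the full model.

```python
import numpy as np

def residuals(target, others):
    """Residuals from an OLS regression of `target` on `others` plus a constant."""
    X = np.column_stack([np.ones(len(target)), others])
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    return target - X @ beta

rng = np.random.default_rng(3)
n = 500
x2 = rng.normal(size=n)
x1 = 0.6 * x2 + rng.normal(size=n)  # correlated predictors
y = 1.0 + 1.5 * x1 - 2.0 * x2 + rng.normal(size=n)

# The two axes of the partial plot for x1.
e_y = residuals(y, x2)    # part of Y not explained by x2
e_x1 = residuals(x1, x2)  # part of x1 not explained by x2
partial_slope = float(e_x1 @ e_y / (e_x1 @ e_x1))

# Coefficient of x1 in the full multiple regression, for comparison.
X_full = np.column_stack([np.ones(n), x1, x2])
beta_full, *_ = np.linalg.lstsq(X_full, y, rcond=None)
print(partial_slope, beta_full[1])
```

Plotting `e_x1` on the horizontal axis and `e_y` on the vertical axis gives the partial plot for `x1`.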

Dealing with influential cases
To find (overly) influential observations, LEVER is a numerical instrument that
can be used. It measures the distance of the value of an individual observation
on a predictor to the center of the values of the other observations on that
predictor, along with the influence resulting from this distance. Values in the
center have a distance of approximately 0, indicating no influence. Higher
values indicate a stronger influence; critical value $> 2p/n$, where p = number
of predictors and n = number of observations. MAHAL is similar to LEVER.
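
The leverage check can be sketched as follows. The sketch assumes that LEVER corresponds to the centered leverage $h_{ii} - 1/n$ taken from the diagonal of the hat matrix; the data and the function name `centered_leverage` are made up for illustration.

```python
import numpy as np

def centered_leverage(X):
    """Centered leverage h_ii - 1/n for an (n, p) matrix of predictors."""
    n = X.shape[0]
    Xc = np.column_stack([np.ones(n), X])     # add the constant
    H = Xc @ np.linalg.inv(Xc.T @ Xc) @ Xc.T  # hat matrix
    return np.diag(H) - 1.0 / n

rng = np.random.default_rng(4)
n, p = 40, 2
X = rng.normal(size=(n, p))
X[0] = [6.0, -6.0]  # one observation far from the center of the predictors

lever = centered_leverage(X)
cutoff = 2 * p / n  # critical value from the text
flagged = np.flatnonzero(lever > cutoff)
print(cutoff, lever[0], flagged)
```

In this sketch the squared Mahalanobis distance of an observation equals $(n - 1)$ times its centered leverage, which is one way to read the statement that MAHAL is similar to LEVER.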

Cook's distance D and DfFit both show the difference between betas estimated
with and without an individual observation. The larger this difference, the more
influential the observation. Critical value of Cook's D $> 4/n$, critical value
of DfFit $> 2\sqrt{p/n}$.
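
Both measures can be computed from a single fit without re-estimating the model for every observation. The sketch below uses the standard closed-form expressions for Cook's D and for the standardized version of DfFit (often written DFFITS), which is an assumption about which variant the cut-off refers to; the data and the function name `influence_measures` are made up.

```python
import numpy as np

def influence_measures(X, y):
    """Return (Cook's D, DFFITS) for an OLS fit of y on X plus a constant."""
    n = X.shape[0]
    Xc = np.column_stack([np.ones(n), X])
    k = Xc.shape[1]                                # parameters incl. constant
    XtX_inv = np.linalg.inv(Xc.T @ Xc)
    e = y - Xc @ (XtX_inv @ Xc.T @ y)              # residuals
    h = np.einsum("ij,jk,ik->i", Xc, XtX_inv, Xc)  # hat values h_ii
    s2 = e @ e / (n - k)                           # residual variance
    cooks_d = e**2 * h / (k * s2 * (1 - h) ** 2)
    # Residual variance with observation i left out.
    s2_del = ((n - k) * s2 - e**2 / (1 - h)) / (n - k - 1)
    dffits = e * np.sqrt(h) / (np.sqrt(s2_del) * (1 - h))
    return cooks_d, dffits

rng = np.random.default_rng(5)
n, p = 30, 1
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)
x[0], y[0] = 6.0, 28.0  # a far-out case that does not follow the line

cooks_d, dffits = influence_measures(x, y)
flag_cook = np.flatnonzero(cooks_d > 4 / n)
flag_dffits = np.flatnonzero(np.abs(dffits) > 2 * np.sqrt(p / n))
print(flag_cook, flag_dffits)
```

The cut-offs are applied to the absolute value of DFFITS, since the measure is signed.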

A good numerical instrument for the detection of influential cases is SDRESID.
RESID is the estimated residual ($\hat{e}_i$, absolute size of error); ZRESID is
the standardized residual ($\hat{e}_i / \sigma$, relative size of error).
Studentized residual (SRESID) is like ZRESID, but with expected value 0 and
variance 1. SDRESID is the studentized deleted residual (does individual i fit
well with all other individuals?).
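
The four residual types can be sketched side by side. The formulas are the textbook definitions, taken here as an assumption about what the software reports under these names; σ is replaced by its estimate from the residuals, and the data and the function name `residual_types` are made up.

```python
import numpy as np

def residual_types(X, y):
    """Return (RESID, ZRESID, SRESID, SDRESID) for y regressed on X plus a constant."""
    n = X.shape[0]
    Xc = np.column_stack([np.ones(n), X])
    k = Xc.shape[1]
    XtX_inv = np.linalg.inv(Xc.T @ Xc)
    resid = y - Xc @ (XtX_inv @ Xc.T @ y)          # raw residual
    h = np.einsum("ij,jk,ik->i", Xc, XtX_inv, Xc)  # hat values h_ii
    s = np.sqrt(resid @ resid / (n - k))           # estimate of sigma
    zresid = resid / s                             # standardized
    sresid = resid / (s * np.sqrt(1 - h))          # studentized
    # Sigma estimated without observation i, then the deleted version.
    s_del = np.sqrt(((n - k) * s**2 - resid**2 / (1 - h)) / (n - k - 1))
    sdresid = resid / (s_del * np.sqrt(1 - h))     # studentized deleted
    return resid, zresid, sresid, sdresid

rng = np.random.default_rng(6)
n = 30
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)
y[0] += 8.0  # one observation that fits badly with the others

resid, zresid, sresid, sdresid = residual_types(x, y)
print(zresid[0], sresid[0], sdresid[0])
```

For the badly fitting case, SDRESID is larger in absolute value than SRESID, because the case itself no longer inflates the estimate of the error variance.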

Detecting and solving heteroskedasticity


Who am I buying these notes from?

Stuvia is a marketplace, so you are not buying this document from us, but from seller jordyhendrix1. Stuvia facilitates payment to the seller.
