Summary of Methods of Empirical Analysis (Modules 1, 2, 6 and 7)

Summary of lectures including additional background explanations and graphs of Module 1: Regression analysis and using STATA (both lectures) Module 2: Time series analysis (both lectures) Module 6: Experimental methods (lecture 2; first lecture covered very basic information on experiments that has...

Preview: 3 of 25 pages
Date: 16 January 2019
Academic year: 2018/2019
Document type: Summary
Author: jordyhendrix1

Module 1 – Lecture 1
In ordinary least squares (OLS) regression, it is assumed that the relationship
between variables is linear. A regression formula consists of a deterministic
component and a random error, whose sample estimate is called the residual.
Least squares refers to how the model is estimated: the regression line is
chosen so that the sum of the squared vertical distances of the points to the
line is as small as possible. The least squares principle minimizes

$$\sum_{i=1}^{n} \hat{\varepsilon}_i^2 = \hat{\varepsilon}_1^2 + \hat{\varepsilon}_2^2 + \dots + \hat{\varepsilon}_n^2.$$
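The least squares principle can be illustrated with a minimal Python sketch. The data-generating process (intercept 2.0, slope 0.5) and the helper names `ols_fit` and `ssr_of` are made up for illustration and are not part of the lecture; the point is only that any line other than the fitted one gives a larger sum of squared residuals.

```python
import numpy as np

def ols_fit(x, y):
    """Return (intercept, slope, ssr) of the least squares line through (x, y)."""
    X = np.column_stack([np.ones_like(x), x])      # design matrix with a constant
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)   # minimizes the sum of squared residuals
    residuals = y - X @ beta
    return beta[0], beta[1], float(residuals @ residuals)

# Hypothetical data: a linear relationship plus random error
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=200)
y = 2.0 + 0.5 * x + rng.normal(0, 1, size=200)

a, b, ssr = ols_fit(x, y)

def ssr_of(intercept, slope):
    """Sum of squared residuals for an arbitrary candidate line."""
    e = y - (intercept + slope * x)
    return float(e @ e)

print("fitted line:", a, b)
print("SSR at the OLS estimates:", ssr)
print("SSR after shifting the intercept:", ssr_of(a + 0.1, b))
print("SSR after changing the slope:", ssr_of(a, b - 0.05))
```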


Gauss-Markov theorem: if the first five OLS assumptions are satisfied, then the
least squares estimator is the Best Linear Unbiased Estimator (BLUE) of the
regression coefficients, and of each linear combination of them. "Linear" means
that the estimator is a linear function of the observed values of Y; "best"
means that it has the smallest variance among such estimators; "unbiased" means
that the expected value of the parameter estimated by the model is equal to its
population value. An unbiased estimator with minimal variance is called
efficient, and an estimator that approaches the true population value as the
sample size increases is called consistent. The OLS assumptions are:

1. all variables are measured at the interval level and without error,
   - An error in Y is not problematic, because it is absorbed by the error
     term. An error in X is difficult to correct, and leads to an
     underestimation of the true population value of β in a bivariate model,
     and often in multivariate models as well.
2. for each value of the independent variables, the mean value of the error
   term = 0,
   - For example: left-hand figure is good; right-hand figure is non-linear
     or needs another predictor.




3. homoskedasticity: the error term has the same variance for every value of
   the independent variables,
   - For example: variation in income is higher among older people, which is
     heteroskedasticity.
   - Heteroskedastic estimates are still linear unbiased estimates, but
     not "best". Standard errors of parameters are biased, and statistical
     tests are thus not reliable.
4. no autocorrelation: $\mathrm{cov}(\varepsilon_i, \varepsilon_j) = 0$ for
   $i \neq j$, i.e. the error terms should be uncorrelated,
5. each independent variable is uncorrelated with the error term,
6. no multicollinearity: no independent variable should be perfectly (or
   approximately) linearly related to one or more of the other independent
   variables,
7. the conditional errors are normally distributed.

Note that the last two are not part of the BLUE criteria. Graphically, the error
terms should have a mean of zero, constant variance and a normal distribution.




Heteroskedasticity, for example, shows up in a plot of the residuals against X
as a spread that changes with X: the variance is not the same for each value
of X.
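A small simulation can make this pattern visible without a figure. The sketch below reuses the income and age example from assumption 3, but the numbers (intercept 1000, slope 40, error spread of 5 times age) are invented for illustration. It fits OLS and compares the residual variance in the younger and older half of the sample.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000
age = rng.uniform(20, 65, size=n)

# Hypothetical income model: the spread of the error grows with age
income = 1000 + 40 * age + rng.normal(0, 1, size=n) * (5 * age)

X = np.column_stack([np.ones(n), age])
beta, *_ = np.linalg.lstsq(X, income, rcond=None)
resid = income - X @ beta

# Under homoskedasticity these two variances would be roughly equal
young = resid[age < np.median(age)]
old = resid[age >= np.median(age)]

print("mean residual:", resid.mean())
print("residual variance, younger half:", young.var())
print("residual variance, older half:", old.var())
```

The mean of the residuals is zero by construction whenever the model contains a constant, so it is the difference between the two variances that signals the problem.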

These conditions are tested by residual analysis. A residual is the vertical
distance of an observation to the regression line. Aims of residual analysis are:

1. global evaluation of the model: are important variables lacking? Is the
   relationship between X and Y linear? Are predictors too strongly correlated?
2. examining individual cases, especially if N < 500. Do specific observations
   fit badly? Do they influence the estimation of the betas too much?
3. checking the trustworthiness of statistical test outcomes.

A scatter plot shows the association between two variables (in a single predictor
OLS model between X and Y). When there are several predictors, you could make
a scatter plot for each of them with Y, but then the effects of all other
predictors are not taken into account. A partial plot looks exactly like a
scatter plot, but plots two sets of residuals against each other: the residuals
of Y after regressing it on the remaining predictors, against the residuals of
the predictor of interest after regressing it on those same remaining predictors.
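The idea behind a partial plot can be sketched in Python. The data and the helper `residualize` are hypothetical. The sketch computes the two sets of residuals that form the axes of a partial plot for predictor `x1`, and shows that the slope through those points equals the coefficient of `x1` in the full multiple regression, which is why the plot reflects the effect of `x1` net of the other predictor.

```python
import numpy as np

def residualize(target, others):
    """Residuals of `target` after an OLS regression on `others` plus a constant."""
    Z = np.column_stack([np.ones(len(target)), others])
    coef, *_ = np.linalg.lstsq(Z, target, rcond=None)
    return target - Z @ coef

# Hypothetical data with two correlated predictors
rng = np.random.default_rng(2)
n = 500
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 1.5 * x2 + rng.normal(size=n)

# Full multiple regression of y on x1 and x2
X = np.column_stack([np.ones(n), x1, x2])
beta_full, *_ = np.linalg.lstsq(X, y, rcond=None)

# Coordinates of the partial plot for x1
ey = residualize(y, x2)    # vertical axis: y net of x2
ex = residualize(x1, x2)   # horizontal axis: x1 net of x2
slope_partial = (ex @ ey) / (ex @ ex)

print("coefficient of x1 in the full model:", beta_full[1])
print("slope in the partial plot:", slope_partial)
```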

Dealing with influential cases
To find (overly) influential observations, LEVER is a numerical instrument that
can be used. It measures the distance of the value of an individual observation
on a predictor to the center of the values of the other observations on that
predictor, along with the influence resulting from this distance. Values in the
center have a distance of approximately 0, indicating no influence. Higher
values indicate a stronger influence; the critical value is $> 2p/n$, where
p = number of predictors and n = number of observations. MAHAL is similar to
LEVER.
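As a rough sketch, leverage can be computed from the diagonal of the hat matrix. The code below assumes that LEVER refers to the centered leverage (hat value minus 1/n), which is consistent with central values being close to 0; that reading, the simulated data and the planted extreme value are assumptions made for illustration.

```python
import numpy as np

def hat_values(X):
    """Diagonal of the hat matrix H = X (X'X)^-1 X' for a full-rank design matrix X."""
    Q, _ = np.linalg.qr(X)          # reduced QR: H = Q Q'
    return np.sum(Q**2, axis=1)

rng = np.random.default_rng(3)
n, p = 50, 1                        # n observations, p predictors
x = rng.normal(size=n)
x[0] = 8.0                          # planted observation far from the center of x

X = np.column_stack([np.ones(n), x])
h = hat_values(X)
lever = h - 1.0 / n                 # centered leverage (assumed meaning of LEVER)

cutoff = 2 * p / n                  # critical value from the text
flagged = np.flatnonzero(lever > cutoff)

print("cutoff:", cutoff)
print("flagged observations:", flagged)
print("centered leverage of observation 0:", lever[0])
```

The cutoff is a rule of thumb, so a few ordinary observations may be flagged along with the planted one.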

Cook's distance D and DfFit both show the difference between betas estimated
with and without an individual observation. The larger this difference, the more
influential the observation. The critical value of Cook's D is $> 4/n$; the
critical value of DfFit is $> 2\sqrt{p/n}$.
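The "with and without" logic of Cook's D can be written out literally by refitting the model once per observation. This is a minimal sketch on simulated data with one planted influential case; the function name `cooks_distance` and all numbers are illustrative, and statistical packages use an equivalent closed-form expression rather than n separate fits. DfFit is not covered here.

```python
import numpy as np

def cooks_distance(X, y):
    """Cook's D obtained by leaving each observation out in turn and refitting."""
    n, k = X.shape                              # k = number of estimated betas
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    s2 = resid @ resid / (n - k)                # residual variance of the full model
    d = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        beta_i, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        shift = X @ (beta - beta_i)             # change in all fitted values
        d[i] = shift @ shift / (k * s2)
    return d

rng = np.random.default_rng(4)
n = 40
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)
x[0], y[0] = 6.0, -2.0                          # planted case: extreme x and far off the line

X = np.column_stack([np.ones(n), x])
d = cooks_distance(X, y)
flagged = np.flatnonzero(d > 4 / n)             # critical value from the text

print("flagged observations:", flagged)
print("largest Cook's D:", d.max(), "at observation", d.argmax())
```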

A good numerical instrument for the detection of influential cases is SDRESID.
RESID is the estimated residual ($\hat{\varepsilon}_i$, absolute size of error);
ZRESID is the standardized residual ($\hat{\varepsilon}_i / \sigma$, relative
size of error). Studentized residual (SRESID) is like ZRESID, but with expected
value 0 and variance 1. SDRESID is the studentized deleted residual (does
individual i fit well with all other individuals?).
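The four residual types can be compared side by side in a short sketch. The formulas below are the common textbook definitions and are assumed, not confirmed by the lecture, to match the quantities named above; the data and the planted outlier are simulated.

```python
import numpy as np

def residual_diagnostics(X, y):
    """Return (resid, zresid, sresid, sdresid) for an OLS fit of y on X."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta                        # RESID: raw residual
    Q, _ = np.linalg.qr(X)
    h = np.sum(Q**2, axis=1)                    # hat values (leverage)
    s2 = resid @ resid / (n - k)                # residual variance, all cases
    zresid = resid / np.sqrt(s2)                # ZRESID: scaled by one common sigma
    sresid = resid / np.sqrt(s2 * (1 - h))      # SRESID: scaled by its own std. error
    # Residual variance with case i left out, without refitting the model
    s2_del = ((n - k) * s2 - resid**2 / (1 - h)) / (n - k - 1)
    sdresid = resid / np.sqrt(s2_del * (1 - h)) # SDRESID: studentized deleted residual
    return resid, zresid, sresid, sdresid

rng = np.random.default_rng(5)
n = 60
x = rng.normal(size=n)
y = 0.5 + 1.5 * x + rng.normal(size=n)
y[0] = 0.5 + 1.5 * x[0] + 8.0                   # planted outlier in Y

X = np.column_stack([np.ones(n), x])
resid, zresid, sresid, sdresid = residual_diagnostics(X, y)

print("observation 0:", resid[0], zresid[0], sresid[0], sdresid[0])
print("largest |SDRESID| at observation", np.abs(sdresid).argmax())
```

Because the outlier inflates the variance estimate that ZRESID and SRESID divide by, its SDRESID is larger in absolute value than its SRESID, which is why the deleted version is the more sensitive detector.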

Detecting and solving heteroskedasticity
