
Econometrics for E&BE midterm summary. Chapters 1/2/3/4/5.1/6.2, tutorials and lectures


Elaborate explanations and formulas, theory, examples, applications. Summarises both the book and the lectures, and also puts into practice all subjects covered in the exercises and Stata tutorials.

  • February 23, 2019
  • 30
  • 2019/2020
  • Summary
Econometrics chapter 1 - The nature of econometrics and economic data
Mathematical statistics typically works with experimental data. Econometrics focuses on
nonexperimental data, which are not accumulated through controlled experiments on individuals,
firms or segments of the economy. Such observational or retrospective data are passively collected
by the researcher. Experimental data are often collected in laboratory environments and are
difficult to obtain in the social sciences.

Empirical analysis uses data to test a theory or estimate a relationship. A formal economic model
can be constructed, consisting of mathematical equations that describe various relationships. An
example is an economic model of crime, y = f(x1, x2, ...), where the dependent variable y depends
on the independent variables xi. As is common in economic theory, we do not know the specific form
of the function f, which depends on an underlying utility function that is rarely known. Yet we
can still predict the effect each variable would have on criminal activity.

The determinants of criminal behaviour are reasonable on the basis of common sense. We can use
the same kind of intuition, instead of formal economic theory, to realise that factors such as
education, experience and training affect productivity, which in turn determines the wage. We can
therefore write a simple model wage = f(educ, exper, training). This is the economic model. To
turn it into an econometric model we must specify the function f and deal with variables that
cannot reasonably be observed for a given individual. For those we rely on averages and other
statistics to derive an approximate variable.
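
As an illustration, one possible econometric version of the wage model is linear in the three
observed factors, with an error term u for everything that is left out. The linear form and the
parameter names are assumptions made for this sketch; the economic model itself does not dictate
them.

```latex
\[
  \mathit{wage} = \beta_0 + \beta_1\,\mathit{educ} + \beta_2\,\mathit{exper}
                + \beta_3\,\mathit{training} + u
\]
```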

The ambiguities inherent in the economic model of crime are resolved by specifying a particular
econometric model: crime is written as a function of wage and five other observed factors, with
parameters ß0 to ß6 and an error term u.

The term u contains unobserved factors: u is a random error or disturbance term that summarises
all unobserved factors. We can never eliminate u entirely.

The constants ß0 to ß6 are the parameters of the econometric model. They describe the directions
and strengths of the relationships between crime and the factors used to determine crime. We would
expect ß1, the coefficient on wage, to be negative, since intuitively a higher wage results in
less crime. The ß's are to be estimated from sample data.

After (1) building an economic model and (2) turning it into an econometric model with
unspecified parameters, we (3) state hypotheses about the unknown parameters. An empirical
analysis requires data, so we (4) collect data, (5) apply econometric methods to estimate the
parameters and (6) use the estimated parameters to make predictions or to test, e.g., economic
theory.

Structure of economic data
A cross-sectional data set consists of a sample taken at a given point in time. We assume the
observations have been obtained by random sampling from the underlying population, although this
assumption may be violated in practice. An example is a data set on 500 workers in 1976, with
variables such as wage, experience and education.

Time series data sets consist of observations on one or more variables over time, e.g. GDP and
CPI for the years 1950 to 2018. Most economic and other time series are related over time: GDP in
one year is strongly related to GDP in the previous year. Data frequency matters because many
economic time series display strong seasonal patterns. Monthly data on housing prices, for
example, depend on the weather, among other things.

Pooled cross sections have both cross-sectional and time series features. An example is a yearly
cross-sectional survey, drawn from a new random sample each year, combined over several years.

Panel data or longitudinal data sets consist of a time series for each cross-sectional member in
the data set. We observe a set of variables for a set of individuals over a certain time period.
The difference between panel data and pooled cross sections is that panel data follow the same
cross-sectional units over the whole period.
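
The difference between pooled cross sections and panel data can be made concrete with a small
sketch. The records below are made-up toy values, and the helper names `ids_by_year` and
`is_balanced_panel` are hypothetical, introduced only for this illustration.

```python
# Hypothetical toy records: (person_id, year, wage). All values are made up.

# Pooled cross sections: a fresh random sample is drawn in each year,
# so the person ids in 2017 and 2018 do not have to overlap.
pooled = [
    (101, 2017, 14.0), (102, 2017, 19.5), (103, 2017, 11.2),
    (204, 2018, 15.1), (205, 2018, 18.0), (206, 2018, 12.9),
]

# Panel data: the same units are followed over time.
panel = [
    (1, 2017, 14.0), (2, 2017, 19.5), (3, 2017, 11.2),
    (1, 2018, 14.8), (2, 2018, 20.1), (3, 2018, 11.9),
]


def ids_by_year(rows):
    """Map each year to the set of unit ids observed in that year."""
    out = {}
    for unit_id, year, _wage in rows:
        out.setdefault(year, set()).add(unit_id)
    return out


def is_balanced_panel(rows):
    """True if exactly the same units are observed in every year."""
    id_sets = list(ids_by_year(rows).values())
    return all(s == id_sets[0] for s in id_sets)


print(is_balanced_panel(pooled))  # False
print(is_balanced_panel(panel))   # True
```

Both data sets have the same shape (six rows, two years); only the panel allows us to track how
the wage of one given person changes between 2017 and 2018.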

Random sampling of cross-sectional data gives us independent random variables with a common
probability density function: each observation is an independent, identically distributed (i.i.d.)
random variable.

Causality and ceteris paribus
Finding that ß > 0 does not by itself mean that x causes y.
Correlation is a dependence or association between two random variables, but correlation does not
equal causation.
Ceteris paribus means holding other (relevant) factors equal, and it matters for causal analysis:
if we do not hold other factors such as income fixed, we cannot know the causal effect of, say, a
price change on quantity demanded. If we succeed in holding all other relevant factors fixed and
then find a link, we can conclude that there is a causal effect.
Although ceteris paribus is an important assumption for causality, it is usually impossible to
hold everything else equal, so we aim for holding enough fixed: econometric methods can simulate a
ceteris paribus experiment.

When examining the effect of education on wage we face endogeneity: people choose their education
level, so it is not independent of other factors such as intelligence. Someone born with higher
intelligence is more likely to go to university, but would probably also have earned a higher
wage without the extra education. A simple comparison of wages across education levels therefore
mixes the effect of education with the effect of intelligence.
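
The education example can be illustrated with a small simulation. This is only a sketch under
made-up assumptions: the data generating process, the value `TRUE_RETURN = 0.5` and the helper
`ols_slope` are all hypothetical. Ability raises both education and wage but is not observed, so
the slope from regressing wage on education alone picks up part of the ability effect.

```python
import random

random.seed(0)

TRUE_RETURN = 0.5  # assumed causal effect of one extra year of education on wage
n = 5000

educ, wage = [], []
for _ in range(n):
    ability = random.gauss(0, 1)               # unobserved by the researcher
    e = 12 + 2 * ability + random.gauss(0, 1)  # abler people choose more education
    w = 1 + TRUE_RETURN * e + 3 * ability + random.gauss(0, 1)
    educ.append(e)
    wage.append(w)


def ols_slope(x, y):
    """Slope of the least squares line of y on x."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


slope = ols_slope(educ, wage)
print(f"true return: {TRUE_RETURN}, estimated slope: {slope:.2f}")
```

Under these assumed numbers the estimated slope settles near 0.5 + 3 * 2/5 = 1.7 in large
samples, far above the true return of 0.5, because education is correlated with the omitted
ability term.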

Chapter 2 - The simple linear regression model
Often we are interested in explaining y in terms of x, or in studying how y varies with changes
in x. We must allow for factors other than x to affect y, determine the functional relationship
between x and y, and ensure that we are capturing a ceteris paribus relationship. A simple
relationship would be: y = ß0 + ß1x + u.
This is the simple linear regression model or the two-variable linear regression model. y is the
dependent/explained/left-hand side (LHS) variable, whereas x is the
independent/explanatory/right-hand side (RHS) variable.
ß0 is the intercept parameter, a constant term. ß1 is the slope parameter, of main interest. u is the
error term or disturbance in the relationship, representing factors other than x affecting y; u stands
for unobserved.
If the other factors in u are held fixed, the change in u is zero and x has a linear effect on y:
∆y = ß1∆x if ∆u = 0. So y changes by ß1 times the change in x when the unobserved factors do not
change.
ß1 is the slope parameter in the relationship between y and x, holding the other factors in u
fixed. The intercept parameter ß0 is a constant term and is of less interest than the slope
parameter.
Rearranging gives ß1 = ∆y/∆x, so the slope parameter measures the change in y for a one-unit
change in x, provided the change in the unobserved factors is zero.

The linearity of our equation implies constant returns: a one-unit change in x has the same
effect on y regardless of the initial value of x. For education we might instead expect
increasing returns (a year of a master's programme raises the wage by more than one extra year of
a bachelor's programme). For now, though, we focus on whether the linear model really allows us to
draw ceteris paribus conclusions: how can we learn in general about the ceteris paribus effect of
x on y, holding all other factors fixed?

The error term u includes omitted variables (the unobserved ones), measurement errors in y and x,
non-linearities in the relationship between x and y, and unpredictable or random effects.

As long as the intercept ß0 is included in the equation, we lose nothing by assuming that the
average value of u in the population is zero: E(u) = 0. This assumption says nothing about the
relationship between u and x, so we can make it safely.
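
A short worked step shows why this normalisation is harmless. Suppose the mean of u were some
non-zero constant, called \alpha_0 here (a symbol introduced only for this sketch). The constant
can be moved into the intercept without touching the slope:

```latex
\begin{align*}
  y &= \beta_0 + \beta_1 x + u, \qquad E(u) = \alpha_0 \neq 0 \\
    &= (\beta_0 + \alpha_0) + \beta_1 x + (u - \alpha_0),
       \qquad E(u - \alpha_0) = 0 .
\end{align*}
```

The redefined error has mean zero, the intercept becomes \beta_0 + \alpha_0, and \beta_1 is
unchanged.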

The correlation coefficient is a natural measure of the association between two random variables,
so it is a natural candidate for describing how u and x are related. If u and x are uncorrelated,
they are not linearly related. However, correlation only measures linear dependence, so zero
correlation does not rule out other relations between x and u.

Our assumption is that the average value of u does not depend on the value of x: E (u|x) = E (u).

So the average value of the unobservables is the same across all values of x. If E (u|x) = E (u)
holds, we say that u is mean independent of x, and it follows that cov(x, u) = 0. Combining mean
independence with the assumption E(u) = 0 gives the zero conditional mean assumption:
E (u|x) = 0. Given any value of x, we then expect the value of u to be zero.
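
The step from mean independence to zero covariance can be written out with the law of iterated
expectations. This is a standard argument, added here as a sketch with ß-free notation:

```latex
\begin{align*}
  E(xu) &= E\big[\,x\,E(u \mid x)\,\big] = E\big[\,x\,E(u)\,\big] = E(x)\,E(u), \\
  \operatorname{Cov}(x, u) &= E(xu) - E(x)\,E(u) = 0 .
\end{align*}
```

The converse does not hold in general: zero covariance only rules out a linear relation between
x and u.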

Only if the zero conditional mean assumption holds can we give the effect of x on y a causal
interpretation.

Take wage as a function of education, and assume u is the innate ability a person possesses
regardless of education. The equation E (u|x) = E (u) requires that the average level of ability
is the same regardless of years of education: E(u | 8) must equal E(u | 16), so the average
ability of people with 8 years of education must equal that of people with 16 years. If we think
average ability increases with years of education, we cannot draw ceteris paribus conclusions and
the equation E (u|x) = E (u) = 0 does not hold.
The zero conditional mean assumption gives ß1 a useful interpretation. Taking the expected value
of y = ß0 + ß1x + u conditional on x and using E(u|x) = 0 gives E(y|x) = ß0 + ß1x. This shows
that the population regression function (PRF), E(y|x), is a linear function of x. Linearity means
that a one-unit increase in x changes the expected value of y by ß1.
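
Written out step by step, with ß0 and ß1 typeset as \beta_0 and \beta_1, the population
regression function follows from the model and the zero conditional mean assumption:

```latex
\begin{align*}
  E(y \mid x) &= E(\beta_0 + \beta_1 x + u \mid x) \\
              &= \beta_0 + \beta_1 x + E(u \mid x) \\
              &= \beta_0 + \beta_1 x .
\end{align*}
```

The second line uses that \beta_0 + \beta_1 x is known once x is given, and the third line uses
E(u | x) = 0.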

Given the zero conditional mean assumption E (u|x) = 0 we can view the equation y = ß0 + ß1x + u
in two parts. ß0 + ß1x is the systematic part of y: The part of y explained by x.
The unsystematic part is u, which is the part of y not explained by x.

Deriving the ordinary least squares estimates
Now we must estimate the intercept parameter ß0 and the slope parameter ß1, for which we use a
sample from the population. A random sample of size n from the population gives
yi = ß0 + ß1xi + ui for each observation i, where ui is the error term for observation i: it
contains all factors affecting yi other than xi.
We use these data to estimate the slope and intercept.
In the population, the zero conditional mean assumption implies that u has zero expected value
and that the covariance between x and u is zero: E(u) = 0 and cov(x, u) = E(xu) = 0.
Since u = y - ß0 - ß1x, these two conditions can be written as E(y - ß0 - ß1x) = 0 and
E[x(y - ß0 - ß1x)] = 0. We can use this information to obtain good estimators of ß0 and ß1 given
a sample of data: we choose ß̂0 and ß̂1, the estimates of the unknown population parameters ß0 and
ß1, so that the sample counterparts of these two conditions hold. The fitted value for
observation i is then ŷi = ß̂0 + ß̂1xi.

The estimates ß̂0 and ß̂1 are called the ordinary least squares (OLS) estimates. The difference
between the actual yi and its fitted value ŷi is the residual for observation i: ûi = yi - ŷi.
We want to make the sum of squared residuals as small as possible; squaring ensures that positive
and negative residuals do not cancel out. We thus want to minimise
Σ ûi² = Σ (yi - ß̂0 - ß̂1xi)², where the sum runs over i = 1, ..., n.
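
The quantity being minimised can be sketched as a small function. The function name `ssr` and
the five data points are made up for illustration; they are not taken from the course material.

```python
def ssr(b0, b1, xs, ys):
    """Sum of squared residuals for the candidate line y = b0 + b1 * x."""
    return sum((y - b0 - b1 * x) ** 2 for x, y in zip(xs, ys))


# Made-up sample, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

# Two candidate lines: the second follows the points much more closely,
# so its sum of squared residuals is much smaller.
print(ssr(0.0, 1.0, xs, ys))
print(ssr(0.0, 2.0, xs, ys))
```

OLS picks, among all possible pairs (b0, b1), the one for which this function is smallest.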

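To do this we take the first order conditions for a minimum: we take the derivatives of the sum
of squared residuals with respect to ß̂0 and ß̂1 and set them equal to zero.
After dividing by -2, the derivative w.r.t. ß̂0 gives Σ (yi - ß̂0 - ß̂1xi) = 0 and the derivative
w.r.t. ß̂1 gives Σ xi(yi - ß̂0 - ß̂1xi) = 0.
These are the two first order conditions, which can be solved for the ß̂'s.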
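
For reference, the two first order conditions and their standard solution can be written as
follows, with \bar{x} and \bar{y} denoting the sample averages (notation introduced for this
sketch):

```latex
\begin{align*}
  \sum_{i=1}^{n} (y_i - \hat\beta_0 - \hat\beta_1 x_i) &= 0
    \;\Longrightarrow\; \hat\beta_0 = \bar{y} - \hat\beta_1 \bar{x}, \\
  \sum_{i=1}^{n} x_i (y_i - \hat\beta_0 - \hat\beta_1 x_i) &= 0
    \;\Longrightarrow\;
    \hat\beta_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
                       {\sum_{i=1}^{n} (x_i - \bar{x})^2} .
\end{align*}
```

The slope formula requires that the xi are not all equal in the sample, so that the denominator
is positive.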

Using a data set of 209 CEOs, with their salaries and the return on equity (ROE) of their
companies, we can use the ordinary least squares method to find the regression line that relates
salary to ROE. The hat on salary indicates that it is an estimated equation.
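
A minimal sketch of fitting such a line with the usual closed-form OLS formulas is given below.
The numbers are made up and are NOT the CEO data set from the text, and `ols_fit` is a
hypothetical helper written for this illustration.

```python
def ols_fit(xs, ys):
    """Return (b0_hat, b1_hat) for the simple regression of y on x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x must vary in the sample")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1_hat = sxy / sxx
    b0_hat = y_bar - b1_hat * x_bar
    return b0_hat, b1_hat


# Made-up numbers, NOT the CEO data set from the text.
roe = [10.0, 12.0, 15.0, 20.0, 25.0]
salary = [900.0, 1100.0, 1000.0, 1400.0, 1500.0]

b0_hat, b1_hat = ols_fit(roe, salary)
fitted = [b0_hat + b1_hat * x for x in roe]
residuals = [y - f for y, f in zip(salary, fitted)]

print(f"salary_hat = {b0_hat:.1f} + {b1_hat:.1f} * roe")
# The first order conditions imply both sums below are zero (up to rounding):
print(round(sum(residuals), 6))
print(round(sum(x * u for x, u in zip(roe, residuals)), 6))
```

The last two printed sums are the sample versions of E(u) = 0 and E(xu) = 0: the OLS residuals
average to zero and are uncorrelated with x in the sample by construction.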


Who am I buying these notes from?

Stuvia is a marketplace, so you are not buying this document from us, but from seller bramdelange. Stuvia facilitates payment to the seller.
