TABLE OF CONTENTS
TOPIC 1. INTRODUCTION AND DEALING WITH DATA
1.1. INTRODUCTION
1.2. TYPES OF FINANCIAL DATA
1.3. OBTAINING DATA
1.4. DEALING WITH DATA
1.4.1. DATA TRANSFORMATIONS
1.4.2. GRAPHICAL METHODS
1.4.3. DESCRIPTIVE STATISTICS
1.4.4. CORRELATIONS
MULTIPLE CHOICE QUESTIONS
R: PC Exercise: Data
R: PC Exercise Questions
SUMMARY
TOPIC 2. CLASSICAL LINEAR REGRESSION MODEL (CLRM): OVERVIEW
2.1. WHAT IS A REGRESSION MODEL?
2.2. SIMPLE REGRESSION
2.2.1. THEORY
2.2.2. INTERPRETATION OF OLS ESTIMATES
2.2.3. NONLINEAR REGRESSION MODELS
2.3. CLASSICAL LINEAR REGRESSION MODEL (CLRM)
2.3.1. ASSUMPTIONS
2.3.2. PROPERTIES OLS ESTIMATOR
2.4. PRECISION AND STANDARD ERRORS
2.5. MULTIPLE REGRESSION
SIMPLE REGRESSION OVERVIEW
EXAMPLES
2.5.1. INTERPRETATION OF OLS ESTIMATES
R Multiple Regression: Model
2.6. REGRESSION WITH DUMMY VARIABLES
Self-Study
MULTIPLE CHOICE QUESTIONS
Self-study Questions (Textbook)
R: PC exercise
KEY CONCEPTS
SUMMARY
TOPIC 3. CLASSICAL LINEAR REGRESSION MODEL (CLRM): HYPOTHESIS TESTING
3.1. STATISTICAL INFERENCE
3.1.1. HYPOTHESIS TESTING
3.1.2. Probability Distribution of OLS Estimators (^)
3.1.3. TEST OF SIGNIFICANCE, CONFIDENCE INTERVAL, T-RATIO
R: EXAMPLE
MULTIPLE CHOICE QUESTIONS
3.2. GOODNESS OF FIT STATISTICS: R2 AND ADJUSTED R2
3.2.1. RESIDUAL ANALYSIS
R: Residual Analysis
3.2.2. Adjusted R2 (Solves problem 2)
3.2.3. Goodness of Fit Statistics: R^2
Self-Study
MULTIPLE CHOICE QUESTIONS
Self-study Questions (Textbook)
PC Exercise
KEY CONCEPTS
SUMMARY
TOPIC 4. CLRM ASSUMPTIONS AND DIAGNOSTIC TESTS 1
4.1. CLASSICAL LINEAR REGRESSION MODEL (CLRM): ASSUMPTIONS + PROPERTIES
4.1.1. CLASSICAL LINEAR REGRESSION MODEL (CLRM): ASSUMPTIONS
4.1.2. CLASSICAL LINEAR REGRESSION MODEL (CLRM): PROPERTIES
4.2. VIOLATIONS OR PITFALLS
4.2.1. ASSUMPTION 1: E(UT) ≠ 0
4.2.2. ASSUMPTION 2: HETEROSKEDASTICITY
4.2.3. ASSUMPTION 3: RESIDUAL AUTOCORRELATION
4.2.4. ASSUMPTION 4: OMITTED VARIABLE BIAS
4.2.5. ASSUMPTION 5: UT IS NOT NORMALLY DISTRIBUTED
Self-study
Multiple choice question
Self-study questions (textbook)
R: PC Exercise
Key concepts
Summary
TOPIC 5: CLRM ASSUMPTIONS AND DIAGNOSTIC TESTS 2
5.1. ADDITIONAL VIOLATIONS OR PITFALLS
5.1.1. MULTICOLLINEARITY
5.1.2. WRONG FUNCTIONAL FORM
5.1.3. Other Pitfalls: Overview
5.2. STRATEGY FOR CONSTRUCTING MODEL
Self-study
Multiple choice questions
Some exercises
R: PC exercise
Key concepts
Summary
TOPIC 6: RESEARCH PROJECT
TOPIC 7: TIME SERIES: NON-STATIONARITY AND SPURIOUS REGRESSION
7.1. SPURIOUS REGRESSION
7.1.1. DEFINITION
7.1.2. EXAMPLE
7.1.3. DETECTION
7.1.4. SOLUTION
SUMMARY
7.2. NON-STATIONARITY
7.2.1. MOTIVATION AND DEFINITION
7.2.2. DETERMINISTIC VERSUS STOCHASTIC NON-STATIONARITY
7.2.3. EXAMPLE
7.3. UNIT ROOT TESTING
7.3.0. INTRODUCTION
7.3.1. DICKEY-FULLER TEST
7.3.2. DF CRITICAL VALUES
7.3.3. AUGMENTED DICKEY-FULLER TEST (NOT IN DETAIL!!)
MULTIPLE CHOICE QUESTION
SELF-STUDY QUESTIONS (TEXTBOOK)
R: PC EXERCISES
EXERCISE 1
EXERCISE 2
KEY CONCEPTS
SUMMARY
TOPIC 8: GRANGER CAUSALITY AND VAR
TOPIC 9: LIMITED DEPENDENT VARIABLE MODELS
9.1. MOTIVATION
9.1.1. Examples
9.2. LINEAR PROBABILITY MODEL (TYPE 1)
9.2.1. EXAMPLE
9.2.2. Solution (2 alternative models)
9.3. Logit Model (Type 2)
9.4. Probit Model (Type 3)
9.4.1. LOGIT VERSUS PROBIT
9.5. R: EXAMPLE 1: LIMITED DEPENDENT VARIABLE MODEL
9.5.1. Limited Dependent Variable Model: LPM
9.5.2. Limited Dependent Variable Model: Logit
9.5.3. Limited Dependent Variable Model: Probit
9.6. GOODNESS OF FIT FOR PROBIT AND LOGIT MODELS
9.7. EXAMPLE 2: TEST THE PECKING ORDER HYPOTHESIS
9.8. ORDERED PROBIT MODEL
9.8.1. EXAMPLE
SELF-STUDY
MULTIPLE CHOICE QUESTION
SELF-STUDY QUESTIONS (TEXTBOOK)
R: PC EXERCISES
KEY CONCEPTS
SUMMARY
TOPIC 10: PANEL DATA
10.1. TYPES OF FINANCIAL DATA
10.2. Examples of Panel Data
10.3. Why panel data?
10.4. R: EXAMPLE
10.5. TYPES OF MODELS FOR PANEL DATA
10.5.1. POOLED MODEL
10.5.2. Individual effects models
10.6. FIXED OR RANDOM EFFECT MODEL
FIXED OR RANDOM EFFECTS MODEL: FINAL REMARKS
10.7. ESTIMATING PANEL DATA MODELS
10.8. EXTENSIONS TO INDIVIDUAL EFFECTS MODEL (EXTRA)
10.9. APPLICATIONS
SELF-STUDY
MULTIPLE CHOICE QUESTION
SELF-STUDY QUESTIONS (TEXTBOOK)
PC EXERCISE: RESEARCH QUESTION
KEY CONCEPTS
SUMMARY
TOPIC 11: EVENT STUDY ANALYSIS
11.1. WHAT IS AN EVENT STUDY?
11.2. OUTLINE OF AN EVENT STUDY
11.3. Event Study
11.3.1. Models for measuring normal performance
11.3.2. Computing abnormal returns
11.3.3. Statistical properties of abnormal returns
11.3.4. Testing for significant abnormal returns
11.3.5. Aggregation of abnormal returns
11.3.6. Example: earnings announcements
11.4. Event study in R: Example
11.5. CROSS-SECTIONAL REGRESSIONS
11.6. USE OF DUMMIES IN EVENT STUDIES
KEY CONCEPTS
SUMMARY
For the exam, questions will be more along the lines of: what are the advantages and disadvantages, why are
certain things done, ... You will not be asked to explain an entire model the way it is done in this summary;
the full explanations are given here to provide a complete understanding of the content. Also focus on the
required exercises and R notebooks. You will not have to generate the output yourself, but you will need to
be able to answer questions (like the ones at the end of each topic) using the given output! So know how the
different tests work, what the steps are to compute them, and how to interpret the output and use it to
answer the questions.
For example: if asked to check for heteroscedasticity, you need to know what output to look at and how to
interpret the results.
TOPIC 1. INTRODUCTION AND DEALING WITH DATA
1.1. INTRODUCTION
Purpose of Research Methods or Econometrics
Research Methods in Finance -> Financial Econometrics
What is Econometrics?
- Literal meaning is “measurement in economics” (we do finance)
Definition of financial econometrics: The application of statistical techniques to problems in finance.
Financial Econometrics ≠ Economic Econometrics → Big distinction
Financial data are less exposed to the following problems (earlier courses focused more on economic
econometrics):
- Small samples problem
o If you only have a few data points, it is very difficult to obtain a significant result
o For example:
▪ GDP is published quarterly (4 observations per year), so with 5 years you only have 20 data
points. The more data points, the more degrees of freedom.
▪ Stock prices are daily, so 5 years gives far more data points: about 1250
- Measurement error
o GDP is very difficult to estimate/compute, because you need data for the whole country. Stock
prices do not have this problem: they are observed directly and reflect how investors behave.
- Data revisions
o GDP figures are revised, so they can change over time. A stock's closing price is the price at
the end of the day and will always stay the same.
BUT financial data can be noisy (i.e. it is more difficult to separate trends/patterns from random and
uninteresting features)
• The difficulty is distinguishing what is economically driven (fundamental) from what constitutes
irrational investor behaviour. The noisy part will be captured by the residual (explained later).
General Framework
1. Review what has already been studied in the literature. For example, if we want to study the stock
market, we want to know which variables can affect it. These can be found in previous research and in
financial theory, e.g. the effect of GDP growth on stock returns.
2. Build a model that can be estimated, based on the insights from the theory. Y is the variable you
want to explain, and X contains the variables that influence Y (e.g. GDP, based on financial theory).
3. Collect the data needed and transform it into the right format.
4. Estimate the model using a formula and the data.
5. Check whether the model is adequate by analysing the errors and looking for problems such as
multicollinearity (see Chapters 3 & 4). If it is not adequate, the model has to be changed.
6. If necessary, add variables or reformulate the model, e.g. if the relationship is not linear. For
example, OLS might not work, requiring an alternative model (back to step 2). Or we need to collect more
data or use another time period (step 3). Or we should use another estimation technique (step 4). → So
we adjust the model until it is adequate.
7. Interpret the results properly (significant or not, ...) so that they can be applied later. These are very
important steps!
8. Use for analysis
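As a rough illustration of steps 2 to 7, here is a minimal R sketch on simulated data. Everything in it is an assumption made for the example: the variable names (gdp_growth, stock_ret), the sample size and the "true" coefficients are invented and do not come from the course material.

```r
# Hypothetical illustration of steps 2-7 with simulated data (not course data).
set.seed(123)                                   # make the simulation reproducible

# Step 3: "collect" the data. Here we simulate 200 observations instead.
n          <- 200
gdp_growth <- rnorm(n, mean = 2, sd = 1)        # explanatory variable X
noise      <- rnorm(n, mean = 0, sd = 1)        # part of Y that X does not explain
stock_ret  <- 1 + 0.5 * gdp_growth + noise      # dependent variable Y

# Steps 2 and 4: specify Y = a + b*X and estimate it by OLS.
model <- lm(stock_ret ~ gdp_growth)

# Step 5: a first look at adequacy through the residuals.
res <- residuals(model)
mean(res)                                       # numerically zero for OLS with an intercept

# Step 7: interpret the estimates (sign, size, significance).
summary(model)$coefficients
```

If the residual checks in step 5 had revealed a problem, step 6 would mean going back and changing the formula, the data or the estimation method before interpreting anything.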
Functions
A function is a mapping or relationship between an input or set of inputs and an output.
We write that y, the output, is a function f(x) of the input x, or → y = f(x)
- y could be a linear function of x, where the relationship can be expressed as a straight line
- Or it could be non-linear, where it would be expressed graphically as a curve
o y = a + bx²
- We want to explain y by x: y is a function of x
If the equation is linear, we would write the relationship as → y = a + bx
- y and x are called variables and a and b are parameters
- a is the intercept and b is the slope or gradient
o b is the most important one: it describes the relationship between y and x
Straight lines
Example: Suppose that we were modelling the relationship between a student’s average mark, y (in percent),
and the number of hours studied per year, x
Suppose that the relationship can be written as a linear function
y = 25 + 0.05x
a = 25 & b = 0.05
• x = 0 (the student has not studied at all) => y = 25%
• x = 1000 hours => y = 75%
• With two data points, you can calculate the slope: b = (y2 − y1) / (x2 − x1) = (75 − 25) / (1000 − 0) = 0.05
→ If x increases by 1, y increases by 0.05.
• 1 extra hour of study, so your mark will increase by 0.05 percentage points
• This is a perfectly linear relationship (all data points are on the line). In reality the points will not all be
on the line but rather spread around it, and the line will be an estimate fitted between the points. It is
also possible that a relationship is non-linear.
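The arithmetic of this example can be checked with a few lines of R. This is only a sketch: the function name mark is made up for the illustration, while the values a = 25 and b = 0.05 are the ones from the example.

```r
a <- 25      # intercept: mark (in %) with zero hours studied
b <- 0.05    # slope: extra percentage points per extra hour studied

# Linear function y = a + b*x (hypothetical helper name)
mark <- function(hours) a + b * hours

mark(0)      # no studying
mark(1000)   # 1000 hours of study

# Recover the slope from two points on the line
x <- c(0, 1000)
y <- mark(x)
slope <- (y[2] - y[1]) / (x[2] - x[1])
slope
```

Any two distinct points on the same straight line give the same slope, which is what makes b a single parameter.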
1.2. TYPES OF FINANCIAL DATA
Types of Financial Data
Types of Data:
1. Time-series data (y_t for t = 1, ..., T), e.g. the stock price every day for several years
a. The stock price of KBC in the years 2000, 2001, 2002
b. It changes over time, captured by the subscript t (days, weeks, years, ...)
c. T = number of time periods
2. Cross-sectional data (y_i for i = 1, ..., N), e.g. data on the stock price of N = 100 companies
a. One particular year: the stock price in 2008 for KBC, ING, BNP Paribas
b. N = number of companies, captured by the subscript i
3. Panel data (y_it for i = 1, ..., N and t = 1, ..., T), e.g. the stock price of N firms for T days
a. Cross-section and time series combined, captured by the subscript it (two dimensions)
y is referred to as a variable (e.g. it varies across companies and across years/days)
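To make the three data types concrete, the sketch below builds a tiny example of each as an R data frame. All prices are invented purely for illustration; only the firm names and the subscripts t and i come from the notes.

```r
# Hypothetical prices, invented purely for illustration.
firms <- c("KBC", "ING", "BNP Paribas")

# Time series: one firm, several periods (subscript t)
ts_data <- data.frame(year = 2000:2002, price = c(40, 42, 38))

# Cross-section: several firms, one period (subscript i)
cs_data <- data.frame(firm = firms, price = c(30, 12, 45))

# Panel: every firm in every period (subscripts i and t)
panel <- expand.grid(firm = firms, year = 2000:2002)   # firm varies fastest
panel$price <- c(40, 25, 60, 42, 27, 58, 38, 22, 55)

nrow(ts_data)   # T observations
nrow(cs_data)   # N observations
nrow(panel)     # N * T observations
```

The panel has one row per firm-year combination, so its size is the product of the two dimensions.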
Note: Think about the degree of aggregation! (degree of clustering)
- Example: Individual house price versus house price index
o Index = house price level of all houses in Belgium
o Individual = if you want to compare the prices of individual houses with each other
- Using aggregated data: "You see the big picture, but lose a lot of detail" (house price index for the
country versus for the regions)
Time-Series Data and Frequency
Examples of time-series data:
Series                            Frequency
GDP or unemployment               monthly or quarterly
Government budget deficit         annually
Money supply                      weekly
Value of a stock market index     as transactions occur / daily
Important: Data in a model should have the same frequency of observation
• GDP and the government budget deficit are economic series <-> money supply and the value of a stock
market index are financial series
• You always need to convert to the lowest frequency among the series and estimate with all data at that
same frequency!
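A minimal sketch of bringing two series to the same (lowest) frequency in base R is shown below. The numbers are invented, and averaging the three months within a quarter is just one possible aggregation choice assumed for the example; for prices, taking the end-of-period value is another common option.

```r
# Hypothetical monthly series for one year, invented for illustration.
monthly_rate <- c(2.0, 2.1, 2.2, 2.3, 2.2, 2.1, 2.0, 1.9, 1.8, 1.9, 2.0, 2.1)
month        <- 1:12
quarter      <- (month - 1) %/% 3 + 1     # months 1-3 -> 1, months 4-6 -> 2, ...

# Convert to the lowest frequency: average the three months in each quarter.
quarterly_rate <- tapply(monthly_rate, quarter, mean)

# Hypothetical quarterly GDP growth (already at the lowest frequency).
gdp_growth <- c(0.5, 0.6, 0.4, 0.3)

# Both series now have 4 observations and can be used in one model.
data_q <- data.frame(quarter    = 1:4,
                     rate       = as.numeric(quarterly_rate),
                     gdp_growth = gdp_growth)
data_q
```

Going the other way (turning quarterly GDP into monthly values) would require inventing observations, which is why the lowest frequency sets the standard.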
Cross-Sectional Data
Cross-sectional data are data on one or more variables collected at a single point in time.
Examples of cross-sectional data:
- A poll of usage of internet stockbroking services
o Do you currently use an internet stockbroking service? Answer Yes/No => cross-sectional: the
question is asked at one moment in time and the answers vary across respondents.
- A cross-section of stock returns on the New York Stock Exchange (NYSE)
o Pick one moment and look at the prices of the stocks at that moment = cross-sectional
- A sample of bond credit ratings for UK banks
o Collect all data at one particular moment in time = cross-sectional
Time-Series versus Cross-Sectional Data
1. Examples of problems that could be tackled using a Time-Series Regression
- How the value of a country’s stock index has varied with that country’s macroeconomic
fundamentals.
- How the value of a company’s stock price has varied when it announced the value of its dividend
payment.
- The effect on a country’s currency of an increase in its interest rate
2. Examples of problems that could be tackled using a Cross-Sectional Regression
- The relationship between company size and the return to investing in its shares
- The relationship between a country’s GDP level and the probability that the government will
default on its sovereign debt.
3. Pooled data versus Panel data
- Pooled data treats panel data as one larger cross-sectional sample by not accounting for the time
dimension. → It neglects the fact that the observations might have been taken at different times.
- Pooled data refers to a dataset where observations are combined across time periods and
individuals (or entities), but without necessarily maintaining a consistent link between the
two.
- In pooled data, you treat observations as if they were cross-sectional, even if they come from
different time periods.
- Panel data (or longitudinal data) refers to a dataset where the same entities (individuals, firms,
countries, etc.) are observed across multiple time periods.
Which one to use will always depend on your research question!
Qualitative and Quantitative Data
Quantitative data is numerical: e.g. share price is $25
Qualitative data is not: e.g. in a survey, companies are asked whether an investment was financed through
debt (as opposed to equity or retained earnings). The answer is Yes/No.
A dummy variable equals 0 or 1. It is used for turning qualitative data into quantitative data: e.g. Yes = 1, No = 0
Dummy = converting qualitative to quantitative
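The Yes/No coding above can be sketched in R as follows. The survey answers are invented for the example and the variable names are hypothetical.

```r
# Hypothetical survey answers: was the investment financed through debt?
debt_financed <- c("Yes", "No", "No", "Yes", "Yes")

# Dummy variable: 1 if the answer is "Yes", 0 otherwise
dummy <- ifelse(debt_financed == "Yes", 1, 0)
dummy

# Because the dummy is numeric, it can be summarised or used in a regression.
mean(dummy)   # share of "Yes" answers
```

The mean of a 0/1 dummy is simply the proportion of observations coded as 1.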
Primary versus Secondary Data
Primary data
- Data you collect and process yourself
o It’s not already available
- Takes some time to collect
- How to obtain? Surveys, questionnaires, interviews, ...
Secondary data