The document is a comprehensive and detailed summary of the course Research Methods in Finance, taught by Koen Ingelbrecht in the Commercial Sciences: Finance and Risk Management course at Ghent University. This summary includes all theory lessons, supplemented with carefully written and thoroughly...
TOPIC 3. CLASSICAL LINEAR REGRESSION MODEL (CLRM): HYPOTHESIS TESTING
3.1. STATISTICAL INFERENCE
3.1.1. HYPOTHESIS TESTING
3.1.2. Probability Distribution of OLS Estimators
3.1.3. TEST OF SIGNIFICANCE, CONFIDENCE INTERVAL, T-RATIO
R: EXAMPLE
TOPIC 4. CLRM ASSUMPTIONS AND DIAGNOSTIC TESTS 1
4.1. CLASSICAL LINEAR REGRESSION MODEL (CLRM): ASSUMPTIONS + PROPERTIES
4.1.1 CLASSICAL LINEAR REGRESSION MODEL (CLRM): ASSUMPTIONS
4.1.2 CLASSICAL LINEAR REGRESSION MODEL (CLRM): PROPERTIES
TOPIC 5: CLRM ASSUMPTIONS AND DIAGNOSTIC TESTS 2
5.1. ADDITIONAL VIOLATIONS OR PITFALLS:
5.1.1. MULTICOLLINEARITY
5.1.2. WRONG FUNCTIONAL FORM
5.1.3. Other Pitfalls: Overview
5.2. STRATEGY FOR CONSTRUCTING MODEL
Self study
9.2. LINEAR PROBABILITY MODEL (TYPE 1)
9.2.1. EXAMPLE
9.2.2. Solution (2 alternative models)
9.3. Logit Model (Type 2)
9.4. Probit Model (Type 3)
9.4.1. LOGIT VERSUS PROBIT
9.6. GOODNESS OF FIT FOR PROBIT AND LOGIT MODELS
9.7. EXAMPLE 2: TEST THE PECKING ORDER HYPOTHESIS
9.8. ORDERED PROBIT MODEL
9.8.1. EXAMPLE
10.4. R: EXAMPLE
10.5. TYPES OF MODELS FOR PANEL DATA
10.5.1. POOLED MODEL
10.5.2. Individual effects models
10.6. FIXED OR RANDOM EFFECT MODEL
FIXED OR RANDOM EFFECTS MODEL: FINAL REMARKS
10.7 ESTIMATING PANEL DATA MODELS
10.8 EXTENSIONS TO INDIVIDUAL EFFECTS MODEL (EXTRA)
For the exam, questions will be more along the lines of: what are the advantages and disadvantages, and why are
certain things done? You will not be asked to explain an entire model as is done in this summary; the full
explanations are given here so that you understand the content. Also focus on the required exercises and R
notebooks. You will not have to generate the output yourself, but you must be able to answer the kinds of
questions asked at the end of each notebook using the given output! So know how the different tests work, what
the steps are to compute them, and how to interpret the output and use it to answer the questions.
For example: if asked to check for heteroscedasticity, you need to know which output to look at and how to
interpret the results.
TOPIC 1. INTRODUCTION AND DEALING WITH DATA
1.1. INTRODUCTION
Purpose of Research Methods or Econometrics
Research Methods in Finance -> Financial Econometrics
What is Econometrics?
- Literal meaning is “measurement in economics” (we do finance)
Definition of financial econometrics: The application of statistical techniques to problems in finance.
Financial Econometrics ≠ Economic Econometrics → Big distinction
Financial data are less exposed to the following problems (the earlier courses focused more on econometrics):
- Small samples problem
o If you only have a few data points, it is very difficult to obtain a significant result
o For example:
▪ GDP is published quarterly (4/year), so with 5 years you only have 20 data points. The more data
points, the more degrees of freedom.
▪ Stock prices are daily, so over 5 years you have many more data points: about 1250
- Measurement error
o GDP is very difficult to estimate/compute, because you need data on the whole country. With
stock prices we do not have problems like this: they are observed directly and reflect how investors behave.
- Data revisions
o GDP data are revised, so they can change over time. The stock price is the price at the end
of the day, and it will always stay the same.
BUT financial data can be noisy (i.e. it is more difficult to separate trends/patterns from random and
uninteresting features)
• The challenge is to separate what is economically driven (fundamental) from what reflects irrational
investor behaviour (noise). The noisy part will be captured by the residual (explained later).
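The small-sample point above can be made concrete with a short calculation. This is a minimal Python sketch: the figure of roughly 250 trading days per year and the choice of 2 estimated parameters (an intercept and one slope) are assumptions for illustration, not values given in the course.

```python
# Number of observations in a 5-year sample at two data frequencies.
YEARS = 5
QUARTERS_PER_YEAR = 4          # GDP: 4 observations per year
TRADING_DAYS_PER_YEAR = 250    # assumption: roughly 250 trading days per year


def degrees_of_freedom(n_obs: int, n_params: int) -> int:
    """Observations left over after estimating n_params parameters."""
    return n_obs - n_params


n_gdp = YEARS * QUARTERS_PER_YEAR          # quarterly GDP
n_stock = YEARS * TRADING_DAYS_PER_YEAR    # daily stock prices

K = 2  # assumption: a model with an intercept and one slope

print("GDP:", n_gdp, "observations,", degrees_of_freedom(n_gdp, K), "degrees of freedom")
print("Stock:", n_stock, "observations,", degrees_of_freedom(n_stock, K), "degrees of freedom")
```

With the same model, the daily stock data leave far more degrees of freedom than the quarterly GDP data, which is why significant results are easier to obtain.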
General Framework
1. Review what has already been studied in the literature. For example, if we want to study the stock market, we
need to know which variables can affect it. These can be found in previous research and financial theory,
e.g. the effect of GDP growth on stock returns.
2. Build a model that can be estimated, based on the insights from theory. Y is the variable you
want to explain, and X contains the variables that influence Y (e.g. GDP, based on financial theory).
3. Collect the data you need and transform them into the right format.
4. Estimate the model using an estimator (a formula) and the data.
5. Check whether the model is adequate, e.g. look for problems such as multicollinearity and analyse the
errors (see Chapters 3 & 4). If it is not adequate, the model has to be changed.
6. If necessary, add variables or reformulate the model, for instance when the relationship is not linear. For
example, OLS might not work, requiring an alternative model (back to step 2). Or we need to collect more data,
or data for another time period (step 3). Or we should use another estimation technique (step 4). → So we
keep adjusting the model until it is adequate.
7. Interpret the results properly (significant or not, ...) so that they can be applied later. These are very
important steps!
8. Use the model for analysis.
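Steps 2 to 5 of the framework can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are simulated rather than collected, the true values a = 1.0 and b = 2.0 are arbitrary, and the only adequacy check shown is the mean of the residuals. The course itself works in R notebooks.

```python
import random

# Step 2: model y = a + b*x + u, where u is a random error term.
# Step 3: "collect" data. Assumption: the data are simulated here, with
# true values a = 1.0 and b = 2.0 chosen purely for illustration.
random.seed(0)
x = [float(i) for i in range(50)]
y = [1.0 + 2.0 * xi + random.gauss(0.0, 1.0) for xi in x]

# Step 4: estimate a and b by ordinary least squares (closed-form formulas).
n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b_hat = s_xy / s_xx
a_hat = y_bar - b_hat * x_bar

# Step 5: a first look at the residuals (the part the model does not explain).
residuals = [yi - (a_hat + b_hat * xi) for xi, yi in zip(x, y)]
mean_residual = sum(residuals) / n  # zero up to rounding when an intercept is included

print(f"a_hat = {a_hat:.3f}, b_hat = {b_hat:.3f}")
print(f"mean residual = {mean_residual:.2e}")
```

If the check in step 5 reveals a problem, the loop in step 6 sends you back to the model, the data, or the estimation technique.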
Functions
A function is a mapping or relationship between an input or set of inputs and an output
We write that y, the output, is a function f (x) of the input x, or → y = f (x)
- y could be a linear function of x where the relationship can be expressed on a straight line
- Or it could be non-linear where it would be expressed graphically as a curve
o y = a + bx²
- We want to explain Y by X – Y is a function of X
If the equation is linear, we would write the relationship as → y = a + bx
- y and x are called variables and a and b are parameters
- a is the intercept and b is the slope or gradient
o b is the most important one: it describes how y responds to x
Straight lines
Example: Suppose that we were modelling the relationship between a student’s average mark, y (in percent),
and the number of hours studied per year, x
Suppose that the relationship can be written as a linear function
y = 25 + 0.05x
a = 25 & b = 0.05
• X = 0 (the student has not studied at all) => Y = 25%
• X = 1000 hours => Y = 75%
• With two data points you can calculate the slope: b = (Y2 − Y1) / (X2 − X1) = (75 − 25) / (1000 − 0) = 0.05
→ If x increases by 1, y increases by 0.05.
• So 1 extra hour of study raises your mark by 0.05 percentage points.
• This is a perfectly linear relationship (all data points are on the line). In reality the points will not all
lie on the line but will be spread around it, and the line is an estimate fitted through the points.
A relationship can also be non-linear.
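The arithmetic of the study-hours example can be checked with a small Python sketch. The function names are hypothetical and chosen for illustration; the numbers a = 25 and b = 0.05 come from the example itself.

```python
def average_mark(hours: float, a: float = 25.0, b: float = 0.05) -> float:
    """Linear function y = a + b*x from the example (mark in percent)."""
    return a + b * hours


def slope(x1: float, y1: float, x2: float, y2: float) -> float:
    """Slope of the straight line through the points (x1, y1) and (x2, y2)."""
    return (y2 - y1) / (x2 - x1)


y_0 = average_mark(0)        # student who did not study at all
y_1000 = average_mark(1000)  # student who studied 1000 hours
b = slope(0, y_0, 1000, y_1000)

print(y_0, y_1000, round(b, 2))
```

Recovering b from the two points gives back the slope that was used to generate them, which is what "perfectly linear" means here.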
1.2. TYPES OF FINANCIAL DATA
Types of Financial Data
Types of Data:
1. Time-series data (y_t for t = 1, ..., T), e.g. the stock price every day for several years
a. The stock price of KBC in the years 2000, 2001, 2002
b. It changes over time, which is captured by the subscript t (days, weeks, years, ...)
c. T = number of time periods
2. Cross-sectional data (y_i for i = 1, ..., N), e.g. data on the stock price of N = 100 companies
a. One particular year: the stock price in 2008 for KBC, ING, BNP Paribas
b. N = number of companies, each captured by the subscript i
3. Panel data (y_it for i = 1, ..., N and t = 1, ..., T), e.g. the stock price for N firms for T days
a. Cross-section and time series combined, captured by the subscript it (two dimensions)
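The three data types above can be illustrated with a small Python sketch. The firm names and prices are made up for illustration (assumption); only the structure matters: a panel is indexed by both i and t, and fixing one of the two indices gives a time series or a cross-section.

```python
# Made-up prices (assumption): keys are (firm i, year t), values are y_it.
panel = {
    ("FirmA", 2007): 10.0, ("FirmA", 2008): 8.0,
    ("FirmB", 2007): 20.0, ("FirmB", 2008): 15.0,
    ("FirmC", 2007): 30.0, ("FirmC", 2008): 33.0,
}

# Time series: one firm followed over time (index t only).
firm_a_series = {year: price for (firm, year), price in panel.items() if firm == "FirmA"}

# Cross-section: many firms at a single point in time (index i only).
cross_2008 = {firm: price for (firm, year), price in panel.items() if year == 2008}

N = len({firm for firm, _ in panel})  # number of firms
T = len({year for _, year in panel})  # number of time periods

print(firm_a_series)
print(cross_2008)
print("N =", N, "T =", T, "observations =", len(panel))
```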
y is referred to as a variable (e.g. it varies across companies, years/days)
Note: Think about the degree of aggregation! (degree of clustering)
- Example: Individual house price versus house price index
o Index = house price level of all houses in Belgium
o Individual = if you want to compare the prices of individual houses with each other
- Using aggregated data: 'You see the big picture, but lose a lot of detail' (house price index for the country
versus for regions)
Time-Series Data and Frequency
Examples of time-series data:
Series                             Frequency
GDP or unemployment                monthly or quarterly
Government budget deficit          annually
Money supply                       weekly
Value of a stock market index      as transactions occur / daily
Important: Data in a model should have the same frequency of observation
• GDP & government budget deficit = economic data <-> money supply & value of a stock market index = financial data
• You always need to take the lowest frequency among the variables and estimate with that same frequency for all of them!
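Bringing a higher-frequency series down to the lowest frequency in the model can be sketched as follows. The monthly values are made up (assumption), and keeping the last observation of each quarter is just one possible convention; an average over the quarter would be another.

```python
# Made-up monthly observations (assumption): keys are (year, month).
monthly = {
    (2020, 1): 100.0, (2020, 2): 102.0, (2020, 3): 101.0,
    (2020, 4): 105.0, (2020, 5): 107.0, (2020, 6): 110.0,
}


def to_quarterly(series: dict) -> dict:
    """Keep the last monthly observation of each quarter (end-of-period value)."""
    quarterly = {}
    for (year, month), value in sorted(series.items()):
        quarter = (month - 1) // 3 + 1
        quarterly[(year, quarter)] = value  # later months overwrite earlier ones
    return quarterly


quarterly = to_quarterly(monthly)
print(quarterly)
```

After this step the monthly series has one value per quarter and can be combined with a quarterly series such as GDP.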
Cross-Sectional Data
Cross-sectional data are data on one or more variables collected at a single point in time.
Examples of cross-sectional data:
- A poll of usage of internet stockbroking services
o Do you use an internet stockbroking service? Answer: Yes/No => cross-sectional: the answers are collected
at one moment and differ across respondents.
- A cross-section of stock returns on the New York Stock Exchange (NYSE)
o Look at the prices of many stocks at one moment = cross-section
- A sample of bond credit ratings for UK banks
o Collect all data at one particular moment in time = cross-section
Time-Series versus Cross-Sectional Data
1. Examples of problems that could be tackled using a Time-Series Regression
- How the value of a country’s stock index has varied with that country’s macroeconomic
fundamentals.
- How the value of a company’s stock price has varied when it announced the value of its dividend
payment.
- The effect on a country’s currency of an increase in its interest rate
2. Examples of problems that could be tackled using a Cross-Sectional Regression
- The relationship between company size and the return to investing in its shares
- The relationship between a country’s GDP level and the probability that the government will
default on its sovereign debt.
3. Pooled data versus Panel data
- Pooled data treat panel data as one larger cross-sectional sample by not accounting for the time
dimension. → They neglect the fact that the observations may have been taken at different times.
- Pooled data refers to a dataset where observations are combined across time
periods and individuals (or entities), but without necessarily following the same entities
consistently over time.
- In pooled data, you treat observations as if they are cross-sectional, even if they come from different
time periods.
- Panel data (or longitudinal data) refers to a dataset where the same entities (individuals, firms,
countries, etc.) are observed across multiple time periods.
Which one to use will always depend on your research question!
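The difference between pooling and keeping the panel structure can be shown with a small Python sketch. The firms and values are made up (assumption); the point is only that pooling discards the information about which entity and which period an observation belongs to.

```python
from collections import defaultdict

# Made-up observations (assumption): (firm, year, value).
observations = [
    ("FirmA", 2020, 1.0), ("FirmA", 2021, 2.0),
    ("FirmB", 2020, 3.0), ("FirmB", 2021, 5.0),
]

# Pooled: ignore who and when; every row is just one more observation.
pooled = [value for _, _, value in observations]

# Panel: keep track of which firm each observation belongs to, and when.
panel = defaultdict(dict)
for firm, year, value in observations:
    panel[firm][year] = value

# Only the panel structure can answer within-firm questions,
# such as the change in each firm's value between the two years.
changes = {firm: series[2021] - series[2020] for firm, series in panel.items()}

print(len(pooled), "pooled observations:", pooled)
print("change per firm:", changes)
```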
Qualitative and Quantitative Data
Quantitative data are numerical: e.g. the share price is $25
Qualitative data are not: e.g. in a survey, companies are asked whether an investment was financed through debt
(as opposed to equity or retained earnings). The answer is Yes/No.
A dummy variable equals 0 or 1 and is used to turn qualitative data into quantitative data: e.g. Yes = 1, No = 0
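Turning Yes/No answers into a dummy variable can be sketched in Python as follows. The survey answers are hypothetical (assumption), and the function name is chosen for illustration.

```python
# Hypothetical survey answers (assumption): was the investment financed through debt?
answers = ["Yes", "No", "Yes", "Yes", "No"]


def to_dummy(answer: str) -> int:
    """Map a Yes/No answer to a dummy variable: Yes = 1, No = 0."""
    mapping = {"yes": 1, "no": 0}
    return mapping[answer.strip().lower()]


dummies = [to_dummy(a) for a in answers]
share_yes = sum(dummies) / len(dummies)  # the mean of a dummy is the share of 1s

print(dummies, share_yes)
```

Once coded as 0/1, the qualitative answer can be used in a model like any other numerical variable.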
Primary versus Secondary Data
Primary data
- Data you collect and process yourself
o It’s not already available
- Takes some time to collect
- How to obtain? Surveys, questionnaires, interviews, ...
Secondary data