Summary Quantitative research methods (Bridging MBA - KUL) - G. Dierckx & P. Teirlinck



This is a thorough summary of the course Quantitative Research Methods, with examples and all SPSS steps and screenshots needed to understand the course material. The summary is incomplete: one part is missing (the last part of time series, part 4). I still believe it could be a useful aid.


  • 23 November 2020
  • 10 March 2021
  • 77 pages
  • 2020/2021
  • Summary
MBAstudentKUL
QUANTITATIVE RESEARCH METHODS (STATISTICS)




MULTIPLE LINEAR REGRESSION ANALYSIS

A. model specification

B. model fit and inference

C. goodness of fit

D. assumptions



TIME SERIES

1. cautions

2. autocorrelation

3. stationarity

4. dynamic models (missing part)

MULTIPLE LINEAR REGRESSION

MLR - a. model specification


MULTIPLE VERSUS SIMPLE REGRESSION


- Simple linear regression: 1 independent variable x
- Multiple linear regression: k x-variables, k>1

Example Hamburger Chain

Research question: to assess the effect of different price structures and different levels of advertising expenditure on
the sales, the management sets different prices, and spends varying amounts on advertising, in different cities. Does
an increase in advertising expenditure lead to an increase in sales? If so, is the increase in sales sufficient to justify
the increased expenditure?

Random experiment: pick a random store of a chain in a random city

Y = sales: monthly sales (in 1000$)
x1 = price: ‘average’ price for products (in $)
x2 = advert: monthly advertising expenditure (in 1000$)


MULTIPLE LINEAR REGRESSION

- As in simple linear regression, the model consists of:
- A systematic part that provides information on how a combination of x-outcomes results in an average value for Y: μY|x
- A random error term ε to account for the fact that Y|x is a random variable

Y = β0 + β1 x1 + … + βK xK + ε, with μY|x = β0 + β1 x1 + … + βK xK

- Graphically:
- Multiple linear regression is not represented by a line any more. It can be visualized using a (hyper)plane.

Example Hamburger Chain

Y = Sales
x1 = Price
x2 = Advert




CLASSICAL MULTIPLE LINEAR REGRESSION

- The assumptions that were introduced for simple linear regression remain. In addition, assumption A4 now contains two assumptions about the explanatory variables.

- Classical assumptions for multiple linear regression (A4 is where it differs from SLR):

A1: μY|x = β0 + β1 x1 + … + βK xK (ε has mean zero for all x)
A2: ε has constant standard deviation σ (homoskedasticity)
A3: cov(εi, εj) = cov(Yi, Yj) = 0 for i ≠ j
A4: The variables xi are non-random (which can be relaxed to the assumption that x is not correlated with the error term) and are not exact linear functions of the other explanatory variables (meaning that x1 and x2 should be different enough)
A5: (optional) ε is normally distributed

MULTIPLE REGRESSION MODEL


INTERPRETATION OF THE PARAMETERS

- Intercept β0: the average value of Y if all x = 0. This is often not relevant. However, except in very special cases, we always include an intercept in the model, even if it has no direct economic interpretation. Omitting it can lead to a model that fits the data poorly and that does not predict well.

- Coefficients βi: the slope in the xi direction. It measures the effect of a change in the variable xi on the expected value of Y, ceteris paribus (= if all other variables are held constant).

As such it is linked to the partial derivative: βi = ∂μY|x / ∂xi


Example Hamburger Chain

Y = sales
x1 = price
x2 = advert

β0: the interpretation for price = 0 and advert = 0 is not realistic
β1: the change in expected monthly sales (1000$) when Price is increased by one unit (1$) and advertising expenditure Advert is held constant.
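
The ceteris paribus reading of β1 can be checked with a few lines of arithmetic. The sketch below uses made-up coefficients (B0, B1, B2 are assumptions for illustration, not estimates from the course) and shows that raising Price by one unit while Advert stays fixed changes expected sales by exactly β1.

```python
# Hypothetical coefficients, chosen only for illustration (not estimates from the notes).
B0, B1, B2 = 100.0, -8.0, 2.0

def mean_sales(price, advert):
    """Systematic part of the model: expected monthly sales (in $1000)."""
    return B0 + B1 * price + B2 * advert

# Raise Price by one unit ($1) while holding Advert fixed (ceteris paribus).
base = mean_sales(price=5.0, advert=1.5)
raised = mean_sales(price=6.0, advert=1.5)
effect_of_price = raised - base

print(effect_of_price)  # equals B1, whatever the fixed level of Advert
```

The same difference is obtained for any other fixed value of Advert, which is what "all other variables held constant" means in a model that is linear in x1 and x2.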



MODEL SPECIFICATION

- It is important to carefully think about the regression model specification:
- What functional form? μY|x = f(x)
- Linear function versus non-linear functions
- How to account for qualitative x-variables ?
- How to account for interaction effects between x-variables ?
- Choice of explanatory variables ?


—————————————————————————————————————————————————————
NON-LINEAR MODELS
NON-LINEAR RELATIONSHIPS

- As in simple linear regression, non-linear relationships can be modeled using a multiple ‘linear’ regression
model through the use of appropriate transformations.
- You should be led by economic theory and experts, taking into account e.g. slope properties.
- Does the model provide a good fit for the data?

Example Hamburger Chain

We initially hypothesized that sales revenue is linearly related to price and advertising expenditure:

SALES = β0+β1 PRICE + β2 ADVERT

But is this a good choice ? Remember that before we suggested that adding ADVERT² or using the logarithm of
ADVERT might be a good idea.


TRANSFORMATIONS

- The logarithmic transformation is a common transformation in economic applications.
- Polynomial functions: when we studied these models with the simple regression model, we were constrained by the need to have only one right-hand-side variable, such as Y = β0 + β1 x². Now, within the framework of the multiple regression model, we can consider unconstrained polynomials with all their terms included. Having a variable and its square or cube in the same model sometimes causes collinearity problems (see later).
- Exam: you will be told whether a transformation is needed and which transformation you have to apply.
- Project: you will have to think about it yourself, but you can always ask the teacher if you are lost.



LOG TRANSFORMATIONS


Example Hamburger Chain

Consider model with ln(Advert)

Sales = β0 + β1 Price + β2 ln(Advert) + ε

Instead of linking sales linearly to advertising expenditure, you link sales to the ln of advertising expenditure.

When advertising expenditure increases by 1%, expected sales increase by approximately 0.03456 (1000$) = 34.56$, ceteris paribus.
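
The 1% rule of thumb comes from ln(1.01) ≈ 0.01, so the change in expected sales is roughly β2/100. The sketch below compares the rule of thumb with the exact change. It assumes β2 = 3.456, a value inferred from the 0.03456 figure above; the estimate itself is not shown in the notes.

```python
import math

# Assumption: beta2 = 3.456 is inferred from the 0.03456 figure in the notes
# (0.03456 = beta2 / 100); the estimate itself is not shown in the text.
BETA2 = 3.456

def sales_change(pct_increase, beta2=BETA2):
    """Exact change in expected sales (in $1000) when Advert rises by pct_increase percent."""
    return beta2 * math.log(1 + pct_increase / 100)

approx = BETA2 / 100        # rule of thumb: beta2/100 per 1% increase in Advert
exact = sales_change(1.0)   # beta2 * ln(1.01)

print(round(approx, 5), round(exact, 5))
```

For small percentage changes the two numbers are close; for larger changes (say 20%) the exact formula should be used.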

What about log-log model?




POLYNOMIAL MODEL

Example Hamburger Chain

Consider the quadratic model

Sales = β0 + β1 Price + β2 Advert + β3 Advert² + ε

What sign do you expect for β2, β3?
- Instead of the ln of Advert, the model takes the square of Advert.
- Take the derivative of Sales with respect to Advert: β2 + 2 β3 Advert.
- Depending on the level of Advert, your sales increase by a different amount.
Parabola:
- Advert has a positive coefficient (β2 > 0): an increase in advertising leads to an increase in sales.
- Advert² has a negative coefficient (β3 < 0): the effect gets smaller as Advert increases.

When advertising is increased by 1 unit ($1000), this does not always have the same effect on Sales. In this case the effect is positive, but it becomes smaller as Advert increases.
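
A small numerical sketch of this diminishing effect, using hypothetical coefficients with β2 > 0 and β3 < 0 (the values below are assumptions for illustration, not estimates from the course):

```python
# Hypothetical coefficients for illustration only: beta2 > 0 and beta3 < 0.
BETA2, BETA3 = 12.0, -2.5

def marginal_effect(advert):
    """d(Sales)/d(Advert) = beta2 + 2*beta3*Advert, in $1000 of sales per $1000 of advertising."""
    return BETA2 + 2 * BETA3 * advert

# The marginal effect shrinks as Advert grows.
effects = [marginal_effect(a) for a in (0.5, 1.0, 2.0)]

# Advert level at which the marginal effect reaches zero (top of the parabola).
turning_point = -BETA2 / (2 * BETA3)

print(effects, turning_point)
```

Beyond the turning point the fitted parabola would imply that extra advertising lowers sales, so it is worth checking whether the observed Advert values stay below that level.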

—————————————————————————————————————————————————————
DUMMY VARIABLES
DUMMY VARIABLES

- Variables with only 2 outcomes are called indicator variables = dummy variables
- Usually the 2 outcomes are coded by 1 or 0, to indicate the presence or absence of a characteristic or to
indicate whether a condition is true or false.
- The value D = 0 defines the reference group of elements for which the characteristic is not present.

D = 1 if characteristic is present
D = 0 if characteristic is not present


Example Price House

Reference group: houses not in the desirable neighbourhood.

D = 1 if property is in desirable neighbourhood
D = 0 if property is not in desirable neighbourhood


QUALITATIVE VARIABLES

- Dummy variables are used to account for qualitative factors in econometric models.
- Even if numbers are used to code the outcomes of qualitative factors, do NOT use these codes as such in
the regression model. Introduce dummy variables!


QUALITATIVE VARIABLES WITH 2 OUTCOMES

Example Price Houses

Y = price
x1 = SQFT = area measured in square feet
x2 = variable indicating whether house in desirable neighbourhood => create dummy variable D

PRICE = β0 + β1 SQFT + β2 D + ε



INTERPRET COEFFICIENT LINKED TO DUMMY

- Write down separate regression models for the outcomes of the qualitative variable.

Example Price Houses

PRICE = β0 + β1 SQFT + β2 D + ε

D=1 (desirable neighbourhood):
predicted PRICE = (20.543 + 50.058) + 0.123 SQFT

D=0 (not desirable neighbourhood):
predicted PRICE = 20.543 + 0.123 SQFT

A house in the desirable neighbourhood will have a price that is on average 50.058 units higher than a house
with the same SQFT which is not in the desirable neighbourhood (reference group: D = 0), ceteris paribus.

- In general: Y = β0 + β1 x1 + … + βi D + … + βK xK + ε
- Interpretation of βi? Write down separate regression models for the outcomes of the qualitative variable.
- In general: if D = 1, then the value of Y is on average βi units larger compared to the reference group, ceteris paribus.
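
The two separate regression lines for the house price example can be written as one function of SQFT and D. The sketch below uses the estimates reported above (20.543, 0.123, 50.058); the function name and the chosen house sizes are illustrative only.

```python
# Estimates as reported in the notes (price units as in the source).
B0, B_SQFT, B_D = 20.543, 0.123, 50.058

def predicted_price(sqft, desirable):
    """Predicted price for a house of a given area, with D = 1 if desirable else 0."""
    d = 1 if desirable else 0
    return B0 + B_SQFT * sqft + B_D * d

# Same area, different neighbourhood: the gap equals the dummy coefficient.
gap_small = predicted_price(1500, True) - predicted_price(1500, False)
gap_large = predicted_price(3000, True) - predicted_price(3000, False)

print(round(gap_small, 3), round(gap_large, 3))
```

The gap is the same at every area, which is the parallel shift described in the graphical interpretation.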




- Graphically:
- Adding the dummy variable D to a simple regression model causes a parallel shift in the relationship
by the amount β2

Example Price Houses




This is the conclusion we draw for now; it will be revised later on.



QUALITATIVE RANDOM VARIABLE WITH SEVERAL CATEGORIES

- If a qualitative variable has M>2 outcomes, one has to introduce M-1 dummy variables

Example Test Results

Y = test score
x1 = study time in hours
x2 = highest diploma (1=master, 2=bachelor, 3=high school)

DB = 1 if bachelor, 0 else
DM = 1 if master, 0 else
High school = reference group
SPSS: new variable: DM
SPSS: Transform - Recode into different variables then do the same for DB



Master degree: DM = 1, DB = 0
Bachelor degree: DM = 0, DB = 1
High school degree = reference group: DM = 0, DB = 0 (double 0)

x2 and the two dummy columns give the same information; we will use the two dummy columns in our model.
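
Outside SPSS, the same recoding into M-1 dummies can be sketched in a few lines. The helper `make_dummies` and the sample values of x2 below are hypothetical; only the coding scheme (1 = master, 2 = bachelor, 3 = high school as reference) comes from the notes.

```python
def make_dummies(codes, reference):
    """Return one 0/1 column per category, leaving out the reference category."""
    categories = sorted(set(codes) - {reference})
    return {c: [1 if code == c else 0 for code in codes] for c in categories}

# Coding from the notes: 1 = master, 2 = bachelor, 3 = high school (reference group).
x2 = [1, 3, 2, 2, 3, 1]          # hypothetical sample of six respondents
dummies = make_dummies(x2, reference=3)
DM, DB = dummies[1], dummies[2]

print(DM)
print(DB)
```

A respondent from the reference group has a 0 in both columns, so no third dummy is needed; adding one would make the dummies an exact linear function of the intercept and violate assumption A4.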



INTERPRET COEFFICIENTS OF DUMMIES

- Write down separate regression models for the outcomes of the qualitative variable.

Example Test Results




- Interpretation: if Di = 1, then Y is on average βi units higher in group i than in the reference group, ceteris paribus.

- Graphically:
- Adding a qualitative effect to a simple regression model results in parallel lines

Example Test Results




This may not be realistic. If the test covers very basic material that someone with a Master's degree has more or less already learned, extra study time will not raise his/her test score much. For someone with a high school degree, studying that material will have a larger effect. This is an example of an additional hour of studying not having the same effect for all diplomas.

INTERACTION VARIABLES

INTERACTION

Example Price House

Y = price
x1 = SQFT = area (square feet)
x2 = dummy variable: 1 desirable neighbourhood, 0 otherwise

The effect of the area of the house is not the same in the different neighbourhoods.

[Figure: grouped scatter plot of price against SQFT, with separate lines for the desirable and undesirable neighbourhoods]

SPSS: Graphs - chart builder - scatter/dot - grouped scatter - x-axis: SQFT - y-axis: price - set color: D - double click on image - elements - lines sub groups

- What do we see? The slope for houses in a desirable neighbourhood is larger than the slope for houses in an undesirable neighbourhood. If we increase the area by a certain number of square feet, this will have a larger effect on the price of a house in a desirable neighbourhood than on the price of a house in an undesirable neighbourhood.
- When the total influence of 2 explanatory variables on Y is not just the sum of the 2 separate effects, but
the effect of one variable is affected by another, there is said to be an interaction effect.
- It can occur with any 2 explanatory variables, but it rarely occurs when 2 quantitative variables are
involved. A quantitative and a qualitative variable OR two qualitative variables will interact more often.

Example Price House
- Suppose the effect of the area of the house is not the same in the different neighbourhoods. That is, there is an interaction effect between SQFT and D.


INTERACTION EFFECT IN REGRESSION MODEL

- An interaction effect can be taken up in the model by adding another explanatory variable that consists of the product of the 2 interacting variables

SPSS: Transform - compute variable

Example Price Houses

- The interaction effect between SQFT and D can be taken up in the model by adding a variable which is
the product of both variables.

PRICE = β0+ β1 SQFT + β2 D + β3 SQFT x D + ε

Y = Price
SQFT = square feet


D = 1 if in desirable neighbourhood
D = 0 if not in desirable neighbourhood (reference group)
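
Writing the interaction model out for D = 0 and D = 1 shows that both the intercept and the slope can differ between neighbourhoods. The sketch below does this with hypothetical coefficients (the notes do not report estimates for this model); `line_for` is an illustrative helper.

```python
# Hypothetical coefficients for illustration; the notes do not report estimates for this model.
B0, B1, B2, B3 = 25.0, 0.10, 30.0, 0.05

def line_for(d):
    """Intercept and SQFT slope of PRICE = B0 + B1*SQFT + B2*D + B3*SQFT*D for a given D."""
    return B0 + B2 * d, B1 + B3 * d

intercept_ref, slope_ref = line_for(0)   # reference group: B0 and B1
intercept_des, slope_des = line_for(1)   # desirable: B0 + B2 and B1 + B3

print(intercept_ref, slope_ref)
print(intercept_des, slope_des)
```

With β3 > 0 the line for the desirable neighbourhood is steeper, so the lines are no longer parallel as they were in the model with only the dummy D.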



INTERACTION BETWEEN QUANTITATIVE AND QUALITATIVE

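- Writing down the model for the different dummy outcomes also helps to interpret the parameters.

Example Price Houses

D=0 (not desirable neighbourhood): PRICE = β0 + β1 SQFT + ε
D=1 (desirable neighbourhood): PRICE = (β0 + β2) + (β1 + β3) SQFT + ε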


- In this case the interaction variable is also called a slope-indicator variable or a slope dummy variable.
- Examining the regression function for the different dummy outcomes illustrates best the effect of the
slope dummy graphically.

Example Price Houses




Example Test Results




SPSS (two times): Transform - compute variable




INTERACTION BETWEEN 2 QUALITATIVE VARIABLES

- Examining the regression function for the different dummy outcomes illustrates best the effect of the
variables. It helps to interpret the parameters.

Example Wages




- Holding the effect of education constant, we estimate that black males earn $4.17 per hour less than white males, white females earn $4.78 less than white males, and black females earn $5.11 less than white males.
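
The three gaps above are enough to back out the interaction coefficient, assuming the model has the usual form with two dummies and their product (WAGE = ... + d1 BLACK + d2 FEMALE + g BLACK×FEMALE). The model equation and the names d1, d2, g are assumptions here, since the equation itself is not shown in this part of the notes.

```python
# Differences relative to white males, as reported in the notes ($ per hour).
black_male, white_female, black_female = -4.17, -4.78, -5.11

# Assumed model form: WAGE = ... + d1*BLACK + d2*FEMALE + g*(BLACK*FEMALE).
# Black males differ by d1, white females by d2, black females by d1 + d2 + g.
d1, d2 = black_male, white_female
g = black_female - (d1 + d2)

print(round(g, 2))
```

A non-zero g means the effect of being black and female is not simply the sum of the two separate effects, which is exactly what an interaction effect between two qualitative variables expresses.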


INTERACTION BETWEEN 2 QUANTITATIVE VARIABLES

Example Pizza

Pizza = annual expenditure on pizza ($)
Age = age in years
Income = income (in 1000$)

For a linear model:
predicted PIZZA = 342.88 - 7.576 AGE + 1.832 INCOME

Marginal propensity to spend on pizza = 1.832, that is, for a given age, the expected expenditure on pizza increases by 1.832$ with an additional income of 1000$.

BUT: Is it reasonable to expect that this marginal propensity does not depend on age?

It seems more reasonable to assume that as a person ages, less of each extra dollar is expected to be
spent.
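
One way to let the marginal propensity depend on age is to add the product AGE × INCOME to the model, so that the effect of income becomes β2 + β3 AGE. The sketch below illustrates this with hypothetical coefficients (B2 and B3 are assumptions, not estimates from the course).

```python
# Assumed model: PIZZA = B0 + B1*AGE + B2*INCOME + B3*(AGE*INCOME) + error.
# Hypothetical coefficients for illustration only, with B3 < 0.
B2, B3 = 7.0, -0.12

def marginal_propensity(age):
    """Change in expected pizza expenditure ($) per extra $1000 of income, at a given age."""
    return B2 + B3 * age

mp_young = marginal_propensity(25)
mp_old = marginal_propensity(50)

print(round(mp_young, 2), round(mp_old, 2))
```

With a negative β3 the marginal propensity falls with age, which matches the idea that an older person spends less of each extra dollar on pizza.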



—————————————————————————————————————————————————————
CHOICE OF VARIABLES
WHICH INDEPENDENT VARIABLES?

- You should be led by economic theory and experts. But the specification will also depend on your own prior choices, e.g. demand depends on prices of complements and substitutes: which ones?
- You can make two errors:
- Taking up an irrelevant variable in the model.
- Not taking up a relevant x-variable in the model, which is called an omitted variable.


ILLUSTRATION OMITTED VARIABLE

Example Female Labor Force Participation

Y = yearly family income
x1 = husband’s years of education (HE)
x2 = wife’s years of education (WE)




Literature: wife’s years of education is relevant in explaining the household’s income.
- Model 1: FAMINC = β0 + β1 HE + ε (WE is an omitted variable)
- Model 2: FAMINC = β0 + β1 HE + β2 WE + ε

- The magnitude of the coefficient of HE changes: B1 is biased in model 1 (the model with the omitted variable).
- Remember:
- If the model specification is correct and the assumptions hold, then the estimator B1 is unbiased: E(B1) = β1 (on average you are estimating the correct thing).
- Now the specification of model 1 is not good: the bias E(B1) - β1 ≠ 0 if WE is omitted (the difference (bias) is not zero).
- The sign of the bias is positive (E(B1) - β1 > 0) if WE is omitted: the effect of HE is systematically overestimated.
- Part of the effect of WE is taken over by HE. Since there is a positive correlation between HE and WE and the effect of WE is positive, the bias of B1 is positive.
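
The mechanism can be made concrete with a small noise-free example. The sketch below uses hypothetical data and hypothetical "true" coefficients (none of these numbers come from the course data set). It regresses FAMINC on HE alone and compares the resulting slope with β1; in a simple regression the difference equals β2 · cov(HE, WE) / var(HE).

```python
def mean(v):
    return sum(v) / len(v)

def cov(a, b):
    """Population covariance of two equally long lists."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)

def simple_slope(x, y):
    """OLS slope of y on x in a simple regression with intercept."""
    return cov(x, y) / cov(x, x)

# Hypothetical data: HE and WE are positively correlated; no noise, to keep the arithmetic exact.
he = [8, 10, 12, 12, 14, 16]
we = [9, 10, 11, 13, 13, 16]
beta0, beta1, beta2 = 5.0, 3.0, 4.0   # hypothetical "true" coefficients
faminc = [beta0 + beta1 * h + beta2 * w for h, w in zip(he, we)]

b1_short = simple_slope(he, faminc)   # model 1: WE omitted
bias = b1_short - beta1
predicted_bias = beta2 * cov(he, we) / cov(he, he)

print(round(b1_short, 4), round(bias, 4), round(predicted_bias, 4))
```

Because both β2 and the covariance between HE and WE are positive in this made-up data, the slope of HE in the short model is too large, in line with the positive bias described above.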

Another Illustration Omitted Variables

Y = yearly family income
x1 = husband’s years of education (HE)
x2 = wife’s years of education (WE)
x3 = number of children < 6 years old (KL6)

Literature: number of children is relevant in explaining the income
- Model 1: FAMINC = β0 + β1 HE + β2 WE + ε ; KL6 omitted
- Model 2: FAMINC = β0 + β1 HE + β2 WE + β3 KL6 + ε
- Now the coefficients of HE and WE do not change a lot if the significant variable KL6 is omitted. This is so because KL6 is not highly correlated with the education variables.

CONSEQUENCES

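- True model: Y = β0 + β1 x1 + β2 x2 + ε

but the variable x2 is omitted, so the estimated model is Y = β0 + β1 x1 + ε*, with ε* = β2 x2 + ε (β2 x2 also ends up in the error term)

- That is, the omitted variable is taken up in the error term.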

- But the other variables are usually not independent from the omitted variable. So assumption A4 (no
correlation between x and the error term) is violated. As such the estimators are not BLUE. (U = unbiased)

- The estimators are not BLUE: more specifically:
- Bias (model is suspect), since part of the effect of the omitted variable on Y is assigned to the variables that are in the model.
- Bias gets worse if the omitted variable Xom is more strongly correlated with the variables in the model.
- Sign of the bias of Bin (the estimator of the coefficient βin linked to a variable Xin still in the model) can be determined by:

sign(bias of Bin) = sign(βom) × sign(corr(Xin, Xom))

βom = coefficient of the omitted variable; corr(Xin, Xom) = correlation between the variable that is still in the model and the variable that is omitted


DETECTION
- Hard:
- especially if small bias
- moreover: which variable to add?
- It might be worth considering an omitted variable if the resulting model does not show the expected behaviour (e.g. wrong sign)
