BUSINESS RESEARCH METHODS – PROF CLEEREN
Table of contents
1 Linear regression analysis
1.1 When to use a linear regression?
1.2 Creating dummy variables
1.3 Example linear regression
1.4 Linear regression in Stata
1.4.1 Model diagnostics – Steps
1.5 Model comparison approach
2 Research methodology: Moderation and mediation
2.1 What is moderation?
2.2 How to test for moderation?
2.3 What is mediation?
2.4 How to test for mediation according to Baron and Kenny
2.5 Sobel test and bootstrapping
2.5.1 Sobel test
2.5.2 Bootstrapping
2.6 Examples
3 Logistic regression
3.1 Example logistic regression
4 Factor analysis
4.1 Introduction to factor analysis
4.2 Factor analysis in Stata
4.2.1 Running FA in 5 steps
4.3 Example
4.3.1 Exercise
5 Panel data
5.1 Different types of data
5.2 Panel data
1 LINEAR REGRESSION ANALYSIS
1.1 WHEN TO USE A LINEAR REGRESSION?
Linear regression versus logistic regression?
* Categorical variables need to be converted to dummy variables (binary: 1/0)!
First: decide which technique is appropriate before you start analysing
Dependent variable: what you want to explain
Independent variables: the variables that explain it
Metric: the variable has no categories and can take any value --> the numbers themselves are
meaningful quantities
Categorical: the variable has distinct categories --> every category gets a number, but that
number is only a label for the category
Categorical independent variable = create dummy variables
Categorical dependent variable with more than 2 groups: multinomial logistic regression (we will not discuss this)
Categorical dependent variable with 2 groups: binary logistic regression
ANOVA: typical technique to analyse experimental data
Exercises
1) A person's decision to buy a private (store) label
Survey:
‘I tend to buy private labels very often (8 to 10 out of 10 of my grocery purchases are private labels)’
O Yes
O No
Private label: brand that is offered by the store (e.g. Carrefour brand)
Which technique are we using to analyse this?
Binary logistic regression, because the dependent variable has 2 groups (yes/no). Some of the
customer characteristics (the independent variables) are not metric, so they need dummy variables.
2) Someone's attitude towards buying private label (or store) products
Survey:
‘I tend to buy a lot of private labels’
1 2 3 4 5 6 7 (1 = totally disagree ... 7 = totally agree)
Which technique are we using to analyse this?
Likert scale: the numbers have a meaning (a 7 expresses much stronger agreement than a 1), so the
dependent variable is treated as metric
Technique: linear regression analysis
3) Someone's attitude towards buying private labels
Survey:
‘I am a person who:
O buys private labels 8 to 10 times in 10 grocery purchases
O buys private labels 4 to 7 times in 10 grocery purchases
O buys private labels 1 to 3 times in 10 grocery purchases
O buys private labels 0 times in 10 grocery purchases
Which technique are we using to analyse this?
Categorical dependent variable with more than 2 groups (4 answer categories)
Independent variables: at least 1 is categorical (gender), so dummy variables are needed
Technique: multinomial logistic regression
→ The same behaviour can be measured with different questions, and the way the answer is measured
determines the technique
1.2 CREATING DUMMY VARIABLES
- Transform categorical independent variables into dummy (1/0) variables (aka indicator
variables) in a linear (and logistic) regression
- Dummy variable trap!
▪ # dummies = # response categories – 1
- Gender (original question):     Male (dummy variable):
O Male (1)                        O Yes (1)
O Female (2)                      O No (0)
Dummy variable = 0/1 variable
If we have 3 categories (male, female, other), we only include 2 dummies (e.g. male and female) → WHY?
Including all 3 dummies causes perfect multicollinearity: once we know the values of 2 dummies, the
value of the third one follows from them. The category that is left out serves as the reference
category.
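The multicollinearity argument can be written out as a short worked equation. The symbols below ($D$ for the dummies, $\beta$ for the coefficients, $i$ for the respondent) are notation assumed here for illustration, not taken from the slides:

```latex
% Every respondent i belongs to exactly one category, so the three dummies always sum to 1:
D_{\text{male},i} + D_{\text{female},i} + D_{\text{other},i} = 1
\quad\Rightarrow\quad
D_{\text{other},i} = 1 - D_{\text{male},i} - D_{\text{female},i}

% The sum equals the constant term, so the model with all three dummies cannot be estimated.
% Leaving out D_other gives an estimable model:
Y_i = \beta_0 + \beta_1 D_{\text{male},i} + \beta_2 D_{\text{female},i} + \varepsilon_i
```

In this sketch $\beta_0$ is the expected value of $Y$ for the omitted category (other), and $\beta_1$ and $\beta_2$ are the differences of male and female relative to that category.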
Use the ‘tabulate’ command with the ‘generate’ option
- The command ‘tabulate’ with the option ‘generate’:
▪ Returns the frequencies of ‘gender’
▪ Automatically recodes ‘gender’ into dummy variables
- Two new dummy variables appear in the ‘variables’ window
Stata creates a dummy variable for each category, so leave one of them out of the regression
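A minimal sketch of the tabulate approach in Stata. The five observations, the label name genderlbl and the stub gender_ are made up for illustration and are not course data:

```stata
* Illustrative data only: 1 = male, 2 = female
clear
input gender
1
2
2
1
2
end
label define genderlbl 1 "Male" 2 "Female"
label values gender genderlbl

* Frequency table of gender, plus one 0/1 dummy per category
tabulate gender, generate(gender_)

* gender_1 equals 1 for males, gender_2 equals 1 for females
list gender gender_1 gender_2
```

In a regression only one of the two dummies would be included (for example gender_1), in line with the rule # dummies = # response categories – 1.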
- Age:
O < 20 (1)
O 20-35 (2)
O 36-50 (3)
O > 50 (4)
- How many dummy variables? 4 categories – 1 = 3 dummy variables
Use of the i. prefix in Stata
- The i. prefix before the name of a variable can be used in many commands in Stata
▪ E.g. i.gender or i.age
- The i. prefix makes sure that the specified variable is treated as a categorical variable
- Stata will include the right number of dummy variables in the analyses
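A minimal sketch of the i. prefix in a regression. The eight observations and the dependent variable satisfaction are made up for illustration; age uses the codes 1 to 4 from the question above:

```stata
* Illustrative data only: age is coded 1-4 (1 = <20, 2 = 20-35, 3 = 36-50, 4 = >50)
clear
input satisfaction age
5 1
6 1
4 2
7 2
3 3
5 3
6 4
2 4
end

* i.age adds 3 dummies for the 4 categories; the lowest category (1) is the base by default
regress satisfaction i.age

* ib4.age makes category 4 the reference category instead
regress satisfaction ib4.age
```

Each reported coefficient is the difference between that age category and the reference category, so the choice of base changes the interpretation but not the fit of the model.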
1.3 EXAMPLE LINEAR REGRESSION
- The manager of a pizza restaurant wants to research the impact of different factors on consumer
satisfaction [Satisfaction].
- On the basis of discussions with employees, 5 factors were identified that could play a role:
▪ Reception [reception]
▪ Service [service]
▪ Waiting time [waiting time]
▪ Quality of the food [food quality]
▪ Price [price]
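Written as a regression equation, the example could look as follows. This is a sketch that assumes all five factors enter linearly; the coefficient symbols $\beta_0, \dots, \beta_5$, the error term $\varepsilon_i$ and the index $i$ for a customer are notation assumed here:

```latex
\text{Satisfaction}_i = \beta_0
  + \beta_1\,\text{Reception}_i
  + \beta_2\,\text{Service}_i
  + \beta_3\,\text{WaitingTime}_i
  + \beta_4\,\text{FoodQuality}_i
  + \beta_5\,\text{Price}_i
  + \varepsilon_i
```

Each $\beta_k$ is the expected change in satisfaction for a one-unit increase in that factor, holding the other four factors constant.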