Marketing Strategy Research (BM05MM) Class notes + practical sessions + case studies + Quizlet link
Course: BM05MM
Institution: Erasmus Universiteit Rotterdam (EUR)
In this document, you can find clear explanations of all the concepts covered in class. I researched and added better definitions to make the class material easier to understand. The document contains all notes from all classes, practical sessions, and case sessions.
At the end of the ...
Lecture 1:
Linear regression - market responses (e.g. pricing)
Conjoint analysis - New product design
Bass model - New product diffusion
Cluster analysis - Segmentation
Multi-dimensional scaling - Positioning
Principles of data-driven marketing:
1. Any statistical analysis is to reduce information loss
2. Causation cannot be learned directly from data
3. Prediction does not care about statistical significance
4. Practical usefulness trumps statistical criteria
Lecture 2: How to build up market response models
Linear regression:
Example: Establish the relationship between sales and price
IV - X: price
DV - Y: sales
The objective is to fit a line to the relationship
We have a series of data points relating price and sales,
but many different lines could be drawn through them… how do we know which line we want?
Think: what is your objective? → to build a prediction model of the
relationship
What is a good prediction?
It is a model that has the minimum difference between observations and
predictions.
*Principle 1 → any statistical analysis is to reduce information loss
The general criterion is: we want to minimize the difference between
predictions and observations
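The criterion above can be made concrete with a minimal sketch in plain Python (the course itself works in R). It assumes the usual least-squares version of "minimum difference", i.e. the sum of squared differences between observations and predictions. The data values and the function names `fit_line` and `sum_squared_errors` are invented for illustration and do not come from the course material.

```python
# Hypothetical price/sales observations, invented for illustration only.
prices = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [9.8, 8.1, 6.2, 3.9, 2.0]


def fit_line(x, y):
    """Return (intercept a, slope b) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def sum_squared_errors(x, y, a, b):
    """Sum of squared differences between observations and predictions."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))


a, b = fit_line(prices, sales)
sse_best = sum_squared_errors(prices, sales, a, b)
# Any other candidate line (here: same intercept, flatter slope) loses more information.
sse_other = sum_squared_errors(prices, sales, a, b + 0.5)

print(f"fitted line: sales = {a:.2f} + ({b:.2f}) * price")
print(f"SSE fitted line: {sse_best:.3f}, SSE alternative line: {sse_other:.3f}")
```

Among all possible lines, the fitted one has the smallest sum of squared errors, which is how "which line do we want?" gets answered.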
STRUCTURE TO RUN A LINEAR REGRESSION: STEP BY STEP
→ this is guidance for beginners rather than a strict rule.
Step 1: Examining the data
● Check the association between IV and DV
You assume that there will be a certain relationship, then you plot the data and check whether that is the case or not. In the course example, the plot shows
that sales are also predicted by advertising
● Detecting multicollinearity
If there is a high correlation between 2 variables, it means
that the 2 variables carry very similar information.
When we have highly correlated IVs, there is a high chance
that our estimates won't be trustworthy: we might get unstable and
misleading coefficients
A regression coefficient
represents the mean change in the dependent variable for
each 1-unit change in an independent variable when you hold
all of the other independent variables constant. However,
when independent variables are correlated,
changes in one variable are associated with shifts in another variable, so the effect of each one is hard to isolate.
There are 2 main problems when IVs are correlated with each other:
1. The coefficient estimates can swing wildly depending on which other independent variables are in the model. The coefficients become very
sensitive to small changes in the model.
2. Multicollinearity reduces the precision of the estimated coefficients, which weakens the statistical power of your regression model. You
might not be able to trust the p-values to identify independent variables that are statistically significant
+ The resulting unreliable coefficients can cause a misunderstanding of the predictive value of each IV
How do we detect multicollinearity?
Examine the correlation matrix of the predictors → high correlations indicate trouble
Request collinearity diagnostics → use the variance inflation factor (VIF)
A variance inflation factor quantifies how much the variance (i.e. the standard error squared) of a coefficient is inflated.
VIFs are calculated by taking a predictor and regressing it against every other predictor in the model.
VIFs range from 1 upwards. The VIF minus 1, read as a percentage, tells you by how much the variance is inflated for each
coefficient.
E.g. a VIF of 1.9 tells you that the variance of a particular coefficient is 90% bigger than what you would expect if there was
no multicollinearity.
A rule of thumb:
1 = not correlated
Between 1 and 5 = moderately correlated
Greater than 5 = highly correlated
Greater than 10 = problem
→ the more your VIF increases, the less reliable your regression results are going to be.
→ a VIF above 10 indicates high correlation and is a cause for concern.
How do we deal with variables that are highly correlated?
- Use only one of the correlated variables in the regression
- Transform the correlated variables into a mutually independent set of predictors (e.g. factor analysis)
- Collect more data!
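The two detection steps in Step 1 (correlation between predictors, then VIF) can be sketched as follows in plain Python. This is a simplified illustration for a model with exactly two predictors. In that case, regressing one predictor on the other gives an R-squared equal to the squared correlation, so VIF = 1 / (1 - r²). The data and the function names `pearson_r` and `vif_two_predictors` are hypothetical; with more than two predictors a full regression of each predictor on all the others would be needed.

```python
import math

# Hypothetical predictors, invented for illustration: price and advertising.
price = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
advertising = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF for a model with exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)


r = pearson_r(price, advertising)
vif = vif_two_predictors(price, advertising)
inflation_pct = (vif - 1.0) * 100.0  # how much the coefficient variance is inflated

print(f"correlation between predictors: {r:.3f}")
print(f"VIF: {vif:.1f} (variance inflated by {inflation_pct:.0f}%)")
print("problem" if vif > 10 else "acceptable by the rule of thumb")
```

Because these two invented predictors move almost in lockstep, the VIF lands far above the rule-of-thumb threshold of 10, which would suggest keeping only one of them.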
Step 2: Formulating the model
● Create a conceptual model
→ to see which variables are the predictors and which is the outcome variable, so that you know which variables to use as inputs.
● Translate the conceptual model to an R formula
R formula basic: DV ~ IVs
Or: Y ~ X1 + … + Xk (R adds the intercept b0 automatically)
*Check Statistical Formula Notation in R paper
Step 3: Examining the model
Any statistical analysis is to minimize information loss
Understanding Intercept and slope
● The calculated line: Sales = a + b × Price
● a is the intercept:
- Value of Sales when Price = 0
- “Baseline” prediction
● b is the slope:
- Case I: Price = 1, Sales = a + b
- Case II: Price = 2, Sales = a + 2b
- Compare Case II with Case I: what are the changes in price and sales?
→ price increases by 1 and predicted sales change by b, so b is the expected change in predicted sales per 1-unit change in price
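The Case I / Case II comparison can be checked with a small sketch in plain Python. The coefficient values below are hypothetical (they are not estimates from the course data), and `predict_sales` is an illustrative helper name.

```python
# Hypothetical coefficients for Sales = a + b * Price (not estimated from course data).
a = 12.0   # intercept: predicted sales when price is 0
b = -2.0   # slope: change in predicted sales per 1-unit price increase


def predict_sales(price):
    """Predicted sales from the fitted line."""
    return a + b * price


baseline = predict_sales(0)   # equals the intercept a
case_1 = predict_sales(1)     # a + b
case_2 = predict_sales(2)     # a + 2b
change = case_2 - case_1      # equals the slope b

print(f"baseline: {baseline}, case I: {case_1}, case II: {case_2}")
print(f"change in predicted sales for +1 price: {change}")
```

Whatever values a and b take, the difference between the two cases is always b, which is why the slope is read as the expected change in predicted sales.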
Step 4: Validating the model
4.1 Overall Significance (F-test)
The overall significance of a model can be found in the F-statistic. It indicates whether our linear regression model provides a
better fit to the data than a model that contains NO independent variables.
There are 2 hypotheses:
H0: the model with no IV fits the data as well as your model
H1: your model fits the data better than the intercept-only model
→ Rejecting H0 in favour of H1 is good because it means that the IVs in your model improve the fit of the model.
(Unlike a t-test, which can assess only one regression coefficient at a time, the F-test can assess multiple coefficients
simultaneously)
Naïve prediction: a prediction with only an intercept and no IVs → intercept-only model
We use the highest point of the normal distribution (the mean) because the majority of points collect there.
The intercept-only model assumes that the best estimate of the dependent variable is the overall mean
of the sample.
E.g. Suppose you have the height data of all Dutch adult females. For a random Dutch adult female,
what is your best prediction of her height? In an intercept-only model you would use the mean of your
data.
In the intercept-only model, all of the fitted values equal the mean of the response variable. Therefore, if the p-value of the
F-test is significant, your regression model predicts the response variable better than the mean of the response.
To validate the overall significance of the model, we verify whether the model is better than a naive prediction without any IVs.
→ we do this with the overall model significance: is the model going to be useful at all?
→ we compare the model to the naive prediction → if we reject the null hypothesis, it means that our model fits better than the naive prediction →
we can keep going
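The comparison with the naive prediction can be sketched in plain Python as follows. The data are hypothetical, and the sketch only computes the F-statistic. Turning it into a p-value needs the F distribution, which is not in the Python standard library, so the code compares against the tabulated 5% critical value for 1 and 4 degrees of freedom (about 7.71) instead.

```python
# Hypothetical price/sales data, invented for illustration only.
prices = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
sales = [10.1, 8.2, 5.9, 4.1, 2.2, 0.1]

n = len(sales)
k = 1  # number of IVs in the model (price only)

mean_price = sum(prices) / n
mean_sales = sum(sales) / n

# Least-squares slope and intercept for sales = a + b * price.
sxx = sum((x - mean_price) ** 2 for x in prices)
sxy = sum((x - mean_price) * (y - mean_sales) for x, y in zip(prices, sales))
b = sxy / sxx
a = mean_sales - b * mean_price

# Naive (intercept-only) model: every prediction is the mean of sales.
sse_naive = sum((y - mean_sales) ** 2 for y in sales)
# Regression model: predictions come from the fitted line.
sse_model = sum((y - (a + b * x)) ** 2 for x, y in zip(prices, sales))

# F compares the error removed per IV with the error left per residual degree of freedom.
f_stat = ((sse_naive - sse_model) / k) / (sse_model / (n - k - 1))

critical_5pct = 7.71  # tabulated F critical value for (1, 4) degrees of freedom
print(f"SSE naive: {sse_naive:.3f}, SSE model: {sse_model:.3f}, F = {f_stat:.1f}")
print("reject H0" if f_stat > critical_5pct else "cannot reject H0")
```

A large F means the IVs remove far more prediction error than would be expected by chance, so the model beats the mean-only prediction.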
4.2 Significance of coefficients
Now that you have checked whether the coefficients are jointly significant, you need to check the individual coefficients and test H0 (the coefficient equals zero) for specific IVs
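For a single coefficient, the test is usually a t-test: the estimate divided by its standard error. A minimal sketch in plain Python for the slope of a one-IV regression is given below. The data are hypothetical, and the comparison uses the tabulated two-sided 5% critical value for 4 degrees of freedom (about 2.776) rather than an exact p-value.

```python
import math

# Hypothetical price/sales data, invented for illustration only.
prices = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
sales = [10.1, 8.2, 5.9, 4.1, 2.2, 0.1]

n = len(sales)
mean_price = sum(prices) / n
mean_sales = sum(sales) / n

# Least-squares slope and intercept for sales = a + b * price.
sxx = sum((x - mean_price) ** 2 for x in prices)
sxy = sum((x - mean_price) * (y - mean_sales) for x, y in zip(prices, sales))
b = sxy / sxx
a = mean_sales - b * mean_price

# Residual variance uses n - 2 degrees of freedom (intercept and slope are estimated).
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(prices, sales))
residual_variance = sse / (n - 2)

# Standard error of the slope and the t-statistic for H0: b = 0.
se_b = math.sqrt(residual_variance / sxx)
t_stat = b / se_b

critical_5pct = 2.776  # tabulated two-sided t critical value for 4 degrees of freedom
print(f"b = {b:.3f}, SE = {se_b:.4f}, t = {t_stat:.1f}")
print("price is significant" if abs(t_stat) > critical_5pct else "price is not significant")
```

In a model with several IVs the same idea applies to each coefficient separately, although the standard errors then come from the full regression rather than this one-IV shortcut.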
4.3 Model fit
Now we check: How good is our model? How well does it predict?
→ we check R-squared, which measures the model fit or strength of association
In general, the higher the R-squared, the better the prediction
Example: which model is better?
Model 1: Y = a + b1X1 + e → R² = .80
Model 2: Y = a + b1X1 + … + b1000X1000 + e → R² = .85
→ Model 1: R-squared never decreases when IVs are added, so a gain of .05 from 999 extra IVs does not show that Model 2 predicts better
We use the adjusted coefficient of determination (Adjusted R-squared)
→ penalizing the number of IVs (k)
- To compare 2 regressions with different numbers of IVs (k)
- Used for model comparison
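The penalty for the number of IVs can be illustrated with the standard adjusted R-squared formula, 1 - (1 - R²)(n - 1)/(n - k - 1), sketched here in plain Python. The two R-squared values and IV counts come from the example above. The sample size is not given in the notes, so `n_obs = 1200` is purely an assumption, chosen only so that the 1,000-IV model can be estimated at all.

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R-squared for a model with k IVs fitted on n observations."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


n_obs = 1200  # hypothetical sample size; not stated in the notes

adj_model_1 = adjusted_r_squared(0.80, n_obs, 1)      # Model 1: 1 IV
adj_model_2 = adjusted_r_squared(0.85, n_obs, 1000)   # Model 2: 1000 IVs

print(f"Model 1 adjusted R-squared: {adj_model_1:.4f}")
print(f"Model 2 adjusted R-squared: {adj_model_2:.4f}")
print("Model 1 preferred" if adj_model_1 > adj_model_2 else "Model 2 preferred")
```

Under this assumed sample size, the penalty wipes out most of Model 2's fit, while Model 1 is barely affected, so the adjusted measure favours Model 1.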
Is an IV valuable for prediction?
- Test the significance of its individual coefficient