Introduction lesson
1.1. Defining multivariate analysis
Multivariate analysis refers to all statistical methods that simultaneously analyze multiple
measurements on each individual or object under investigation.
➔ Almost every real-life marketing problem requires statistical analysis of several variables: you
need these methods in your toolkit!
➔ Crucial for Master Thesis:
o Translate marketing problem
o Collect data
o Analyze using R
1.2. Some basic concepts
1.2.1. Measurement Scales
- Nonmetric: nominal, ordinal
- Metric: interval, ratio
Interval
- No absolute 0 point
- 7-point Likert scale
- Arithmetic average, range, standard deviation, product-moment correlation + all previous
methods
Ratio
- Absolute 0 point
- Age, cost, number of customers
- Geometric average, coefficient of variation + all previous methods
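The difference between the two metric scales can be sketched in code. The course itself uses R; the example below uses Python's standard library purely as an illustration, and the data values and function names (`interval_summary`, `ratio_summary`) are made up for this sketch.

```python
import statistics

# Hypothetical 7-point Likert answers (interval scale, no absolute zero).
likert = [5, 6, 4, 7, 5, 3, 6]
# Hypothetical customer ages in years (ratio scale, absolute zero).
ages = [22, 35, 41, 28, 54]


def interval_summary(values):
    """Statistics that are meaningful from the interval level upward."""
    return {
        "mean": statistics.mean(values),
        "range": max(values) - min(values),
        "sd": statistics.stdev(values),  # sample standard deviation
    }


def ratio_summary(values):
    """Adds the statistics that only make sense with an absolute zero."""
    summary = interval_summary(values)
    summary["geometric_mean"] = statistics.geometric_mean(values)
    summary["cv"] = summary["sd"] / summary["mean"]  # coefficient of variation
    return summary


print(interval_summary(likert))
print(ratio_summary(ages))
```

The geometric average and the coefficient of variation are computed only for the ratio-scaled variable, because both rely on ratios of values and those need a true zero point.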
1.2.2. Errors: reliability and validity
- Reliability = is the measure consistent? = the degree to which multiple measurements give
the same results → test-retest
- Validity = does the measure capture the concept it is supposed to measure? = the degree to
which the scores of a measure represent the variable they are intended to represent
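One common way to quantify test-retest reliability is the product-moment correlation between two measurements of the same respondents. The sketch below assumes that approach; the scores and the helper name `pearson_r` are hypothetical, and Python is used here only for illustration (the course works in R).

```python
import math


def pearson_r(x, y):
    """Product-moment correlation between two equally long lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores of the same six respondents at two points in time.
test = [4, 5, 3, 6, 2, 5]
retest = [4, 6, 3, 5, 2, 5]

# A value close to 1 suggests the measure gives consistent results.
print(round(pearson_r(test, retest), 3))
```

A high correlation only speaks to reliability; it says nothing about whether the measure captures the intended concept (validity).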
1.2.3. Statistical Significance and Power
Hypothesis testing
= testing whether something is different from 0. For example: “does advertising affect sales?”
The risk: you decide that there is a difference, but there is none in reality. How can we make
that risk smaller? → You set a cutoff: your measure (the test statistic) has to be higher than
the cutoff before you conclude that there is a difference.
- You conclude that something is different, but in reality it is not = type 1 error (false positive)
- You conclude that there is no difference, but in reality there is = type 2 error (false negative)
- We try to limit both types of mistakes. The way we do that:
o Limit the type 1 error rate to 5% (alpha = 0.05)
o Live with the fact that we can make a type 2 error
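The two error types can be made concrete with a small simulation. This is a minimal sketch under stated assumptions: a two-sided z-test with known standard deviation, normally distributed data, and made-up settings (`n=30`, a true mean of 0.3 for the "difference exists" case). All names are hypothetical, and Python stands in for the R used in the course.

```python
import random
from statistics import NormalDist, mean


def z_test_rejects(sample, sigma, alpha=0.05):
    """Two-sided z-test of H0: true mean = 0, with known sigma."""
    z = mean(sample) / (sigma / len(sample) ** 0.5)
    cutoff = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    return abs(z) > cutoff


def rejection_rate(true_mean, n=30, sigma=1.0, alpha=0.05, runs=4000, seed=1):
    """Share of simulated samples in which the test finds a difference."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(runs):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        rejections += z_test_rejects(sample, sigma, alpha)
    return rejections / runs


# No difference in reality: every rejection is a type 1 error.
type1_rate = rejection_rate(true_mean=0.0)
# A difference exists in reality: every non-rejection is a type 2 error.
type2_rate = 1 - rejection_rate(true_mean=0.3)

print(type1_rate, type2_rate)
```

With the cutoff tied to alpha = 0.05, the simulated type 1 rate should land near 5%, while the type 2 rate is whatever the effect size and sample size leave you with.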
Power
Power = the probability of finding a difference that exists in reality (1 − the probability of a
type 2 error). Power depends on:
- Alpha (α) → (+) = if you are willing to accept a higher type 1 error, the power will be higher
- Effect size → (+) = if the difference is bigger in reality, you have a higher chance of finding that
difference in your test
- Sample size (n) → (+) = with a bigger sample size, your test will have a higher power
Implications:
- Anticipate consequences of alpha, effect and n
- Assess/incorporate power when interpreting results
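To anticipate how alpha, effect size and n move power, the relationship can be computed directly for a simple case. The sketch below assumes a two-sided one-sample z-test with the effect size expressed in standard deviation units; the function name `z_test_power` and the chosen input values are illustrative only, and Python is used instead of the course's R.

```python
from statistics import NormalDist


def z_test_power(effect_size, n, alpha=0.05):
    """Power of a two-sided one-sample z-test.

    effect_size is the true difference from 0 in standard deviation units.
    """
    std_normal = NormalDist()
    cutoff = std_normal.inv_cdf(1 - alpha / 2)
    shift = effect_size * n ** 0.5
    # Probability that the test statistic lands beyond either cutoff.
    return (1 - std_normal.cdf(cutoff - shift)) + std_normal.cdf(-cutoff - shift)


baseline = z_test_power(effect_size=0.3, n=30, alpha=0.05)
higher_alpha = z_test_power(effect_size=0.3, n=30, alpha=0.10)
bigger_effect = z_test_power(effect_size=0.5, n=30, alpha=0.05)
bigger_sample = z_test_power(effect_size=0.3, n=100, alpha=0.05)

for label, value in [
    ("baseline", baseline),
    ("higher alpha", higher_alpha),
    ("bigger effect", bigger_effect),
    ("bigger sample", bigger_sample),
]:
    print(f"{label}: {value:.3f}")
```

Each of the three variations raises power relative to the baseline, which matches the (+) signs in the list above.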
1.3. Types of Multivariate Methods
Dependence Techniques
- 1 or more variables can be identified as dependent variables and the remaining as
independent variables
- Choice of dependence technique depends on the number of dependent variables involved in
the analysis
Interdependence Techniques
- The whole set of interdependent relationships is examined
- Further classified by whether the focus is on variables or objects
Highlights of Chapter 2 – Self-study!
2.1. Conduct preliminary analysis: graphical inspection and simple analyses
Why?
- Get a feel for the data
- Suggest possible problems (and remedies) in next step
How?
- Univariate profiling
- Bivariate analysis
2.2. Detect outliers
How can we detect outliers?
- Univariate
- Bivariate
- Multivariate
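As an illustration of the univariate case only, one common approach is to standardize a variable and flag observations with extreme standardized scores. The sketch below assumes that approach; the threshold of 2.5, the sales figures and the function name `univariate_outliers` are assumptions for this example, not taken from the lecture. Python is used in place of R.

```python
import statistics


def univariate_outliers(values, threshold=2.5):
    """Return the values whose standardized score exceeds the threshold."""
    m = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    return [v for v in values if abs((v - m) / sd) > threshold]


# Hypothetical weekly store sales with one suspicious observation.
sales = [102, 98, 110, 95, 105, 99, 101, 97, 103, 100, 104, 96, 250]

print(univariate_outliers(sales))
```

Bivariate and multivariate detection look at combinations of variables and need other tools (for example scatterplots or distance measures), which this sketch does not cover.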
2.3. Examining missing data
Missing data leads to:
- Reduced sample size (respondents cannot be included in the sample)
- Possibly biased outcomes if the missing data process is not random
➔ 4 step approach for identification and remedying
1. Determine type of missing data → ignorable or non-ignorable missing data?
2. Determine extent (%) of missing data → by variable, case, overall
3. Diagnose randomness of missing data → systematic, missing at random, missing completely
at random
4. Deal with the missing data problem → remove cases or variables with missing values, or use
imputation (replace missing observations, e.g. by an average)
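Steps 2 and 4 can be sketched in code. The example below is a minimal illustration with a made-up survey data set in which `None` marks a missing answer; all variable and function names are hypothetical, and Python is used here instead of the course's R. It does not cover step 3 (diagnosing randomness).

```python
import statistics

# Hypothetical survey data; None marks a missing answer.
rows = [
    {"age": 25, "income": 3000, "satisfaction": 5},
    {"age": None, "income": 4200, "satisfaction": 6},
    {"age": 41, "income": None, "satisfaction": None},
    {"age": 33, "income": 3900, "satisfaction": 4},
]
variables = ["age", "income", "satisfaction"]


def share_missing(values):
    """Fraction of entries that are missing."""
    return sum(v is None for v in values) / len(values)


# Step 2: extent of missing data by variable, by case and overall.
by_variable = {v: share_missing([r[v] for r in rows]) for v in variables}
by_case = [share_missing([r[v] for v in variables]) for r in rows]
overall = share_missing([r[v] for r in rows for v in variables])


# Step 4, option A: remove cases with any missing value.
def drop_incomplete_cases(data):
    return [r for r in data if None not in r.values()]


# Step 4, option B: replace missing values of one variable by its average.
def impute_mean(data, variable):
    observed = [r[variable] for r in data if r[variable] is not None]
    fill = statistics.mean(observed)
    return [{**r, variable: fill if r[variable] is None else r[variable]} for r in data]


print(by_variable, by_case, overall)
print(len(drop_incomplete_cases(rows)), impute_mean(rows, "age")[1]["age"])
```

Dropping cases halves this toy sample, which shows the reduced sample size mentioned above; mean imputation keeps all cases but is only defensible when the missing data process is random.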
Step 3: Diagnose the randomness of missing data
Lecture 2: ANOVA
Step 1: Defining Objectives
Test whether the treatments (categorical variables) lead to different levels for a (set of) metric
outcome variables, for example:
- Does online ad design, in particular: position of picture and logo, affect the click-through rate
(DV)?
- How do visit frequency (1 or 2 visits a year) and use of samples (yes/no) affect physician
prescriptions (DV)?
- How does promo activity affect store sales and traffic (DV)?
➔ The DV is metric (interval/ratio scale) and the drivers (the input variables) are non-metric:
they have to take on discrete values (nominal/ordinal).
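The core idea of testing whether treatments lead to different levels of a metric outcome can be sketched with the one-way ANOVA F statistic: the variance between group means divided by the variance within groups. The example below is an illustration only; the click-through rates are made up, the function name `one_way_anova_f` is hypothetical, and Python is used instead of the course's R.

```python
def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA on a list of groups of metric values."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of the observations around their own group mean.
    ss_within = sum(
        (v - m) ** 2 for g, m in zip(groups, group_means) for v in g
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical click-through rates (%) for three ad designs (the treatments).
ctr_by_design = [
    [2.1, 2.4, 1.9, 2.2],
    [2.8, 3.1, 2.9, 3.0],
    [2.0, 2.3, 2.1, 2.2],
]

print(one_way_anova_f(ctr_by_design))
```

A large F value means the group means differ a lot relative to the noise within groups; whether it is large enough is judged against the F distribution with the two degrees of freedom computed in the function.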
Overview of approaches
➔ Example: analysis of store sales and traffic → How does promo activity affect store sales
and traffic? → 2 drivers: coupon activity (1 = 20 euro/visit, 2 = none) and promotion
intensity (1 = high, 2 = medium, 3 = low)
➔ The data set (shown as a picture in the lecture) has one row per store (30 in total) and one
column per variable. Rating column = the wealth of the store's region, on a scale from 1 to 10.
When we look at this set there are different questions: