Introduction Marketing Research summary
Chapter 1 Overview of Multivariate methods
Key terms
- Alpha (α) = see type 1 error
- Beta (β) = see type 2 error
- Bivariate partial correlation = simple (two-variable) correlation between two sets of residuals
(unexplained variances) that remain after the association of other independent variables is
removed.
- Bootstrapping = an approach to validating a multivariate model by drawing a large number of
subsamples and estimating models for each subsample. Estimates from all subsamples are then
combined, providing not only the ‘best’ estimated coefficients (e.g. means of each estimated
coefficient across all subsample models) but also their expected variability and thus their likelihood
of differing from zero; that is, are the estimated coefficients statistically different from zero or
not? This approach does not rely on statistical assumptions about the population to assess
statistical significance but instead makes its assessment based solely on the sample data.
- Composite measure= see summated scales
- Dependent technique= classification of statistical techniques distinguished by having a variable
or set of variables identified as the dependent variable and the remaining variables as
independent. The objective is prediction of the dependent variable by the independent variables.
An example is regression analysis.
- Dependent variable= presumed effect of, or response to, a change in the independent variable.
- Dummy variable= nonmetrically measured variable transformed into a metric variable by
assigning a 1 or 0 to a subject, depending on whether it possesses a particular characteristic.
- Effect size= estimate of the degree to which the phenomenon being studied exists in the
population.
- Independent variable= presumed cause of any change in the dependent variable.
- Indicator= single variable used in conjunction with one or more other variables to form a
composite measure.
- Interdependence technique= classification of statistical techniques in which the variables are
not divided into dependent and independent sets; rather, all variables are analyzed as a single
set.
- Measurement error= inaccuracies of measuring the ‘true’ variable values due to the fallibility of
the measurement instrument, data entry errors or respondent errors.
- Metric data= also called quantitative data, interval data, or ratio data, these measurements
identify or describe subjects not only on the possession of an attribute but also by the amount or
degree to which the subject may be characterized by the attribute. Examples are a person’s age
and weight.
- Multicollinearity= Extent to which a variable can be explained by other variables in the analysis.
As multicollinearity increases, it complicates the interpretation of the variate because it is more
difficult to ascertain the effect of any single variable, owing to their interrelationships.
- Multivariate analysis= analysis of multiple variables in a single relationship or set of
relationships.
- Multivariate measurement= Use of two or more variables as indicators of a single composite
measure. For example, a personality test may provide the answers to a series of individual
questions (indicators), which then are combined to form a single score (summated scale)
representing the personality trait.
- Nonmetric data= Also called qualitative data, these are attributes, characteristics, or categorical
properties that identify or describe a subject or object. They differ from metric data by
indicating the presence of an attribute, but not the amount. Also called nominal data or ordinal
data.
- Power= Probability of correctly rejecting the null hypothesis when it is false; that is, correctly
finding a hypothesized relationship when it exists. Determined as a function of (1) the statistical
significance level set by the researcher for a type 1 error, (2) the sample size used in the analysis
and (3) the effect size being examined.
- Practical significance= Means of assessing multivariate analysis results based on their
substantive findings rather than statistical significance. Whereas statistical significance
determines whether the result is attributable to chance, practical significance assesses whether
the result is useful in achieving the research objectives.
- Reliability= extent to which a variable or set of variables is consistent in what it is intended to
measure. If multiple measurements are taken, the reliable measures will all be consistent in
their values. It differs from validity in that it relates not to what should be measured, but instead
to how it is measured.
- Specification error= omitting a key variable from the analysis, thus affecting the estimated
effects of included variables.
- Summated scales= method of combining several variables that measure the same concept into a
single variable in an attempt to increase the reliability of the measurement through multivariate
measurement. In most instances, the separate variables are summed and then their total or
average score is used in the analysis.
- Treatment= independent variable the researcher manipulates to see the effect on the
dependent variable.
- Type 1 error= probability of incorrectly rejecting the null hypothesis; in most cases, it means
saying a difference or correlation exists when it actually does not. Also termed alpha (α).
- Type 2 error= probability of incorrectly failing to reject the null hypothesis; in simple terms, the
chance of not finding a correlation or mean difference when it does exist. Also termed beta (β); power = 1 - β.
- Univariate analysis of variance (ANOVA) = statistical technique used to determine, on the basis
of one dependent measure, whether samples are from populations with equal means.
- Validity = extent to which a measure or set of measures correctly represents the concept of a
study – the degree to which it is free from any systematic or nonrandom error. Validity is
concerned with how well the concept is defined by the measures, whereas reliability relates to
consistency of the measures.
- Variate= Linear combination of variables formed in the multivariate technique by deriving
empirical weights applied to a set of variables specified by the researcher.
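The bootstrapping entry in the key terms can be made concrete with a minimal sketch. It assumes a simple one-predictor regression and made-up data; the function names (`slope`, `bootstrap_slope`), the data, the number of subsamples and the percentile interval are illustrative assumptions, not part of the source text.

```python
import random
import statistics

def slope(xs, ys):
    """Ordinary least squares slope for a single predictor."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def bootstrap_slope(xs, ys, n_subsamples=2000, seed=1):
    """Re-estimate the slope on many subsamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(xs)
    estimates = []
    for _ in range(n_subsamples):
        idx = [rng.randrange(n) for _ in range(n)]
        sub_x = [xs[i] for i in idx]
        if len(set(sub_x)) < 2:  # slope is undefined if all x are equal
            continue
        estimates.append(slope(sub_x, [ys[i] for i in idx]))
    estimates.sort()
    mean_estimate = statistics.fmean(estimates)
    # Percentile interval: the middle 95% of the subsample estimates.
    low = estimates[int(0.025 * len(estimates))]
    high = estimates[int(0.975 * len(estimates)) - 1]
    return mean_estimate, (low, high)

# Hypothetical data: y depends on x with slope 0.5 plus random noise.
data_rng = random.Random(0)
xs = list(range(1, 31))
ys = [2.0 + 0.5 * x + data_rng.gauss(0, 1) for x in xs]

mean_slope, (low, high) = bootstrap_slope(xs, ys)
print("mean slope:", round(mean_slope, 3))
print("95% interval:", round(low, 3), "to", round(high, 3))
print("differs from zero:", low > 0 or high < 0)
```

If the interval built from the subsample estimates excludes zero, the coefficient is judged to differ from zero using the sample data alone, without distributional assumptions about the population.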
1.1 Multivariate analysis in statistical terms
- Multivariate analysis refers to all statistical techniques that simultaneously analyze multiple
measurements on individuals or objects under investigation.
- To be considered truly multivariate, all the variables must be random and interrelated in such ways that their different effects
cannot meaningfully be interpreted separately.
1.2 Some basic concepts of multivariate analysis
The variate
- A linear combination of variables with empirically determined weights. The variables are
specified by the researcher, whereas the weights are determined by the multivariate technique
to meet a specific objective.
- A variate of n weighted variables ($X_1$ to $X_n$) can be stated mathematically as:
o Variate value = $w_1X_1 + w_2X_2 + w_3X_3 + \dots + w_nX_n$
o Where $X_n$ is the observed variable and $w_n$ is the weight determined by the multivariate
technique.
- The result is a single value representing a combination of the entire set of variables that best
achieves the objective of the specific multivariate analysis.
- The variate captures the multivariate character of the analysis.
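As a small illustration of the variate formula, the sketch below computes one variate value for one observation. The weights and observed values are invented for the example; in practice the multivariate technique estimates the weights, and `variate_value` is a hypothetical helper name.

```python
def variate_value(weights, values):
    """Weighted sum w1*X1 + w2*X2 + ... + wn*Xn for one observation."""
    if len(weights) != len(values):
        raise ValueError("need exactly one weight per variable")
    return sum(w * x for w, x in zip(weights, values))

# Hypothetical weights (normally estimated by the technique, not chosen by hand).
weights = [0.5, 0.3, 0.2]
# Hypothetical observed values X1, X2, X3 for a single respondent.
observation = [4.0, 2.0, 5.0]

score = variate_value(weights, observation)
print(round(score, 2))  # 0.5*4.0 + 0.3*2.0 + 0.2*5.0
```

The three observed variables collapse into a single number per respondent, which is what the later analysis works with.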
Measurement scales
- The researcher must define the measurement type – nonmetric or metric – for each variable.
Nonmetric measurement scales
- Describe differences in type or kind by indicating the presence or absence of a characteristic or
property. There are two types: nominal and ordinal.
o Nominal = assigns numbers as a way to label or identify subjects or objects (categorical).
The numbers can only represent categories or classes and do not imply amounts of an attribute or
characteristic. Example: gender.
o Ordinal = variables can be ordered or ranked in relation to the amount of the attribute
possessed. They provide no measure of the actual amount or magnitude in absolute terms,
only the order of the values.
Metric measurement scales
- Used when subjects differ in amount or degree on a particular attribute. There are two types: interval and ratio.
Both have constant units of measurement, so differences between two adjacent points on any part of
the scale are equal.
o Interval = uses an arbitrary zero point (no true zero). Therefore it is not possible to
say that any value on an interval scale is a multiple of some other point on the scale.
Examples: the Fahrenheit and Celsius temperature scales.
o Ratio = has an absolute zero point, so values can be compared as multiples of one another.
Example: weight measured on a weighing machine.
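The difference between interval and ratio scales can be checked with a few lines of arithmetic. The sketch below uses invented temperatures and weights; the helper names and the rounded conversion factor for pounds are assumptions made for the illustration.

```python
def celsius_to_fahrenheit(c):
    return c * 9 / 5 + 32

def kg_to_pounds(kg):
    return kg * 2.20462  # approximate conversion factor

# Interval scale: the "ratio" of two temperatures depends on the unit used.
ratio_celsius = 20 / 10
ratio_fahrenheit = celsius_to_fahrenheit(20) / celsius_to_fahrenheit(10)

# Differences stay comparable: equal steps in Celsius are equal steps in Fahrenheit.
step_a = celsius_to_fahrenheit(20) - celsius_to_fahrenheit(10)
step_b = celsius_to_fahrenheit(30) - celsius_to_fahrenheit(20)

# Ratio scale: the ratio of two weights is the same in any unit.
ratio_kg = 80 / 40
ratio_pounds = kg_to_pounds(80) / kg_to_pounds(40)

print("temperature ratios:", ratio_celsius, ratio_fahrenheit)
print("temperature steps:", step_a, step_b)
print("weight ratios:", ratio_kg, round(ratio_pounds, 6))
```

Because the temperature ratio changes with the unit, a statement such as "twice as warm" has no meaning on an interval scale, while "twice as heavy" holds in kilograms and pounds alike.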
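Impact of choice of measurement scale
- The researcher must identify the measurement scale of each variable so that nonmetric data are not incorrectly used as metric data, and vice versa.
- The measurement scale is also critical in determining which multivariate techniques are the most applicable to the data.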
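One way to keep a nominal variable from being treated as metric is the dummy coding described in the key terms. The sketch below is a minimal illustration with made-up data; `dummy_code`, the `is_` column names and the choice of reference category are assumptions for the example.

```python
def dummy_code(values, reference):
    """Turn a nominal variable into 0/1 dummy variables.

    The reference category gets no column of its own: it is the case
    in which every dummy variable equals 0.
    """
    categories = sorted(set(values) - {reference})
    return [{f"is_{c}": int(v == c) for c in categories} for v in values]

# Hypothetical nominal variable: preferred store type of five respondents.
store_type = ["online", "mall", "online", "outlet", "mall"]
dummies = dummy_code(store_type, reference="mall")

for original, row in zip(store_type, dummies):
    print(original, row)
```

Coding the categories as 1, 2, 3 and averaging them would imply an amount that a nominal scale does not carry; the 0/1 columns only record presence or absence of each category.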
Measurement error and multivariate measurement
- Measurement error= the degree to which the observed values are not representative of the
‘true’ values. The ‘true’ effect is partially masked by the measurement error, causing the
correlations to weaken and the means to be less precise.
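The weakening of correlations by measurement error can be seen in a small simulation. This is a sketch under stated assumptions: the data are simulated, the error is purely random, and the sample size, seed and coefficients are arbitrary choices for the illustration.

```python
import random
import statistics

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(42)
n = 5000

# 'True' values of X and an outcome Y that depends on them.
true_x = [rng.gauss(0, 1) for _ in range(n)]
y = [0.8 * x + rng.gauss(0, 0.6) for x in true_x]

# Observed X = true X plus random measurement error.
observed_x = [x + rng.gauss(0, 1) for x in true_x]

r_true = pearson(true_x, y)
r_observed = pearson(observed_x, y)

print("correlation using true values:    ", round(r_true, 2))
print("correlation using observed values:", round(r_observed, 2))
```

The relationship between X and Y is unchanged, yet the correlation computed from the error-laden measurements comes out lower than the one computed from the true values.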
Validity and reliability
- Validity= the degree to which a measure accurately represents what it is supposed to.
- Reliability= the degree to which the observed variable measures the ‘true’ value and is ‘error
free’ (consistency).
Employing multivariate measurement
- Summated scales: several variables are joined in a composite measure to represent a
concept. The objective is to use several variables as indicators, all representing differing facets of
the concept, and so obtain a more well-rounded perspective.
- Multiple responses reflect the ‘true’ response more accurately than does a single response.
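The claim that multiple responses reflect the ‘true’ response better than a single response can be illustrated with simulated data. The sketch assumes five indicators that each equal the true trait plus independent random error; the number of items, the error size and the seed are arbitrary assumptions.

```python
import random
import statistics

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(7)
n_respondents, n_items = 3000, 5

# Simulated 'true' trait and five error-laden indicators per respondent.
true_trait = [rng.gauss(0, 1) for _ in range(n_respondents)]
items = [[t + rng.gauss(0, 1) for _ in range(n_items)] for t in true_trait]

single_item = [row[0] for row in items]              # one indicator only
summated = [statistics.fmean(row) for row in items]  # average of all five

r_single = pearson(single_item, true_trait)
r_scale = pearson(summated, true_trait)

print("single indicator vs true trait:", round(r_single, 2))
print("summated scale vs true trait:  ", round(r_scale, 2))
```

Averaging lets the random errors of the individual indicators partly cancel, so the summated scale tracks the true trait more closely than any single indicator.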
The impact of measurement error
- Reducing measurement error may improve weak or marginal results and strengthen proven
results as well.
1.3 Statistical significance versus statistical power
Types of statistical error and statistical power
- By specifying the alpha level, the researcher sets the acceptable limits for an error and indicates
the probability of concluding that significance exists when it really does not.
- As the Type 1 error becomes more restrictive (alpha moves closer to zero), the probability of a Type 2 error
increases. Reducing the Type 1 error therefore reduces the power of the statistical test.
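The trade-off between alpha and power can be sketched numerically. The example assumes a two-sided one-sample z-test with a known standard deviation and a standardized effect size; this test, the function name `power_two_sided_z` and the input values are illustrative choices, not taken from the source.

```python
from statistics import NormalDist

def power_two_sided_z(effect_size, n, alpha):
    """Approximate power of a two-sided one-sample z-test.

    effect_size is the true mean difference in standard deviation units.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)   # critical value for the chosen alpha
    shift = effect_size * n ** 0.5      # how far the test statistic is shifted
    # Probability that the statistic lands in either rejection region.
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)

# Hypothetical scenario: effect size 0.3 and 50 observations.
power_05 = power_two_sided_z(0.3, 50, alpha=0.05)
power_01 = power_two_sided_z(0.3, 50, alpha=0.01)
power_more_n = power_two_sided_z(0.3, 100, alpha=0.05)
power_bigger_effect = power_two_sided_z(0.5, 50, alpha=0.05)

print("alpha 0.05:        ", round(power_05, 2))
print("alpha 0.01:        ", round(power_01, 2))
print("alpha 0.05, n=100: ", round(power_more_n, 2))
print("alpha 0.05, d=0.5: ", round(power_bigger_effect, 2))
```

Under these assumptions a stricter alpha lowers power, while a larger sample or a larger effect size raises it, matching the three determinants of power listed in the key terms.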