Methodology M&SM week content
explained
Content
Factor analysis
AN(C)OVA
Regression analysis
Partial Least Squares
Exam preparation
  Questions
  Answers
Factor analysis:
What is a factor analysis? The purpose of a factor analysis is to estimate a model that explains the
variance and/or covariance between a set of observed variables (in a population) by a set of (fewer)
unobserved factors and weightings. Factor analysis is an interdependence technique that defines the
structure among variables: it shows the interrelationships among a large number of variables in order
to identify underlying dimensions. It serves data summarization and data reduction. With factor
analysis you can increase the reliability and validity of measures by using multi-item measurement,
which allows you to assess measurement error, reliability, and validity. Measurement models come in
two forms: formative (emerging) and reflective (latent). In a reflective measurement model the
direction of causality runs from the construct to the measures, and measurement error is taken into
account at the item level. The validity of the items is usually tested with factor analysis.
When conducting a factor analysis, all the methods
follow a similar process, from problem formulation to
model fit (see image).
1. Problem formulation. Here the objectives of the factor analysis should be identified: data
summarization and/or data reduction. The variables should be measured on a ratio and/or interval
scale. As a rule of thumb, the sample size should be at least 4 to 5 times the number of variables.
Within factor analysis you make a distinction between:
- Exploratory factor analysis. Here you are looking for an underlying structure. The assumption is
that superior (underlying) factors cause the correlations between variables. The goal is to reveal
interrelationships or to generate hypotheses.
- Confirmatory factor analysis. Here you start from an a priori idea of the underlying factors,
derived from theory. You specify the relationships between variables and factors before conducting
the factor analysis, and the analysis then tests these hypotheses.
2. Construct the correlation matrix. The analytical process is based on a matrix of correlations
between the variables. Useful statistics at this stage are:
- Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy. It should be above .5, and the closer to 1
the better.
- Bartlett’s test of sphericity. This tests the null hypothesis that the variables are
uncorrelated in the population. Its p-value should be smaller than .05, so that this null
hypothesis can be rejected.
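The two statistics from step 2 can be computed directly from the correlation matrix. The sketch below is a minimal illustration, not a reference implementation: the data are simulated (six items driven by two factors), and the function names `simulate_items`, `bartlett_sphericity` and `kmo` are hypothetical helpers introduced only for this example.

```python
import numpy as np
from scipy import stats


def simulate_items(n_obs=300, seed=0):
    """Hypothetical data: six standardized items driven by two uncorrelated factors."""
    rng = np.random.default_rng(seed)
    factors = rng.standard_normal((n_obs, 2))
    loadings = np.array([[0.9, 0.0], [0.9, 0.0], [0.9, 0.0],
                         [0.0, 0.7], [0.0, 0.7], [0.0, 0.7]])
    # Unique (error) part scaled so that every item has unit variance.
    unique_sd = np.sqrt(1.0 - np.sum(loadings ** 2, axis=1))
    return factors @ loadings.T + rng.standard_normal((n_obs, 6)) * unique_sd


def bartlett_sphericity(data):
    """Bartlett's test: H0 = the population correlation matrix is an identity matrix."""
    n_obs, n_vars = data.shape
    corr = np.corrcoef(data, rowvar=False)
    _, log_det = np.linalg.slogdet(corr)
    chi_square = -(n_obs - 1 - (2 * n_vars + 5) / 6) * log_det
    dof = n_vars * (n_vars - 1) / 2
    p_value = stats.chi2.sf(chi_square, dof)
    return chi_square, p_value


def kmo(data):
    """Overall KMO: squared correlations relative to squared correlations plus
    squared partial correlations (off-diagonal elements only)."""
    corr = np.corrcoef(data, rowvar=False)
    inv = np.linalg.inv(corr)
    scale = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / scale
    off_diagonal = ~np.eye(corr.shape[0], dtype=bool)
    r2 = np.sum(corr[off_diagonal] ** 2)
    q2 = np.sum(partial[off_diagonal] ** 2)
    return r2 / (r2 + q2)


data = simulate_items()
chi_square, p_value = bartlett_sphericity(data)
kmo_value = kmo(data)
print(f"KMO = {kmo_value:.2f} (should be above .5)")
print(f"Bartlett chi-square = {chi_square:.1f}, p = {p_value:.4f} (should be below .05)")
```

With real survey data you would replace `simulate_items()` by your own observations-by-variables array. Statistical packages such as SPSS report the same two statistics.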
3. Selecting an extraction method. This is an important step within the factor analysis process.
There are two major types of extraction methods:
- Principal component analysis. Here you look at the total variance in the data. The
diagonal of the correlation matrix consists of unities, so the full variance is brought into the
factor matrix. The primary concern of this analysis is to find the minimum number of factors that
will account for the maximum variance; these factors are called principal components. Within this
analysis each variable is, mathematically, expressed as a linear combination of the components.
The covariation among the variables is described in terms of a small number of principal
components. If the variables are standardized, the principal component model may be represented as
$$X_i = A_{i1} C_1 + A_{i2} C_2 + \cdots + A_{im} C_m$$
where $X_i$ is the i-th standardized variable, $C_j$ is the j-th principal component, $A_{ij}$ is
the weight (loading) of variable i on component j, and m is the number of components.
- Common factor analysis. Here the factors are estimated based only on the common variance.
Communalities are inserted in the diagonal of the correlation matrix. The primary concern of
common factor analysis is to identify the underlying dimensions and their common variance. This
method is also known as principal axis factoring. Each variable is, mathematically, expressed as a
linear combination of underlying factors. The covariation among the variables is described in terms
of a small number of common factors plus a unique factor for each variable. If the variables are
standardized, the factor model may be represented as
$$X_i = A_{i1} F_1 + A_{i2} F_2 + \cdots + A_{im} F_m + V_i U_i$$
where $X_i$ is the i-th standardized variable, $F_j$ is the j-th common factor, $A_{ij}$ is the
loading of variable i on common factor j, $U_i$ is the unique factor for variable i, $V_i$ is its
weight, and m is the number of common factors.
When looking at a factor matrix, you want the factor loadings (the correlations between each
variable and the factor) to be at least around 0.5, since loadings above 0.5 are considered
significant. Ideally, a loading is above 0.7.
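The difference between the two extraction methods in step 3 comes down to what sits on the diagonal of the correlation matrix: unities for principal components, communalities for common factors. The sketch below illustrates this under stated assumptions: the data are simulated, all function names are hypothetical, and the iterative scheme (start from squared multiple correlations, then re-estimate the communalities a fixed number of times) is one common way to do principal axis factoring, not necessarily what a given statistics package does.

```python
import numpy as np


def simulate_items(n_obs=300, seed=0):
    """Hypothetical data: six standardized items driven by two uncorrelated factors."""
    rng = np.random.default_rng(seed)
    factors = rng.standard_normal((n_obs, 2))
    loadings = np.array([[0.9, 0.0], [0.9, 0.0], [0.9, 0.0],
                         [0.0, 0.7], [0.0, 0.7], [0.0, 0.7]])
    unique_sd = np.sqrt(1.0 - np.sum(loadings ** 2, axis=1))
    return factors @ loadings.T + rng.standard_normal((n_obs, 6)) * unique_sd


def sorted_eigen(matrix):
    """Eigenvalues and eigenvectors of a symmetric matrix, largest eigenvalue first."""
    values, vectors = np.linalg.eigh(matrix)
    order = np.argsort(values)[::-1]
    return values[order], vectors[:, order]


def principal_component_loadings(corr, n_factors):
    """Principal components: the diagonal of the correlation matrix stays at 1."""
    values, vectors = sorted_eigen(corr)
    return vectors[:, :n_factors] * np.sqrt(values[:n_factors])


def principal_axis_loadings(corr, n_factors, n_iter=50):
    """Common factors: the diagonal is replaced by communality estimates."""
    # Starting values: squared multiple correlation of each variable with the rest.
    communalities = 1.0 - 1.0 / np.diag(np.linalg.inv(corr))
    for _ in range(n_iter):
        reduced = corr.copy()
        np.fill_diagonal(reduced, communalities)
        values, vectors = sorted_eigen(reduced)
        kept = np.clip(values[:n_factors], 0.0, None)
        loadings = vectors[:, :n_factors] * np.sqrt(kept)
        communalities = np.sum(loadings ** 2, axis=1)
    return loadings


corr = np.corrcoef(simulate_items(), rowvar=False)
pca_loadings = principal_component_loadings(corr, n_factors=2)
paf_loadings = principal_axis_loadings(corr, n_factors=2)

# Communality = sum of squared loadings of a variable across the retained factors.
pca_communalities = np.sum(pca_loadings ** 2, axis=1)
paf_communalities = np.sum(paf_loadings ** 2, axis=1)
print("PCA communalities:", np.round(pca_communalities, 2))
print("PAF communalities:", np.round(paf_communalities, 2))
```

Because principal component analysis brings the full variance into the factor matrix, its communalities come out higher in total than those of principal axis factoring, which only models the common variance.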
4. Determining the number of factors. This can be based on a priori determination, eigenvalues
(>1), the scree plot, the percentage of variance (in total >60%) and/or split-half reliability.
With the eigenvalue criterion, you retain the factors that have an eigenvalue higher than 1. With
a scree plot, you look for the point where the curve bends (the elbow) and levels off. With the
percentage of variance, you want a cumulative variance of 60% or more. It can happen that you need
to rotate the factors (for interpretation reasons). After rotation, each factor should have
nonzero, or significant, loadings or coefficients for only some of the variables. Likewise, you
would like each variable to have nonzero or significant loadings with only a few factors and, if
possible, with only one. Imagine the factors as axes that are skewed relative to the data: to make
the axes fit the actual data points better, the program rotates them. Because of this the factors
become more easily interpretable. There are two types of rotation:
- Orthogonal. When the axes are maintained at an angle of 90 degrees, the rotation is called
orthogonal; the most commonly used orthogonal method is Varimax. You use this rotation when you
assume the factors are not correlated (this should be based on theoretical considerations).
- Oblique. When the axes are not maintained at an angle of 90 degrees, the rotation is called
oblique; a commonly used oblique method is Oblimin. You use this rotation when you allow the
factors to be correlated (this should be based on theoretical considerations).
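The criteria for the number of factors and the effect of an orthogonal rotation can be sketched as follows. This is an illustration under assumptions: the data are simulated, the helper names are hypothetical, unrotated loadings are extracted with principal components, and `varimax` is a compact version of the widely used SVD-based Varimax iteration rather than the exact routine of any particular package.

```python
import numpy as np


def simulate_items(n_obs=300, seed=0):
    """Hypothetical data: six standardized items driven by two uncorrelated factors."""
    rng = np.random.default_rng(seed)
    factors = rng.standard_normal((n_obs, 2))
    loadings = np.array([[0.9, 0.0], [0.9, 0.0], [0.9, 0.0],
                         [0.0, 0.7], [0.0, 0.7], [0.0, 0.7]])
    unique_sd = np.sqrt(1.0 - np.sum(loadings ** 2, axis=1))
    return factors @ loadings.T + rng.standard_normal((n_obs, 6)) * unique_sd


def number_of_factors(eigenvalues, min_cumulative=0.60):
    """Eigenvalue > 1 rule and the 60% cumulative variance rule.
    Expects eigenvalues sorted from largest to smallest."""
    kaiser = int(np.sum(eigenvalues > 1.0))
    cumulative = np.cumsum(eigenvalues) / np.sum(eigenvalues)
    by_variance = int(np.argmax(cumulative >= min_cumulative)) + 1
    return kaiser, by_variance


def unrotated_loadings(corr, n_factors):
    """Principal component loadings for the first n_factors components."""
    values, vectors = np.linalg.eigh(corr)
    order = np.argsort(values)[::-1][:n_factors]
    return vectors[:, order] * np.sqrt(values[order])


def varimax(loadings, max_iter=100, tol=1e-6):
    """Orthogonal (Varimax) rotation: the axes stay at 90 degrees."""
    n_vars, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        column_ss = np.diag(np.sum(rotated ** 2, axis=0))
        u, s, vt = np.linalg.svd(
            loadings.T @ (rotated ** 3 - rotated @ column_ss / n_vars)
        )
        rotation = u @ vt
        new_criterion = np.sum(s)
        if criterion != 0.0 and new_criterion / criterion < 1.0 + tol:
            break
        criterion = new_criterion
    return loadings @ rotation


corr = np.corrcoef(simulate_items(), rowvar=False)
eigenvalues = np.sort(np.linalg.eigvalsh(corr))[::-1]
n_kaiser, n_variance = number_of_factors(eigenvalues)
loadings = unrotated_loadings(corr, n_kaiser)
rotated = varimax(loadings)

print("Eigenvalues:", np.round(eigenvalues, 2))
print("Factors by eigenvalue > 1:", n_kaiser, "| by 60% variance:", n_variance)
print("Rotated loadings:")
print(np.round(rotated, 2))
```

An orthogonal rotation only redistributes the variance over the factors: the communality of each variable (the row sum of squared loadings) is the same before and after rotation. An oblique rotation such as Oblimin is not covered by this sketch.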