AMDA FALL summary

All video lectures (excluding the really practical examples done in SPSS and R). It includes all the slides and extensive explanations made by the lecturer in a clear way. It is 63 pages long.

  • 15 January 2022
  • 61 pages
  • 2021/2022
  • Summary

AMDA FALL SUMMARY
2021 – 2022
Lysanne Groenewegen
Leiden University

TOPIC 1: PCA & CFA

Lecture 1: PCA
PCA is very widely applied: in the social sciences we use it for data reduction, questionnaire design, etc.

Questions PCA can help answer (e.g. which questions predict the overall variable better):
- If you have 4 single variables that each measure SES in some way, can’t you just take one average of
them?
o No, you have to look at how much each individual variable contributes to the overarching
construct.
- How many sub concepts of intelligence can we distinguish?
o Here we use PCA to see whether there are groups among the items with which we test
intelligence.
- What chemicals possess similar properties under heat / pressure / …?
- Quantify ethnic spread in (sub)populations

In a graph:
• The arrows are the variables
• The dots are observations
• The horizontal and vertical axis are principal components




For example:
With PCA (e.g. on genetic makeup) you can correct for certain variables: based on people’s genetic
information, you can pinpoint their location on a map. It is very important to realize that there can be only
one first (principal) component, which is the component that explains the biggest percentage of the
overarching construct. We will dive deeper into this later.

Scale construction:
Sometimes the natural groups that PCA finds do not match your theoretical groups (then rephrase the questions)
• You have an idea on how to design a questionnaire for some concept
• You design questions that address several sub concepts
• But are the supposed items actually adequate in their subdomain?
• Which items do you choose for the subscales of your instrument, and how reliable are these subscales?

E.g. intelligence:
- 1 concept? (general intelligence)
- 2 concepts? (verbal and performance)
- 3 concepts? (verbal, performance, freedom from distractibility)
(This lecture: examples of scale construction, but the same holds for dimension reduction)

PCA
PCA is a method with which we can get an idea of any underlying structure that is not assumed in advance.
This means that we use PCA on datasets for which we have no theoretical background (or no assumed
structure yet). PCA is therefore a bottom-up, exploratory method, which also visualizes our dataset nicely.
With CFA, on the other hand, we do have a pre-assumed structure and a theoretical background. CFA is
therefore called a top-down, confirmatory method. We will get into this in lecture 2.

When you do PCA in SPSS, it computes a correlation matrix and starts to decompose that correlation matrix
(the backbone of PCA). By doing so, you lose the information on your individuals, because your unit of
analysis has become the correlation between variables instead of an observation of a subject on a
variable. So the unit has changed to bivariate associations, and we want to find groups of high bivariate
associations.
- We explore the data for a structure with PCA
- External (theoretical) knowledge is used afterwards for interpretation
- Analyses are performed in SPSS or R
- From data/model to theory
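
The step from raw scores to a correlation matrix can be sketched in a few lines of Python. This is only an illustration, not what SPSS does internally: the function names and the scores are made up, and the data layout (one list of scores per variable) is an assumption.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def correlation_matrix(columns):
    """columns: one list of scores per variable (hypothetical layout)."""
    return [[pearson(a, b) for b in columns] for a in columns]

# Made-up scores of 5 subjects on 3 variables.
data = [
    [1, 2, 3, 4, 5],
    [2, 4, 6, 8, 10],  # moves perfectly together with the first variable
    [5, 3, 4, 1, 2],
]
R = correlation_matrix(data)
for row in R:
    print([round(r, 2) for r in row])
```

Note that R has one row and column per variable: the five subjects are no longer visible in it, which is the loss of individual information described above.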

Explore with PCA, test with CFA
Test, in CFA, the structure that was suggested by PCA. What PCA will do is compute and analyze the
correlation structure (correlations between variables). Subgroups can be found by searching for
variables that correlate highly with each other, but not with the variables of other subgroups (= you find
groups of highly correlating variables).

Questions:
- How strong should the variables correlate with each other?
- How many subgroups exist in my dataset?
- If I find groups, which variables belong to which group (component)?
Suppose this stage is successful and you find a structure that is reasonable. Then you can actually test (in
a slightly stricter way) with CFA whether this model is really useful for predictions or real explanations
(instead of being just a suggestion). PCA gives you the best suggestion it can give, but the best suggestion
might still be complete rubbish (it just cannot do any better). This does not happen often, and when it
does, your dataset is usually not very sound.




8 predictor variables
How do you combine these into a single score?


This is an example where the subscales (A & B) are already known, but it is an unweighted, very
straightforward scale construction: the linear unweighted sum (linear because it is a sum, unweighted
because every variable contributes to the sum score with exactly the same weight, 1). This means that all
the A variables are in one scale and all the B variables are in another scale.
- A variable is either in or out of a scale
- Weights c are either 1 (in) or 0 (out)
- Variables determine scale interpretation
- Equal or no contribution to construct
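
As a minimal sketch of such an unweighted sum score: the item names A1..A4 and B1..B4 and the answers below are hypothetical, chosen only to show the 1 (in) / 0 (out) weights at work.

```python
# Hypothetical items: A1..A4 belong to scale A, B1..B4 to scale B.
items = ["A1", "A2", "A3", "A4", "B1", "B2", "B3", "B4"]

# Weight 1 = the item is in the scale, weight 0 = the item is out.
weights_a = [1, 1, 1, 1, 0, 0, 0, 0]
weights_b = [0, 0, 0, 0, 1, 1, 1, 1]

def sum_score(answers, weights):
    """Linear unweighted sum: every included item counts exactly once."""
    return sum(w * a for w, a in zip(weights, answers))

# Made-up answers of one respondent, in the same order as `items`.
answers = [4, 5, 3, 4, 2, 1, 2, 3]
score_a = sum_score(answers, weights_a)  # 4 + 5 + 3 + 4
score_b = sum_score(answers, weights_b)  # 2 + 1 + 2 + 3
print(score_a, score_b)
```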

But is it really true that all the A items contribute equally to the scale? How do we find out?
- More subtle inclusion (or exclusion)
- Weighted contribution to component
- c can be anything between 0 and 1
- some variables are more important, others less

- Values c < .30: exclusion
- Values c > .30: inclusion?
- Values c > .50: inclusion
- Values c > .80: for clinical instruments
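
The rule-of-thumb cut-offs above can be written down as a small helper. This is a sketch under two assumptions that the notes do not settle: only the size of a loading matters (not its sign), and a loading of exactly .30 counts as exclusion. The item names and loadings are made up.

```python
def judge_loading(c):
    """Return the rule-of-thumb verdict for a component loading c."""
    size = abs(c)  # assumption: the sign only indicates direction
    if size > 0.80:
        return "inclusion, also for clinical instruments"
    if size > 0.50:
        return "inclusion"
    if size > 0.30:
        return "inclusion?"
    return "exclusion"  # assumption: exactly .30 is treated as exclusion

# Hypothetical loadings of four items on one component.
loadings = {"A1": 0.85, "A2": 0.62, "A3": 0.35, "A4": 0.12}
verdicts = {item: judge_loading(c) for item, c in loadings.items()}
for item, verdict in verdicts.items():
    print(item, loadings[item], verdict)
```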

How do I get these numbers?
The c weights are component loadings. Finding these weights only works under certain conditions.
- A component is a linear combination of items
- PCA searches for these linear combinations such that Cronbach’s α (the reliability coefficient for
each of the subgroups, i.e. for each of these components) is as high as possible (largest possible
variance of all combinations). A high reliability coefficient means that there is a lot of
joint/explained variance in this group of variables. There is only one first principal component:
the single component that is the most important one
- Second, find the next highest combinations, such that they are uncorrelated with all previous
components
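
To make "the linear combination with the largest possible variance" concrete, here is a rough sketch that extracts the first component from a correlation matrix. It uses power iteration, which is a technique chosen here for illustration and not necessarily what SPSS or R use; the correlation matrix is made up, and the function name is hypothetical.

```python
from math import sqrt

def first_component(R, iterations=200):
    """Largest eigenvalue of R and the loadings of the first component.

    Power iteration: repeatedly multiply a start vector by R and rescale.
    Caveat: this fails if the start vector happens to be orthogonal to the
    first eigenvector.
    """
    p = len(R)
    v = [1.0] * p
    for _ in range(iterations):
        w = [sum(R[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v'Rv gives the variance captured by the component.
    eigenvalue = sum(v[i] * sum(R[i][j] * v[j] for j in range(p)) for i in range(p))
    # Loadings = weights rescaled so they read as item-component correlations.
    loadings = [sqrt(eigenvalue) * x for x in v]
    return eigenvalue, loadings

# Made-up correlation matrix: three items that all correlate .60.
R = [
    [1.0, 0.6, 0.6],
    [0.6, 1.0, 0.6],
    [0.6, 0.6, 1.0],
]
eigenvalue, loadings = first_component(R)
print(round(eigenvalue, 2), [round(c, 2) for c in loadings])
```

A second component would be found the same way after removing what the first one explains, which is what keeps it uncorrelated with the first; that step is left out of this sketch.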

Visual explanation
Left box:
- Total amount of information that you have: imagine that we have 10 variables. If we analyze 10
variables using correlations, we are working on a standardized scale (because correlations are
standardized). So each variable has mean 0 and SD 1, and therefore its variance is also 1 (a
correlation is the standardized version of the covariance). Since we are analyzing the correlation
matrix of 10 variables, we have 10 standardized variables that each contribute 1 point of variance.
The total surface of this box is 10 points.
- We want to explain these 10 points of variance. PCA finds the weighted combination of variables that
takes the largest possible bite out of the surface of 10 information points. That does not explain
everything, so we can do the same trick again: we look for the component that explains the largest
amount of the remaining black part, which is always going to be lower than what component 1
explains.

The whole box is explained once we have found 10 components. If I use as many components as variables,
I will have 100% explained variance, but that is not data reduction, so we do not want that.
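
The box picture can be summarized in a formula. The symbols are assumptions introduced here, not notation from the lecture: p is the number of standardized variables (10 in the example) and λ_j is the number of variance points that component j takes out of the box.

```latex
\[
\sum_{j=1}^{p} \lambda_j = p,
\qquad
\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_p \ge 0,
\qquad
\text{share explained by the first } k \text{ components}
  = \frac{\lambda_1 + \dots + \lambda_k}{p}.
\]
```

With k = p the share is 1 (100%), which is the case without any data reduction.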

This whole story only works if we standardize (center and scale) the variables that we put into the
analysis. Picture this:
- 4 variables;
- the 1st variable is scored on a scale of 1 to 5;
- the 2nd variable is scored on a scale of 10 to 50;
- the 3rd variable has a range of 1 to 200,000;
Which of these variables has the largest variance? Obviously variable 3. In a component analysis, the
largest variance determines the component. If you don’t standardize, you use covariances instead of
correlations. In that case, your first component will be determined by the variable that simply has the
largest variance (variable 3, not because it is more important, but simply because its scale is so
different from the other ones).
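
A quick numeric check of this point, as a sketch with made-up scores on the three scales mentioned: before standardizing, the third variable dwarfs the others in variance; after standardizing, every variable contributes exactly 1 point.

```python
from statistics import mean, pstdev, pvariance

# Made-up scores on the three scales described above.
var1 = [1, 2, 3, 4, 5]                           # scale 1 to 5
var2 = [10, 20, 30, 40, 50]                      # scale 10 to 50
var3 = [1, 50_000, 100_000, 150_000, 200_000]    # range 1 to 200,000

def standardize(xs):
    """Center on the mean and divide by the SD (z-scores)."""
    m, s = mean(xs), pstdev(xs)
    return [(x - m) / s for x in xs]

raw_variances = [pvariance(v) for v in (var1, var2, var3)]
std_variances = [pvariance(standardize(v)) for v in (var1, var2, var3)]

print("raw:", raw_variances)
print("standardized:", [round(v, 6) for v in std_variances])
```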

Rules of thumb for final scale
We consider a scale ‘reasonable’ when its reliability α is
- > .70 for new scales
- > .80 for standardized instruments
- > .90 preferably, before we use new scales in practice (clinical practice)
In practice: think, test, rethink, retest, repeat.
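
For reference, the reliability coefficient can be computed by hand. The sketch below uses the standard formula for Cronbach's α, k/(k-1) · (1 - sum of item variances / variance of the sum score); the function name and the item scores are made up.

```python
from statistics import pvariance

def cronbach_alpha(items):
    """items: one list of scores per item, all over the same respondents."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # sum score per respondent
    item_variance = sum(pvariance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance / pvariance(totals))

# Made-up answers of 5 respondents on a 3-item scale.
scale = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]
alpha = cronbach_alpha(scale)
print(round(alpha, 2))
```

The result can then be compared with the .70 / .80 / .90 rules of thumb above. Using population or sample variances makes no difference here, because the ratio of the two variances is the same either way.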

Choosing a number of components
Final decision is based on
- Explained variance: e.g. if you have 2 components (from 15 variables) that explain 60% of
everything, then 9 of the 15 points of variance (so more than 8 points) are in the first 2
components. This tells you how strong the strongest part of your model is.
- Eigenvalue > 1 (use with care): the eigenvalue is the number of variance points a component covers.
An eigenvalue larger than 1 means that we have found a component (an aggregated score) that, as an
aggregate of several variables, explains more than a single variable would. But this is tricky: if
you have 200,000 variables, it is VERY easy to find components that explain more than 1 point of
variance due to sheer random chance of correlation (so the rule only works for small sets of
variables).
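
Both criteria are simple arithmetic once the eigenvalues are known. The sketch below uses made-up eigenvalues for 15 standardized variables (they sum to 15, the total number of variance points); the function names are hypothetical.

```python
def explained_share(eigenvalues, n_components):
    """Share of the total variance covered by the largest n_components."""
    total = sum(eigenvalues)  # equals the number of variables for correlations
    largest = sorted(eigenvalues, reverse=True)[:n_components]
    return sum(largest) / total

def count_eigenvalues_above_one(eigenvalues):
    """Number of components that explain more than a single variable would."""
    return sum(1 for ev in eigenvalues if ev > 1)

# Made-up eigenvalues for 15 variables.
eigenvalues = [6.0, 3.0, 1.2, 0.9, 0.8, 0.7, 0.6, 0.5,
               0.4, 0.3, 0.2, 0.15, 0.1, 0.1, 0.05]

share = explained_share(eigenvalues, 2)            # (6 + 3) / 15
n_kept = count_eigenvalues_above_one(eigenvalues)  # 6.0, 3.0 and 1.2
print(round(share, 2), n_kept)
```

In this made-up case the two criteria disagree (2 components for 60% versus 3 eigenvalues above 1), which is one reason the final decision needs judgment rather than a single rule.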
