Summary Statistics
Chapter 1 Why is my evil lecturer forcing me to learn statistics?
1.1 What will this chapter tell me?
1.2 What the hell am I doing here? I don’t belong here
When numbers are involved, the research involves quantitative methods. Qualitative and quantitative
research are complementary, not competing.
1.2.1 The research process
The processes of data collection and analysis and of generating theories are intrinsically linked:
theories lead to data collection/analysis, and data collection/analysis informs theories.
1.3 Initial observation: finding something that needs explaining
A lot of scientific endeavour starts by observing something in the world and
wondering why it happens. You need to collect data to see whether this observation is true. To do
this, you need to define one or more variables that you would like to measure.
1.4 Generating theories and testing them
You generate theories that need to be tested. A prediction from a theory is a hypothesis. Scientific
statements are ones that can be verified with reference to empirical evidence, whereas non-scientific
statements are ones that cannot be empirically tested. For example, ‘this book is the worst statistics
book ever’ is not a scientific statement: it cannot be proved or disproved. Falsification is the act of
disproving a hypothesis or theory. A good theory should do the following:
1. Explain the existing data.
2. Explain a range of related observations.
3. Allow statements to be made about the state of the world.
4. Allow predictions about the future.
5. Have implications.
1.5 Collect data to test your theory
We need to decide 1) what to measure and 2) how to measure it.
1.5.1 Variables
We need to measure variables. Variables are just things that can change. The key to testing
scientific statements is to measure two variables: a proposed cause and a proposed outcome. There
are different types of variable:
Independent variable: A variable thought to be the cause of some effect. This term is usually
used in experimental research to denote a variable that the experimenter has manipulated.
Dependent variable: A variable thought to be affected by changes in an independent variable.
You can think of this variable as an outcome.
Predictor variable: A variable thought to predict an outcome variable. This is basically another
term for independent variable (although some people won’t like me saying that; I think life would
be easier if we talked only about predictors and outcomes).
Outcome variable: A variable thought to change as a function of changes in a predictor variable.
This term could be synonymous with ‘dependent variable’ for the sake of an easy life.
Levels of measurement
Variables can be split into categorical and continuous, and within these types there are different
levels of measurement:
Categorical (entities are divided into distinct categories):
o Binary variable: There are only two categories (e.g., dead or alive, male or female).
o Nominal variable: There are more than two categories (e.g., whether someone is an
omnivore, vegetarian, vegan, or fruitarian).
o Ordinal variable: The same as a nominal variable but the categories have a logical order
(e.g., whether people got a fail, a pass, a merit or a distinction in their exam).
Continuous (entities get a distinct score):
o Interval variable: Equal intervals on the variable represent equal differences in the
property being measured (e.g., the difference between 6 and 8 is equivalent to the
difference between 13 and 15).
o Ratio variable: The same as an interval variable, but the ratios of scores on the scale
must also make sense (e.g., a score of 16 on an anxiety scale means that the person is, in
reality, twice as anxious as someone scoring 8).
Continuous variables can be truly continuous or discrete. This is quite a tricky distinction. A truly
continuous variable can be measured to any level of precision, whereas a discrete variable can take
on only certain values on the scale. For example, on a 5-point rating scale the range is 1 to 5, but
you cannot give a score of 4.32. Age is a truly continuous variable, because you could be 34 years,
7 months, 21 days, 10 hours, etc.
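The difference between interval and ratio measurement can be checked with a little arithmetic. The
sketch below uses temperature as an illustration (this example is not from the chapter): Celsius has
equal intervals but an arbitrary zero, so ratios of Celsius scores are not meaningful, whereas kelvin
has a true zero. The function names are made up for this sketch.

```python
def celsius_to_fahrenheit(c: float) -> float:
    """Convert a Celsius temperature to Fahrenheit."""
    return c * 9 / 5 + 32


def celsius_to_kelvin(c: float) -> float:
    """Convert a Celsius temperature to kelvin (a scale with a true zero)."""
    return c + 273.15


cold_c, warm_c = 10.0, 20.0

# Interval property: equal steps stay equal after a change of units.
# 10 -> 20 and 20 -> 30 are equal steps in Celsius and also in Fahrenheit.
step_1_f = celsius_to_fahrenheit(20.0) - celsius_to_fahrenheit(10.0)
step_2_f = celsius_to_fahrenheit(30.0) - celsius_to_fahrenheit(20.0)

# Ratio property: "twice as hot" only makes sense if zero means "none".
ratio_c = warm_c / cold_c                                                  # 2.0
ratio_f = celsius_to_fahrenheit(warm_c) / celsius_to_fahrenheit(cold_c)    # not 2.0
ratio_k = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cold_c)            # the physically meaningful ratio

print(step_1_f, step_2_f)
print(ratio_c, ratio_f, round(ratio_k, 3))
```

The two steps agree in both units, but the apparent ratio of 2 in Celsius changes as soon as the
units change, which is why Celsius is treated as an interval scale rather than a ratio scale.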
1.5.2 Measurement error
Measurement error is the discrepancy between the numbers we use to represent the thing we are
measuring and the actual value of the thing we are measuring. Self-report measures tend to produce
larger measurement error because factors other than the one we are trying to measure will
influence how people respond to our measures.
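As a small worked example of this definition, the sketch below computes the error of a few readings
from a bathroom scale. All numbers and names are hypothetical, and in real research the true value is
usually unknown, so this only illustrates the idea.

```python
def measurement_error(observed: float, true_value: float) -> float:
    """Signed discrepancy between the recorded number and the actual value."""
    return observed - true_value


# Hypothetical values: a person who truly weighs 80.0 kg steps on a scale three times.
true_weight_kg = 80.0
scale_readings_kg = [83.0, 79.5, 80.5]

errors = [measurement_error(reading, true_weight_kg) for reading in scale_readings_kg]
mean_absolute_error = sum(abs(e) for e in errors) / len(errors)

print(errors)               # signed error of each reading
print(mean_absolute_error)  # typical size of the error, ignoring direction
```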
1.5.3 Validity and reliability
Reliability means that you get the same values if the measurements are repeated. Validity means
that you measure what you want to measure. There are different types of validity:
Criterion validity: whether you can establish that an instrument measures what it claims to
measure through comparisons to objective criteria.
Concurrent validity: when data are recorded simultaneously using the new instrument and
existing criteria.
Predictive validity: when data from the new instrument are used to predict observations at a
later point in time.
Content validity: the degree to which individual items represent the construct being measured
and cover the full range of the construct.
To be valid the instrument must first be reliable. The easiest way to test reliability is the test-retest
method. This is when you test the same group of people twice: a reliable instrument will produce
similar scores at both points in time.
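One common way to put a number on "similar scores at both points in time" is to correlate the two
sets of scores. The sketch below assumes Pearson's correlation is an acceptable index (the text above
does not prescribe a statistic) and uses made-up scores for five people. Note that a correlation
would not reveal a constant shift in everyone's scores.

```python
from math import sqrt


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of the same length with at least 2 scores")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations and sums of squared deviations.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


# Hypothetical scores of the same five people on the same instrument, tested twice.
time_1 = [12, 15, 9, 20, 17]
time_2 = [13, 14, 10, 19, 18]

r = pearson_r(time_1, time_2)
print(round(r, 2))  # values close to 1 suggest good test-retest reliability
```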
1.5.4 Correlational research methods
In correlational or cross-sectional research we observe what naturally goes on in the world without
directly interfering with it, whereas in experimental research we manipulate one variable to see its
effect on another. In longitudinal research you measure variables repeatedly at different points in
time. Because the researcher does not interfere, correlational research is not biased by the
researcher’s presence, which is an important aspect of ecological validity.
1.5.5 Experimental research methods
Most scientific questions imply a causal link between variables. David Hume said that to infer cause
and effect:
1. Cause and effect must occur close together in time (contiguity).
2. The cause must occur before the effect.
3. The effect should never occur without the presence of the cause.
A problem with correlational evidence is the tertium quid (a third person or thing, that is, an
unmeasured external factor that causes the correlation). Such extraneous factors are called
confounding variables. Experimental methods
strive to provide a comparison of situations in which the proposed cause is present or absent.
As a simple case, we might want to look at the effect of motivators on learning about statistics. We
randomly split students into three different groups:
1. Positive reinforcement.
2. Punishment.
3. No motivator (not praised or punished).
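Random allocation to the three groups above could be sketched as follows. The student labels, the
function name and the fixed seed are assumptions made for this illustration; the seed is only there
to make the example repeatable.

```python
import random


def assign_to_groups(participants: list[str], conditions: list[str], seed=None) -> dict[str, list[str]]:
    """Shuffle the participants, then deal them out to the conditions in turn."""
    rng = random.Random(seed)
    shuffled = list(participants)  # copy so the caller's list is not modified
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for index, person in enumerate(shuffled):
        groups[conditions[index % len(conditions)]].append(person)
    return groups


# Hypothetical class of 12 students.
students = [f"student_{i:02d}" for i in range(1, 13)]
conditions = ["positive reinforcement", "punishment", "no motivator"]

groups = assign_to_groups(students, conditions, seed=42)
for condition, members in groups.items():
    print(condition, members)
```

Dealing the shuffled list out in turn keeps the groups the same size, and the shuffle means that
chance alone decides who ends up in which condition.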
Two methods of data collection
We can choose between two methods:
1. Manipulate the independent variable using different entities. Different groups of entities take
part in each experimental condition (a between-groups, between-subjects or independent
design).
2. Manipulate the independent variable using the same entities. This means that we give a group of
students positive reinforcement for a few weeks and test their statistical abilities, then give
this same group punishment for a few weeks before testing them again, and finally give
them no motivator and test them for a third time (a within-subject or repeated-measures
design).
Two types of variation
There are two types of variation:
1. Unsystematic variation. This variation results from random factors that exist between the
experimental conditions (such as natural differences in ability, time of day, IQ, etc.). It is a
difference in performance created by unknown factors.
2. Systematic variation. This variation is due to the experimenter doing something in one condition
but not in the other condition. For example, when training chimpanzees to run the economy, the
experimenter might give bananas as a treat for good economic growth in one condition but not in
the other. Differences in performance created by a specific experimental manipulation are known
as systematic variation.
In a repeated-measures design, differences between two conditions can be caused by two things:
1. The manipulation that was carried out on the participants.
2. Any other factor that might affect the way in which an entity performs from one time to the next.
In an independent design, differences between the two conditions can also be caused by two things:
1. The manipulation that was carried out on the participants.
2. Differences between the characteristics of the entities allocated to each of the groups.
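The two lists above can be illustrated with a toy calculation. The sketch assumes six hypothetical
participants with made-up baseline abilities and a manipulation that adds exactly 5 points. For
simplicity it sets time-related factors (the second cause in a repeated-measures design) to zero,
which would not hold in real data.

```python
from statistics import mean

EFFECT = 5.0  # hypothetical systematic effect of the manipulation

# Hypothetical baseline ability of six participants (a source of unsystematic variation).
baseline = {"A": 40.0, "B": 55.0, "C": 62.0, "D": 48.0, "E": 70.0, "F": 51.0}

# Repeated-measures design: every participant is measured in both conditions,
# so each person's own baseline cancels out of their difference score.
treated = {person: score + EFFECT for person, score in baseline.items()}
within_differences = [treated[person] - baseline[person] for person in baseline]

# Independent design: different participants in each condition, so the difference
# between group means also contains the difference between the people in the groups.
control_group = ["A", "B", "C"]
treated_group = ["D", "E", "F"]
control_mean = mean(baseline[p] for p in control_group)
treated_mean = mean(baseline[p] + EFFECT for p in treated_group)
between_difference = treated_mean - control_mean
group_imbalance = mean(baseline[p] for p in treated_group) - control_mean

print(within_differences)                                        # every difference equals EFFECT
print(round(between_difference, 2), round(group_imbalance, 2))   # EFFECT plus the imbalance
```

In this toy data the independent design shows a difference larger than the true effect, because the
people in the treated group happened to have higher baseline scores.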