Summary Statistics I: Description and Inference Notes on Readings - GRADE 8,5
Course
Statistics 1 (6441HST1)
Institution
Universiteit Leiden (UL)
Book
Discovering Statistics Using IBM SPSS
Written for
Universiteit Leiden (UL)
International Relations And Organizations
Statistics 1 (6441HST1)
Seller
giacomoef
Summary of the material for the final exam (2021) for Statistics I: Description and Inference.
INCLUDES notes from (Total: 50 pages):
● Andy Field’s book “Discovering Statistics Using IBM SPSS Statistics”, chapters 1, 2, 3
(sections 3.1-3.6 and 3.7.1-3.7.2), 4 (sections 4.1-4.12), 5, 6 (sections 6.6 and 6.10), 8
(sections 8.1-8.4 and 8.7-8.11), 10 (sections 10.1-10.8.4, 10.8.6-10.9.5 and 10.14), 12
(sections 12.1-12.3, 12.6.1, 12.7-12.7.1 and 12.11) and 19 (sections 19.1-19.3.6, 19.5-19.8.2
and 19.8.4-19.8.5).
● Charles H. Franklin’s paper “The ‘Margin of Error’ for Differences in Polls”.
Statistics I: Description and Inference Notes on Readings
Table of Contents
“Discovering Statistics Using IBM SPSS Statistics”
1. Why is my evil lecturer forcing me to learn statistics?
2. The SPINE of Statistics
3. The Phoenix of Statistics
4. The IBM SPSS Statistics Environment
5. Exploring Data with Graphs
6. The Beast of Bias
8. Correlation
10. Comparing Two Means
12. GLM 1: Comparing Several Independent Means
19. Categorical Outcomes: Chi-square and Loglinear Analysis
“The ‘Margin of Error’ for Differences in Polls”
“Discovering Statistics Using IBM SPSS Statistics”
1. Why is my evil lecturer forcing me to learn statistics?
The Research Process
1. Initial Observation: Define one or more variables to measure. Collect data to see whether this
observation is true (and not a biased observation).
2. Relevant Theories: Analyse and use to explain the data.
➔ Theory: An explanation or set of principles that are well substantiated by repeated
testing and explain a broad phenomenon. It explains a wide set of phenomena with a
small set of well-established principles, but it cannot be observed directly.
➔ Scientific Statements: Statements that can be verified with reference to empirical
evidence (i.e. confirmed or disconfirmed, provided one can quantify and measure the
variables concerned). A good theory should produce such statements. HOWEVER, not
all statements can be tested using science.
➔ Non-Scientific Statements: Statements that cannot be empirically tested (i.e. proved
or disproved). HOWEVER, non-scientific statements can sometimes be altered to
become scientific statements, by making them testable.
3. Hypothesis: A proposed explanation for a fairly narrow phenomenon or set of observations. It
is an informed, theory-driven attempt to explain what has been observed. Typically seeks to
explain a narrower phenomenon and is, as yet, untested; cannot be observed directly.
4. Predictions: Emerge from and operationalize the hypothesis in a way that enables the
collection and analysis of data (i.e. move from the conceptual domain into the observable
domain). The prediction, in turn, tells us something about the hypothesis from which it was
derived.
5. Data: Collected to test the predictions, and thereby to support or falsify the hypothesis from
which they were derived (i.e. initial observations are verified by data; then, using a theory,
specific hypotheses are generated and operationalized as predictions that can be tested
against data).
➔ Falsification: The act of disproving a hypothesis or theory (i.e. data that contradicts
the hypothesis).
6. General Statement: A concluding statement about the theory, initial observation and research
finding.
Levels of Measurement
Data Measurement: Used to test the hypotheses using:
● Variables: Things that can change (or vary) between people, locations or time. The key to
testing scientific statements and expressing hypotheses is to measure the:
1. Independent (Predictor) Variable: A variable thought to be the cause of some effect; a
variable that the experimenter has manipulated (i.e. a proposed cause).
2. Dependent (Outcome) Variable: A variable thought to be affected by changes in an
independent variable (i.e. a proposed outcome).
These variables are very closely tied to “experimental methods”, where the cause is manipulated by
the experimenter. HOWEVER, researchers can’t always manipulate variables. Instead, they
sometimes use “correlational methods”, where all variables are essentially dependent variables (use
the terms predictor variable and outcome variable instead).
Level of Measurement: The relationship between what is being measured and the numbers that
represent what is being measured. Variables can be categorical or continuous and can have different
levels of measurement.
➔ Categorical Variable: Made up of categories that name distinct entities (e.g. human).
Numbers can be used to denote categories.
◆ Binary Variable: There are only two categories (e.g. male or female).
◆ Nominal Variable: There are more than two categories (e.g. vegetarian, pescetarian,
carnivore, etc.). The only way that nominal data can be used is to consider
frequencies.
◆ Ordinal Variable: The same as a nominal variable but the categories have a logical
order (e.g., whether people got a fail, a pass, a merit or a distinction in their exam).
Ordinal data tell us that things have occurred and the order in which they occurred.
➔ Continuous Variable: A variable that gives entities a distinct score and can take on any value
on the measurement scale that we are using.
◆ Interval Variable: Equal intervals on the variable represent equal differences in the
property being measured.
◆ Ratio Variable: The same as an interval variable, but the ratios of scores on the scale
must also make sense. For this to be true, the scale must have a meaningful zero
point.
◆ Truly Continuous Variable: A variable that can be measured to any level of precision
(e.g. age). In practice, the distinction between continuous and discrete variables can
become blurred.
◆ Discrete Variable: A variable that can take on only certain values (usually whole
numbers) on the scale (e.g. a 5-point rating).
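As a rough illustration of the levels above, the following Python sketch shows which summaries are meaningful at each level. All data values and variable names are hypothetical, invented for this example; they do not come from the readings.

```python
from collections import Counter
from statistics import median_low

# Hypothetical data, invented for illustration only.
diet = ["vegetarian", "carnivore", "pescetarian", "carnivore"]  # nominal
grade_scale = ["fail", "pass", "merit", "distinction"]          # ordinal scale
grades = ["pass", "fail", "distinction", "merit", "pass"]       # ordinal data
temps_celsius = [10.0, 20.0]                                    # interval
weights_kg = [60.0, 90.0]                                       # ratio

# Nominal: only frequencies are meaningful.
diet_counts = Counter(diet)

# Ordinal: categories can be ranked, so a median category makes sense,
# but the distances between categories are unknown.
ranks = [grade_scale.index(g) for g in grades]
median_grade = grade_scale[median_low(ranks)]

# Interval: differences are meaningful, ratios are not. 20 degrees Celsius
# is not "twice as hot" as 10, because zero Celsius is an arbitrary point.
temp_difference = temps_celsius[1] - temps_celsius[0]

# Ratio: a meaningful zero point makes ratios interpretable.
weight_ratio = weights_kg[1] / weights_kg[0]

print(diet_counts.most_common(1), median_grade, temp_difference, weight_ratio)
```

The point of the sketch is the restriction at each level: counting works everywhere, ranking needs at least ordinal data, subtraction needs at least interval data, and division needs ratio data.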
Measurement Error
Variables should be measured accurately and calibrated such that values have the same meaning over
time and across situations (e.g. weight). Variables can be measured directly (e.g. profit, weight,
height) or indirectly (e.g. self-report, questionnaires).
➔ Measurement Error: A discrepancy between the numbers used to represent the thing being
measured and the actual value found if it could be measured directly.
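The definition above can be written as a simple difference. This is a minimal sketch; the function name and the weights are hypothetical values chosen for illustration.

```python
def measurement_error(measured: float, actual: float) -> float:
    """Discrepancy between the recorded number and the true value."""
    return measured - actual

# Hypothetical example: a bathroom scale reads 80 kg for someone who
# actually weighs 83 kg, so the scale under-reports by 3 kg.
error_kg = measurement_error(measured=80.0, actual=83.0)
print(error_kg)
```

A negative result means the instrument under-reports and a positive result means it over-reports. For indirect measures such as questionnaires, the actual value is usually unknown, so the error cannot be computed this directly.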
Validity and Reliability
To ensure that measurement error is kept to a minimum, the properties of the measure need to be
determined:
● Validity: Whether an instrument measures what it sets out to measure. Validity is a necessary
but not sufficient condition of a measure.
➔ Criterion Validity: Whether you can establish that an instrument measures what it
claims to measure through comparison to objective criteria. In an ideal world, you
assess this by relating scores on your measure to real-world observations. Assessing
criterion validity (e.g. concurrently or predictively) is often impractical because
objective criteria that can be measured easily may not exist. Also, with measuring
attitudes, you might be interested in the person’s perception of reality and not reality
itself.
◆ Concurrent Validity: Assessed when data are recorded simultaneously using
the new instrument and existing criteria.
◆ Predictive Validity: Assessed when data from the new instrument are used to
predict observations at a later point in time.
➔ Content Validity: For self-report measures/questionnaires, the degree to which
individual items represent the construct being measured and cover the full range of
the construct.
● Reliability: The ability of the measure to produce the same results under the same conditions.
To be valid the instrument must first be reliable.
➔ Test-Retest Reliability: The easiest way to assess reliability is to test the same group
of people twice. A reliable instrument produces similar scores at both points in time.
➔ Statistical Methods: Used to determine reliability when measuring something that
does vary over time (e.g. blood-sugar levels).
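One common way to quantify test-retest reliability is to correlate the scores from the two sessions. The sketch below assumes this approach; the scores are hypothetical, and the helper `pearson_r` is written out here for illustration rather than taken from the readings.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical questionnaire scores for the same six people,
# measured on two occasions.
time1 = [12, 15, 9, 20, 17, 11]
time2 = [13, 14, 10, 19, 18, 10]

r = pearson_r(time1, time2)
print(round(r, 2))
```

A correlation close to 1 suggests the instrument ranks people consistently across the two sessions. It says nothing about validity: an instrument can be highly reliable while measuring the wrong thing.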
Research Design
There are two ways to test a hypothesis:
1. Correlational or Cross-Sectional Research: Observe natural events by taking a snapshot of
many variables at a single point in time, or by measuring variables repeatedly at different time
points (longitudinal research). It provides a very natural view of the question, because the
researcher does not influence what happens. HOWEVER, correlational research tells us
nothing about the causal influence of variables.
2. Experimental Research: Manipulating one variable and observing the effect it has on
another variable. This allows statements about a causal link between variables (the
dependent variable depends on the independent variable). Most research questions can be
broken down into a proposed cause and a proposed outcome. The key to answering the
research question is to uncover how these variables relate to each other.
➔ Cause: In David Hume’s definition, the cause needs to precede the effect, and
causality is equated with high degrees of correlation between contiguous events.
Hume’s own view was more complicated than this definition alone.
In correlational research, variables are often measured simultaneously. This creates several problems:
● Contiguity: It provides no information about the contiguity between different variables (i.e.
which one came first). Longitudinal research addresses this issue to some extent.
● Tertium Quid: Correlational evidence doesn’t distinguish between what you might call an
‘accidental’ conjunction and a causal one (i.e. where an unknown third person or thing of
indeterminate character plays an influential role).
● Confounding Variables: The extraneous factors behind the tertium quid problem, which may
influence the variables being studied without having been measured.
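The third-variable problem can be illustrated with a small simulation. This is a sketch under stated assumptions: the variables `x`, `y` and `z`, the noise levels, the sample size and the seed are all invented for the example, and `pearson_r` is a helper written here for illustration.

```python
import random
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

random.seed(1)  # fixed seed so the run is repeatable
n = 2000

# z is the unmeasured third variable; x and y each depend on z
# but have no causal influence on each other.
z = [random.gauss(0, 1) for _ in range(n)]
x = [zi + random.gauss(0, 0.5) for zi in z]
y = [zi + random.gauss(0, 0.5) for zi in z]

# x and y are strongly correlated even though neither causes the other.
r_xy = pearson_r(x, y)

# Removing z's contribution leaves only independent noise,
# so the correlation largely disappears.
x_resid = [xi - zi for xi, zi in zip(x, z)]
y_resid = [yi - zi for yi, zi in zip(y, z)]
r_resid = pearson_r(x_resid, y_resid)

print(round(r_xy, 2), round(r_resid, 2))
```

In real correlational research the confounding variable is often unmeasured, so this kind of adjustment is not available. That is why experimental manipulation is needed to support causal claims.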