Discovering Statistics Using IBM SPSS Statistics
By Andy Field
5th Edition
9781526445780
Exhaustive summary of the book in English.
A list of common symbols in statistics is added at the end of this summary. The summary covers the
following chapters:
• Chapter 1: Why is my evil lecturer forcing me to learn statistics?
• Chapter 2: The SPINE of statistics
• Chapter 3: The phoenix of statistics
• Chapter 4: The IBM SPSS Statistics environment
• Chapter 5: Exploring data with graphs
• Chapter 6: The beast of bias
• Chapter 7: Non-parametric models
• Chapter 8: Correlation
• Chapter 9: The Linear Model (Regression)
• Chapter 10: Comparing two means
• Chapter 11: Moderation, mediation and multicategory predictors
• Chapter 12: GLM 1: Comparing several independent means
• Chapter 13: GLM 2: Comparing means adjusted for other predictors (analysis of covariance)
• Chapter 14: GLM 3: Factorial designs
• Chapter 15: GLM 4: Repeated-measures designs
• Chapter 16: GLM 5: Mixed designs
• Chapter 17: Multivariate analysis of variance (MANOVA)
• Chapter 18: Exploratory factor analysis
• Chapter 19: Categorical outcomes: chi-square and loglinear analysis
• Chapter 20: Categorical outcomes: logistic regression
• Chapter 21: Multilevel linear models
Chapter 1: Why is my evil lecturer forcing me to learn statistics?
What the hell am I doing here? I don’t belong here
In this chapter, the importance of statistics is discussed, as well as its fundamental concepts. Data is
required to answer various questions. Therefore, a teacher emphasizes the importance of working
with numbers to students, because these numbers are a form of data and are a part of the research
process.
In addition to numbers, other forms of data exist. Studies based on numbers use a quantitative
method, while studies based mainly on language use a qualitative method. The qualitative and
quantitative methods are complementary, meaning that each can be used to enhance the other.
The research process
The research process consists of a number of steps. The first step is observation; something is
observed that evokes curiosity. Consequently, a researcher has a question that he or she would like
to have answered. To determine if the observation is correct, data must be collected. A researcher
needs variables to collect this data. A variable is something that is measured to answer the question
of the researcher.
The research process is as follows: formulate a research question -> test a theory -> write a
hypothesis -> make predictions -> collect data to test the predictions -> analyze these data.
Initial observation: finding something that needs explaining
One can find something that needs to be explained in many different ways. For example, when
watching the news on television, a research question may arise about something that is going on in
the world. To formulate an answer to this question, data must then be collected. To collect these
data, the variables to be measured must first be chosen and defined.
Generating and testing theories and hypotheses
After formulating a research question, the next step is to test a theory and to write a hypothesis. A
hypothesis is an explanation for a certain phenomenon or a set of observations. A hypothesis is
formed by trying to explain data, and data can be explained using a theory. Based on this theory, a
prediction can be made. This prediction, based on a theory, is called a hypothesis. A statement only
counts as a hypothesis when it can be supported or rejected using scientific methods. If the
collected data contradict the theory or the hypothesis, a falsification occurs.
What is the difference between a dependent and an independent variable?
When collecting data, it is important to ask two things: (1) what is measured and (2) how is it
measured? To test hypotheses, the variables must be measured. Variables are things that can vary
between people, between situations or over time. Most hypotheses involve two variables: the cause
and the outcome.
The variable that is seen as the cause of a certain effect is called the independent variable or
the predictor. In an experimental set-up, the term independent variable is used to emphasize that
the researcher has manipulated this variable. The variable that changes due to changes in the
independent variable is called the dependent variable or outcome variable.
What is meant by a measurement level?
Variables can be measured in various ways. The relationship between what is measured and the
numbers that express what you are measuring is called the level of measurement.
Variables can be categorical or continuous and can have different measurement levels.
A categorical variable consists of different categories. An example of a categorical variable is the
division between men and women. In this case the variable has only two categories; a man or a
woman. You can't be both. A variable with two categories is called a binary variable.
If a variable consists of more than two categories that have no inherent order, it is called a
nominal variable. An example of a nominal variable is religion (Judaism, Christianity, Islam, etc.).
Although these categories can also be represented with numbers, it is not meaningful to perform
arithmetic on these numbers. With a nominal variable, the numbers do not indicate a ranking.
An example of a nominal variable that is represented by numbers is the shirt number of a player in
a team sport. A higher shirt number does not mean that someone is a better player. Nominal data
can only be used to look at frequencies, for example how often a certain player scores, or how many
people have a certain belief.
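As a minimal sketch of why frequencies are the meaningful summary for nominal data, the Python
snippet below counts how often each category occurs. The numeric codes and the eight responses are
hypothetical, invented purely for illustration.

```python
from collections import Counter

# Hypothetical numeric codes for a nominal variable (religion).
# The numbers are labels only; they carry no order or magnitude.
CODES = {1: "Judaism", 2: "Christianity", 3: "Islam"}

# Hypothetical responses from eight participants (assumed data).
responses = [2, 3, 2, 1, 3, 2, 2, 1]

# Frequencies are the meaningful summary for nominal data.
frequencies = Counter(CODES[code] for code in responses)
most_common_label, most_common_count = frequencies.most_common(1)[0]

# The mean of the codes can be computed, but it has no interpretation:
# relabelling the categories would change it without changing the data.
mean_of_codes = sum(responses) / len(responses)

print(frequencies)
print(most_common_label, most_common_count)
```

The mean of the codes is computed only to show that it depends on an arbitrary labelling, which is
why it says nothing about the respondents.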
With an ordinal variable, there are also different categories, but these categories have a certain
ranking. Ordinal data therefore indicate a specific order. It is, however, not specified how large
the difference is between the categories. A top three in a competition indicates who performed
better than whom. The results have a sequence, but they do not say how much better the
winner was than numbers two and three.
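The point that ranks keep the order but discard the size of the differences can be sketched in
Python as follows. The names and finishing times are hypothetical and serve only as an example.

```python
# Hypothetical finishing times (in seconds) for the top three in a race.
finish_times = {"Ana": 61.2, "Ben": 61.3, "Cleo": 75.0}

# Ordinal information: the rank order of the finishers (fastest first).
ordered = sorted(finish_times, key=finish_times.get)
ranks = {name: position for position, name in enumerate(ordered, start=1)}

# Neighbouring ranks always differ by 1, whatever the underlying time gaps were.
gap_first_second = finish_times[ordered[1]] - finish_times[ordered[0]]
gap_second_third = finish_times[ordered[2]] - finish_times[ordered[1]]

print(ranks)
print(round(gap_first_second, 1), round(gap_second_third, 1))
```

Both pairs of neighbours are one rank apart, yet the time gaps differ greatly. That lost
information is what separates the ordinal level from the interval level.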
At the next measurement level the variable is no longer categorical but continuous. A continuous
variable is a score that can take on any value on the measurement scale. The interval variable is a
form of continuous variable. With an interval variable, equal differences between numbers on the
scale represent equal differences in what is measured. An example of this is a scale where you
indicate how nice you find someone on a five-point scale. The difference between 1 and 2 is the
same as the difference between 4 and 5. This is the measurement level most often required for
statistical tests.
The next measurement level is the ratio variable. The ratio variable meets the same conditions as
the interval variable, but it also has an absolute and meaningful zero point. This means that
ratios between values on the scale are meaningful. An example of this is reaction time: a
millisecond always lasts the same length, so the differences between the milliseconds are the same,
but you can also say that 200 milliseconds is twice as long as 100 milliseconds.
A variable measured at the interval or ratio level is not always truly continuous; it can also be
discrete. A truly continuous variable can take on all possible values, but a discrete variable can
only take certain values (usually whole numbers). If you indicate how nice you find someone on a
five-point scale, the underlying scale is a continuum, on which 2.98 would be a meaningful value,
but you can only choose the numbers 1, 2, 3, 4 and 5. You cannot actually enter 2.98.
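The difference between the interval and ratio levels can be sketched with a few lines of Python.
The reaction times come from the example above. The two ratings and the relabelling of the
five-point scale from 1..5 to 0..4 are assumptions made for illustration.

```python
import math

# Reaction times in milliseconds (ratio scale: 0 ms means no elapsed time).
rt_fast_ms, rt_slow_ms = 100, 200

# Ratios are meaningful and survive a change of unit (ms -> s).
ratio_ms = rt_slow_ms / rt_fast_ms
ratio_s = (rt_slow_ms / 1000) / (rt_fast_ms / 1000)

# Hypothetical ratings on a five-point scale (treated as interval; no true zero).
rating_a, rating_b = 2, 4

# Relabelling the scale from 1..5 to 0..4 keeps differences intact...
diff_original = rating_b - rating_a
diff_shifted = (rating_b - 1) - (rating_a - 1)

# ...but changes the ratio, so "twice as nice" is not a valid claim.
ratio_original = rating_b / rating_a
ratio_shifted = (rating_b - 1) / (rating_a - 1)

print(ratio_ms, math.isclose(ratio_ms, ratio_s))
print(diff_original == diff_shifted, ratio_original == ratio_shifted)
```

Because the zero of the rating scale is arbitrary, only the differences stay the same after
relabelling. The zero of reaction time is not arbitrary, so its ratios are stable.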
What is a measurement error?
Researchers prefer a measurement that is the same over time and in different situations. He or she