Summary Statistics by Andy Field 5th edition 2018

This is a summary of the fifth edition of the book 'Discovering Statistics Using IBM SPSS' by Andy Field (2018). I wrote this summary for the course statistics in the premaster business administration at the Radboud University. The summary contains all chapters that are part of the course load for ...

  • December 7, 2018
  • 38
  • 2018/2019
Discovering Statistics Using IBM SPSS Statistics 5th edition
Andy Field 2018

ISBN: 9781526419521



Chapter 1. Why is my evil lecturer forcing me to learn statistics?

1.2 What the hell am I doing here?
To answer interesting questions you need data. Numbers are a form of data and vital for the
research process. When numbers are involved, it is quantitative data. When words are involved, it
is qualitative data.

1.3 The research process
Initial observation (literature study, research problem, research question) → generate theory →
generate hypothesis → collect data to test theory → analyze data.




1.4 Initial observation: finding something that needs explaining
Come up with a question that needs an answer. Collect some data, define variables, measure the
variables, and check: do the data support the observation?

1.5 Generating and testing theories and hypotheses
Explain your data:
1) Look for relevant theories (an explanation or a set of principles that is well substantiated by
repeated testing and explains a broad phenomenon)
2) Generate a hypothesis (a proposed explanation for a narrow phenomenon or a set of
observations, an informed, theory-driven attempt to explain)
3) Operationalize your hypothesis by using predictions. Predictions make something observable.
4) Collect data
5) Data analysis
6) Compare results with hypothesis

Falsification: the act of disproving a hypothesis or theory.

1.6 Collecting data: measurement
1.6.1 Independent and dependent variables
A variable that we think is a cause is known as an independent variable (predictor variable). A
variable that we think is an effect is called a dependent variable (outcome variable). A variable is
something that can change, for example between people, between locations or over time.

1.6.2 Levels of measurement
Level of measurement: the relationship between what is being measured and the numbers that
represent what is being measured.

Categorical variables = made up of categories

Binary variable: an entity can be placed into only one of two categories (alive or dead, pregnant
or not pregnant)

Nominal variable: an entity can be placed into one of more than two categories. The only way that
nominal data can be used is to consider frequencies.

Ordinal variable: the categories are ordered. The order does not tell us anything about the
differences between values.

Continuous variables = variables that give a score for each person and can take on any value on the
measurement scale that is used.

Interval variable: equal intervals on the scale represent equal differences in the property being
measured.

Ratio variable: meets the requirements of an interval variable, but also the ratios of values along the
scale should be meaningful. Must have a true zero point.

Continuous variables can be truly continuous but also discrete. A truly continuous variable (usually
infinite) can be measured to any level of precision, whereas a discrete variable (usually finite) can
take on only certain values on the scale (e.g. whole numbers on a scale from 1 to 5). An example of a
truly continuous scale is age, which can be measured at an infinite level of precision.

1.6.3 Measurement error
We want our measure to be calibrated such that values have the same meaning over time and across
situations.

Measurement error: a discrepancy between the numbers we use to represent the thing we are
measuring and the actual value of the thing we are measuring.

1.6.4 Validity and reliability
Validity: whether an instrument actually measures what it sets out to measure.

Criterion validity: whether you can establish that an instrument measures what it claims to
measure through comparison to objective criteria (e.g. observations). Assessing criterion validity is
often impractical because objective criteria that can be measured easily may not exist.

Concurrent validity: when data are recorded simultaneously using the new instrument and
existing criteria, this assesses concurrent validity.

Predictive validity: when data from the new instrument are used to predict observations at a later
point in time, this is said to assess predictive validity.

Content validity: the degree to which individual items represent the construct being measured
and cover the full range of the construct.

Ecological validity: we are not influencing what happens, and the measures should not be biased
by the researcher being there.

Reliability: the ability of the measure to produce the same results under the same conditions.
Whether an instrument can be interpreted consistently across different situations. To be valid the
instrument must first be reliable.

Test re-test reliability = test same group twice.

1.7 Collecting data: research design

If we simplify things quite a lot then there are two ways to test a hypothesis: either by observing
what naturally happens, or by manipulating some aspect of the environment and observing the
effect it has on the variable.

In correlational or cross-sectional research we observe what naturally goes on in the world without
directly interfering with it, whereas in experimental research we manipulate one variable to see its
effect on another.

1.7.1 Correlational research methods

We observe natural events. We can do this at one point in time (cross-sectional) or at multiple points
in time (longitudinal).

1.7.2 Experimental research methods

Independent variable → dependent variable
Proposed cause → proposed outcome
Variables measured simultaneously (no information on contiguity)

Limitation = tertium quid = a third person or thing of indeterminate character

Confounding variables = external factors that cause both the proposed cause and the proposed outcome

1.7.3 Two methods of data collection

Experiments can be run in two ways:
- test different entities (a between-groups, between-subjects or independent design)
- manipulate the independent variable using the same entities (a within-subject or repeated-measures
design)

1.7.4 Two types of variation

Unsystematic = small differences in performance created by unknown factors
Systematic = differences created by the experimental manipulation.

Differences in a repeated-measures design
- the manipulation carried out on participants
- any other factor that might affect the way in which entities perform from time to time

Differences in an independent design
- the manipulation carried out on participants
- differences in characteristics of the entities allocated to each of the groups

1.7.5 Randomization

Keeping the unsystematic variation as small as possible gives a more sensitive measure of the
experimental manipulation.

Systematic variation in repeated measures
- Practice effects = performing differently the second time due to being familiar with the experimental situation
- Boredom effects = performing differently the second time because of boredom and tiredness after the first
time.

We can ensure that these effects produce no systematic variation between our conditions by
counterbalancing the order in which a person participates in a condition.
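Counterbalancing can be illustrated with a minimal sketch. The function name `counterbalanced_orders`, the condition labels and the participant labels are made up for illustration. The sketch cycles through every possible order of the conditions so that each order is used about equally often; it is one simple way to counterbalance, not necessarily the scheme used in the book.

```python
from itertools import permutations


def counterbalanced_orders(conditions, participants):
    """Give each participant an order of conditions, cycling through all possible orders."""
    orders = list(permutations(conditions))
    return {
        participant: orders[index % len(orders)]
        for index, participant in enumerate(participants)
    }


# Hypothetical experiment with two conditions and four participants.
schedule = counterbalanced_orders(["A", "B"], ["p1", "p2", "p3", "p4"])
for participant, order in schedule.items():
    print(participant, order)
```

With two conditions, half of the participants start with A and half with B, so practice and boredom effects are spread evenly over both conditions.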

1.8 Analysing data
1.8.1 Frequency distributions
Frequency distribution / histogram: a graph of how many times each score occurs. Frequency
distributions can be very useful for assessing properties of the distribution of scores and can be used
regardless of measurement level.
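As a small illustration, a frequency distribution can be tallied and drawn as a text histogram. The scores below are invented for the example.

```python
from collections import Counter

# Made-up scores on a 1 to 5 scale.
scores = [3, 5, 4, 4, 2, 5, 4, 3, 4, 1]

# Count how many times each score occurs.
frequencies = Counter(scores)

# Print one bar per score; the bar length is the frequency.
for score in sorted(frequencies):
    print(f"{score}: {'#' * frequencies[score]}")
```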

Normal distribution: a bell-shaped curve, which implies that the majority of the scores lie around
the center of the distribution. Further away from the center the bars get smaller, implying that as
scores start to deviate from the center their frequency decreases.

There are two main ways in which a distribution can deviate from normal:

• Lack of symmetry (skew). Skewed distributions are not symmetrical and instead the most frequent
scores are clustered at one end of the scale. A skewed distribution can be either positively skewed
(the frequent scores are clustered at the lower end) or negatively skewed (the frequent scores are
clustered at the higher end).
• Pointyness (kurtosis). Refers to the degree to which scores cluster at the ends of the distribution
(tails), and this tends to express itself in how pointy a distribution is. A distribution with positive
kurtosis has many scores in the tails (heavy-tailed distribution) and is pointy. This is known as a
leptokurtic distribution. In contrast, a distribution with negative kurtosis is relatively thin in the
tails (light tails) and tends to be flatter than normal. This is called platykurtic.

In a normal distribution the values of skew and kurtosis are 0.
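The sign of skew and kurtosis can be checked with a rough sketch based on central moments. This uses the simple population formulas (third and fourth standardized moments, with 3 subtracted from kurtosis so that a normal distribution scores 0). Statistical packages such as SPSS apply sample corrections, so their values will differ somewhat. The data sets are invented.

```python
def central_moment(scores, k):
    """Average of the k-th power of deviations from the mean."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** k for x in scores) / len(scores)


def skewness(scores):
    """Positive: frequent scores at the lower end. Negative: at the higher end."""
    return central_moment(scores, 3) / central_moment(scores, 2) ** 1.5


def excess_kurtosis(scores):
    """Positive: heavy tails (leptokurtic). Negative: light tails (platykurtic)."""
    return central_moment(scores, 4) / central_moment(scores, 2) ** 2 - 3


symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]
positively_skewed = [1, 1, 1, 2, 2, 3, 4, 7, 10]

print(skewness(symmetric))          # symmetric data: skew of 0
print(skewness(positively_skewed))  # greater than 0
print(excess_kurtosis(symmetric))   # below 0 for this small, flat data set
```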

1.8.2 The center of a distribution
We can measure where the central tendency (center of a frequency distribution) lies. There are
three measures commonly used: the mode, the median and the mean.

1.8.2 The mode
The mode is the score that occurs the most frequently in the data set. One problem with the mode is
that there can be two modes, in which case the distribution is said to be bimodal, and data sets with
more than two modes are multimodal. The mode can be used with nominal, ordinal, interval and ratio data.
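A short sketch of finding the mode or modes, using `statistics.multimode` from the Python standard library (Python 3.8 or later). The data sets are made up.

```python
from statistics import multimode

unimodal = [1, 2, 2, 3]
bimodal = [1, 1, 2, 3, 3]
colours = ["red", "blue", "red", "green"]  # nominal data also has a mode

print(multimode(unimodal))  # [2]
print(multimode(bimodal))   # [1, 3]
print(multimode(colours))   # ['red']
```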

1.8.3 The median
Median: the middle score when scores are ranked in order of magnitude. When we have an even
number of scores there will not be a middle value. Then the median is calculated by adding the two
middle numbers and dividing the result by 2. The median is relatively unaffected by skewed distributions
and can be used with ordinal, interval and ratio data.
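The median rule for odd and even numbers of scores can be written as a minimal sketch. The function name `median` and the scores are chosen for illustration.

```python
def median(scores):
    """Middle score after ranking; average of the two middle scores if the count is even."""
    ranked = sorted(scores)
    n = len(ranked)
    middle = n // 2
    if n % 2 == 1:
        return ranked[middle]
    return (ranked[middle - 1] + ranked[middle]) / 2


print(median([5, 1, 3]))           # 3
print(median([4, 1, 3, 2]))        # 2.5
print(median([1, 2, 3, 4, 100]))   # 3, the extreme score 100 has no influence
```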

1.8.4 The mean
The mean is the measure of central tendency containing the average score. To calculate the mean we
add up all of the scores and then divide by the total number of scores. The mean can be influenced
by extreme scores, can be affected by skewed distributions, and can only be used with interval or
ratio data. An advantage over the mode and the median is that it uses every score.
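A small worked example of the mean and its sensitivity to one extreme score. The numbers are invented.

```python
def mean(scores):
    """Add up all of the scores and divide by the number of scores."""
    return sum(scores) / len(scores)


scores = [2, 3, 4, 5, 6]
with_extreme = [2, 3, 4, 5, 66]  # same data, but the last score is extreme

print(mean(scores))        # 4.0
print(mean(with_extreme))  # 16.0
```

One extreme score moves the mean from 4 to 16, while the middle score of both data sets is still 4.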

1.8.5 The dispersion in a distribution
Range: take the largest score and subtract the smallest score from it. The range is affected
dramatically by extreme scores (ordinal, interval, ratio).
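As a quick illustration with made-up scores, the range is the largest score minus the smallest, and a single extreme score changes it drastically.

```python
def score_range(scores):
    """Largest score minus smallest score."""
    return max(scores) - min(scores)


print(score_range([2, 3, 4, 5, 6]))    # 4
print(score_range([2, 3, 4, 5, 66]))   # 64
```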

Interquartile range: cut off the top and bottom 25% of scores and calculate the range of the middle
50%. The advantage of the interquartile range is that it is not affected by extreme scores (ordinal,
interval, ratio).

Quartiles are the three values that split the data into four equal parts. First, we calculate the median,
which is called the second quartile. The lower quartile is the median of the lower half of the data and
the upper quartile is the median of the upper half of the data. The median is not included in the two
halves when they are split.
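The quartile procedure described above can be sketched as follows. The helper names and the scores are invented, and the sketch follows the median-of-halves rule from the text. Statistical software often uses interpolation rules instead, so its quartiles may differ slightly.

```python
def median(ranked):
    """Median of a list that is already ranked from low to high."""
    n = len(ranked)
    middle = n // 2
    if n % 2 == 1:
        return ranked[middle]
    return (ranked[middle - 1] + ranked[middle]) / 2


def quartiles(scores):
    """Return (lower quartile, median, upper quartile) using the median-of-halves rule."""
    ranked = sorted(scores)
    n = len(ranked)
    half = n // 2
    lower_half = ranked[:half]
    # With an odd number of scores the median itself is left out of both halves.
    upper_half = ranked[half + 1:] if n % 2 == 1 else ranked[half:]
    return median(lower_half), median(ranked), median(upper_half)


scores = [1, 3, 4, 6, 7, 8, 10, 12, 15, 18, 21]
q1, q2, q3 = quartiles(scores)
interquartile_range = q3 - q1

print(q1, q2, q3)            # 4 8 15
print(interquartile_range)   # 11
```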

Quantiles: quantiles are values that split a data set into equal portions, and in the case of quartiles
they are quantiles that split the data into four equal parts. You can have other quantiles such as
percentiles (100 equal parts) and noniles (nine equal parts).

Deviance: if we use the mean as a measure of the centre of a distribution then we can calculate the
difference between each score and the mean. This difference is the deviance.

To get the sum of squared errors (SS), you can add up the squared deviances. We can use this as an
indicator of the total dispersion (total deviance of scores from the mean).

$$\text{sum of squared errors } (SS) = \sum_{i=1}^{N} (x_i - \bar{x})^2$$

The total dispersion is a bit of a nuisance because we cannot compare it across samples that differ in
size. Therefore it can be useful to work with the average dispersion, which is also known as the variance.
The variance (interval, ratio) is the average squared error between the mean and the observations made.
$$\text{variance} = \frac{SS}{N-1}$$


There is one problem with the variance as a measure: it gives us a measure in units squared, so we
often take the square root of the variance, which is known as the standard deviation (interval, ratio).

The sum of squares, variance and standard deviation are all measures of the dispersion or spread of
data around the mean. A small standard deviation indicates that the data points are close to the
mean. A large standard deviation indicates that the data points are distant from the mean.
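The chain from deviances to the standard deviation can be followed step by step in a small worked example. The scores are made up, and the variance divides by N − 1 as in the formula above.

```python
import math

scores = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(scores)

mean = sum(scores) / n                      # 5.0
deviances = [x - mean for x in scores]      # difference between each score and the mean
ss = sum(d ** 2 for d in deviances)         # sum of squared errors: 32.0
variance = ss / (n - 1)                     # average dispersion, in squared units
sd = math.sqrt(variance)                    # back in the original units

print(ss)
print(round(variance, 3))
print(round(sd, 3))
```

The raw deviances add up to zero because positive and negative differences cancel out, which is why they are squared before being summed.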
