Summary

Summary statistics

Author: jaccoverbij


Course
Statistics (MANMOR004)

Institution
Radboud Universiteit Nijmegen (RU)

This summary contains the following chapters: - chapter 1.1-1.9 - chapter 5.4-5.5 - chapter 2.1-2.9 - chapter 6.6 - chapter 19.1-19.3.2; 19.5-19.6 - chapter 8.1-8.4.3; 8.5; 8.7; 8.8 - chapter 10.1-10.6 - chapter 7.1-7.5 - chapter 12.2-12.3 - chapter 9.1-9.8; 9.10-9.10.2; 9.11-9.11.3; 9.14 At the e...

Preview: 4 of 44 pages

Whole book summarized? No
What part of the book is summarized? - chapter 1.1-1.9 - chapter 5.4-5.5 - chapter 2.1-2.9 - chapter 6.6 - chapter 19.1-19.3.2; 19.5-19.6
Uploaded on 17 December 2019
Number of pages 44
Written in 2019/2020
Type Summary

statistics
spss
field
andy field
statistics field
pre master business administration


Chapter 1 Why is my evil lecturer forcing me to learn statistics?
Chapter 1.3 The research process
The research process can be broadly summarized as in the figure below.

Figure: Research process

Chapter 1.5 Generating and testing theories and hypotheses
Theory: an explanation or set of principles that is well substantiated by repeated testing
and explains a broad phenomenon.

Hypothesis: a proposed explanation for a fairly narrow phenomenon or set of
observations (informed, theory-driven attempt).

Falsification: the act of disproving a hypothesis or theory.

Chapter 1.6 Collecting data: measurement
Variables
- Independent variable: a variable thought to be the cause of some effect (the
variable that has been manipulated).
- Dependent variable: a variable thought to be affected by changes in an
independent variable.
- Predictor variable: a variable thought to predict an outcome variable.
- Outcome variable: a variable thought to change as a function of changes in a
predictor variable.

Most hypotheses can be expressed in terms of two variables: a proposed cause and a
proposed outcome.

Levels of measurement
- Categorical variable: a variable made up of categories.
o Binary variable: a categorical variable that names just two distinct types of
things (two categories).
o Nominal variable: a categorical variable with more than two categories that
have no inherent order.
o Ordinal variable: a categorical variable whose categories are ordered.
- Continuous variable: a variable that gives us a score for each person and can take
on any value on the measurement scale that we are using.
o Interval variable: equal intervals on the scale represent equal differences
in the property being measured.
o Ratio variable: has the same requirements as the interval variable, with the
additional requirement that the ratios of values along the scale
should be meaningful (the scale must have a true and meaningful zero point).
A continuous variable can be measured to any level of precision, whereas a discrete
variable can take on only certain values (usually whole numbers) on the scale.

Measurement error: the discrepancy between the numbers we use to represent the thing
we’re measuring and the actual value of the thing we’re measuring.

Validity: whether an instrument measures what it sets out to measure.
- Criterion validity: whether you can establish that an instrument measures what it
claims to measure through comparison to objective criteria (e.g. by relating scores
on your measure to real-world observations).
- Concurrent validity: when data are recorded simultaneously using the new
instrument and existing criteria.
- Predictive validity: when data from the new instrument are used to predict
observations at a later point in time.
- Content validity: the degree to which individual items represent the construct
being measured, and cover the full range of the construct.
Validity is a necessary but not sufficient condition of a measure. To be valid, the
instrument must first be reliable.

Reliability: whether an instrument can be interpreted consistently across different
situations (the ability of the measure to produce the same results under the same
circumstances).
- Test-retest reliability: testing the same entities at two points in time; a reliable
instrument will produce similar scores at both points in time.

Chapter 1.7 Collecting data: research design
In correlational or cross-sectional research, we observe what naturally goes on in the
world without directly interfering with it, whereas in experimental research we
manipulate one variable to see its effect on another.
- Correlational research provides a very natural view of the question we’re
researching because we’re not influencing what happens and the measures of the
variables should not be biased by the researcher being there (this is an important
aspect of ecological validity).
- Correlational research tells us nothing about the causal influence of variables.
- In correlational research variables are often measured simultaneously (it provides
no information about the contiguity between different variables).
- A limitation of correlational research is the tertium quid.

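Tertium quid: ‘a third person or thing of indeterminate character’; in research, an
unmeasured third variable that may explain the relationship between two other variables.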

Confounding variables (confounds): extraneous factors which can influence a correlation.

Longitudinal research: measuring variables repeatedly at different time points.
- Cause: ‘an object precedent and contiguous to another, and where all the objects
resembling the former are placed in like relations of precedency and contiguity to
those objects that resemble the latter’ (Hume).
- The only way to infer causality is through comparing two controlled situations:
one in which the cause is present and one in which the cause is absent (this is the
function of experimental methods; to provide a comparison of situations).
- In experiments there are two ways to manipulate the independent variable:
o By testing different entities (a between-groups, between-subjects, or
independent design).
o By using the same entities (a within-subject or repeated-measures design).

Systematic variation: due to the experimenter doing something in one condition but not
in the other condition.

Unsystematic variation: due to random factors that exist between the experimental
conditions (e.g. the time of day).
- By keeping the unsystematic variation as small as possible we get a more
sensitive measure of the experimental situation (in this case, randomization is
used).

Randomization is important because it eliminates most other sources of systematic
variation, which allows us to be sure that any systematic variation between experimental
conditions is due to the manipulation of the independent variable. Randomization can be
used in two ways:
- In the repeated measures design
o Practice effects: participants may perform differently in the second
condition because of familiarity with the experimental situation and/or the
measures being used.
o Boredom effects: participants may perform differently in the second
condition because they are tired or bored from having completed the first
condition.
By counterbalancing the order in which a person participates in the conditions, we can
ensure that these effects produce no systematic variation between our conditions.
- In the independent design
o To randomly allocate participants to conditions: by doing so you minimize
the risk that groups differ on variables other than the one you want to
manipulate.
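
The two uses of randomization above can be sketched in Python. This is a minimal illustration, not a procedure taken from the book: the function names `allocate_randomly` and `counterbalance`, the participant labels and the condition names are all made up for the example.

```python
import itertools
import random


def allocate_randomly(participants, conditions, seed=None):
    """Independent design: shuffle participants, then deal them out
    over the conditions so the groups are (nearly) equal in size."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for index, person in enumerate(shuffled):
        groups[conditions[index % len(conditions)]].append(person)
    return groups


def counterbalance(participants, conditions, seed=None):
    """Repeated-measures design: give each participant an order of
    conditions, cycling through every possible order so that practice
    and boredom effects are spread evenly over the conditions."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    orders = list(itertools.permutations(conditions))
    return {person: orders[index % len(orders)]
            for index, person in enumerate(shuffled)}


# Hypothetical participants P1..P10 and hypothetical condition names.
people = [f"P{number}" for number in range(1, 11)]
groups = allocate_randomly(people, ["control", "treatment"], seed=1)
orders = counterbalance(people, ["A", "B"], seed=1)
```

With ten participants and two conditions, each group gets five people, and five participants do A first while the other five do B first.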

Chapter 1.8 Analysing data
Frequency distribution (or: histogram): a graph plotting how many times each score in
a set of data occurs.

Normal distribution: is characterized by the bell-shaped curve. This shape implies that
the majority of scores lie around the centre of the distribution (the largest bars on the
histogram are around the central value). Also, as we get further away from the centre,
the bars get smaller, implying that as scores start to deviate from the centre their
frequency is decreasing.

A distribution can deviate from normal in two ways:
- Lack of symmetry (skew). In a skewed distribution the most frequent scores (the tall
bars on the graph) are clustered at one end of the scale.
o Positively skewed: the frequent scores are clustered at the lower end.
o Negatively skewed: the frequent scores are clustered at the higher end.
- Pointiness (kurtosis). Refers to the degree to which scores cluster at the ends of
the distribution (or: tails) and this tends to express itself in how pointy a
distribution is.
o Positive kurtosis: many scores in the tails (heavy-tailed distribution;
leptokurtic).
o Negative kurtosis: relatively thin in the tails and tends to be flatter than
normal (platykurtic).
In a normal distribution the values of skew and kurtosis are 0 (i.e. the tails of the
distribution are as they should be).
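
To make the signs of skew and kurtosis concrete, here is a small sketch that computes both from the central moments of a data set. It uses the simple population (moment-based) formulas as an assumption; statistical packages such as SPSS typically report bias-corrected sample versions, so their values will differ somewhat. The helper names and the example scores are made up.

```python
def central_moment(scores, order):
    """Average of (score - mean) raised to the given power."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** order for x in scores) / len(scores)


def skew(scores):
    """Moment-based skew: 0 for a symmetric distribution."""
    return central_moment(scores, 3) / central_moment(scores, 2) ** 1.5


def excess_kurtosis(scores):
    """Moment-based kurtosis minus 3, so a normal distribution scores 0."""
    return central_moment(scores, 4) / central_moment(scores, 2) ** 2 - 3


symmetric = [1, 2, 3, 4, 5]                 # no skew
right_tail = [1, 1, 1, 2, 10]               # frequent scores at the lower end
left_tail = [-x for x in right_tail]        # mirror image

print(skew(symmetric))        # zero: symmetric
print(skew(right_tail))       # positive: positively skewed
print(skew(left_tail))        # negative: negatively skewed
print(excess_kurtosis(symmetric))  # negative: flatter than normal
```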

We can calculate where the centre of a frequency distribution lies (or: central tendency).
There are three measures:
- Mode: the score that occurs most frequently in the data set.
o Bimodal: data sets with two modes.
o Multimodal: data sets with more than two modes.
- Median: the middle score, when scores are ranked in order of magnitude.
- Mean: the average score (sum of all scores divided by the number of scores).
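
The three measures of central tendency can be checked with Python's standard `statistics` module (`multimode` needs Python 3.8 or later). The scores below are invented for illustration.

```python
from statistics import mean, median, multimode

scores = [2, 3, 3, 5, 7, 7, 9]

mean_score = mean(scores)      # sum of all scores divided by the number of scores
median_score = median(scores)  # middle score once the scores are ranked
modes = multimode(scores)      # every score that occurs most frequently

print(mean_score)
print(median_score)   # 5
print(modes)          # [3, 7] -> two modes, so this data set is bimodal
```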

Range: a measure of spread (or: dispersion); take the largest score and subtract from it the
smallest score.
- Since it only uses the highest and lowest score, it is affected by extreme scores.
- A way around this is to use the interquartile range (IQR).

The quartiles are the three values that split the ranked data into four equal parts of 25%
each. The second quartile (or: median) splits the data into two equal parts (50% of
measurements each). The lower quartile is the median of the lower half; the upper quartile
is the median of the upper half.
- Rule of thumb: the median is not included when the two halves are split, which is
convenient when there is an odd number of values. It is, however, possible to
include it.
- Like the median, if each half of the data has an even number of values in it, then
the upper and lower quartiles are the average of two values in the data set.
Therefore, the upper and lower quartiles need not be values that actually
appear in the data.
- The interquartile range is the difference between the upper and the lower
quartile.
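
A short sketch of the range and the interquartile range, following the rule of thumb above (the median is left out when the data are split into halves). The function name `quartiles` and the scores are made up, and other quartile conventions exist that give slightly different values.

```python
from statistics import median


def quartiles(scores):
    """Return (lower quartile, median, upper quartile).

    The median itself is excluded from both halves when the number of
    scores is odd. Needs at least two scores.
    """
    ordered = sorted(scores)
    half = len(ordered) // 2
    lower_half = ordered[:half]
    # Skip the middle score for an odd count; split cleanly for an even count.
    upper_half = ordered[half + 1:] if len(ordered) % 2 else ordered[half:]
    return median(lower_half), median(ordered), median(upper_half)


odd_scores = [1, 3, 4, 6, 7, 8, 10, 12, 15, 18, 20]
lower, middle, upper = quartiles(odd_scores)
score_range = max(odd_scores) - min(odd_scores)   # 20 - 1 = 19
iqr = upper - lower                                # 15 - 4 = 11

# With halves of even length the quartiles are averages of two scores,
# so they need not appear in the data: here 2.5 and 6.5.
even_lower, even_middle, even_upper = quartiles([1, 2, 3, 4, 5, 6, 7, 8])
```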

Quantiles: values that split a data set into equal portions; quartiles are quantiles that
split the data into four equal parts.
- Percentiles: points that split the data into 100 equal parts.
- Noniles: points that split the data into nine equal parts.
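
As a rough illustration, `statistics.quantiles` (Python 3.8 or later) returns the cut points for any number of equal portions. Splitting data into n parts always takes n - 1 cut points. The scores are invented, and the exact cut-point values depend on the interpolation method the function uses.

```python
from statistics import quantiles

scores = list(range(1, 201))   # 200 made-up scores: 1, 2, ..., 200

quartile_cuts = quantiles(scores, n=4)      # 3 cut points -> four parts
nonile_cuts = quantiles(scores, n=9)        # 8 cut points -> nine parts
percentile_cuts = quantiles(scores, n=100)  # 99 cut points -> 100 parts

print(len(quartile_cuts), len(nonile_cuts), len(percentile_cuts))  # 3 8 99
```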

Figure: Interquartile range (normal distribution)

If we want to use all the data rather than half of it, we can calculate the spread of scores
by looking at how different each score is from the centre of the distribution. If we use the
mean as measure of the centre of distribution, we can calculate the difference between
each score and the mean, known as the deviance.
- Deviance¹: the distance of each score from the mean.
- Total deviance²: the sum of all deviances. This always equals zero, because the
negative and positive deviances cancel each other out. To overcome this, the
deviances are squared (a negative multiplied by a negative gives a positive).

Measures of dispersion or spread of data around the mean:
- Sum of squared errors (SS)³: adding up the squared deviances. Often called the
sum of squares. This can be used as an indicator of the total dispersion, or total
deviance of scores from the mean.
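
The deviance, total deviance and sum of squares can be worked through on a few made-up scores. This is only a sketch; the variable names are chosen for the example.

```python
scores = [2, 4, 6, 8, 10]
mean = sum(scores) / len(scores)                  # 6.0

deviances = [x - mean for x in scores]            # [-4.0, -2.0, 0.0, 2.0, 4.0]
total_deviance = sum(deviances)                   # 0.0: negatives and positives cancel
sum_of_squares = sum(d ** 2 for d in deviances)   # 16 + 4 + 0 + 4 + 16 = 40.0

print(deviances, total_deviance, sum_of_squares)
```

Squaring removes the signs, so the sum of squares grows with the spread of the scores while the total deviance stays at zero.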

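¹ Equation: $\text{deviance}_i = x_i - \bar{x}$
² Equation: $\text{total deviance} = \sum_{i=1}^{n} (x_i - \bar{x})$
³ Equation: $SS = \sum_{i=1}^{n} (x_i - \bar{x})^2$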
