What is data?
Data has a fixed structure:
o Variables (properties), in a column
o Units (things/people), in a row. A unit can be experimental or observational.
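The row/column structure above can be sketched in a few lines of Python. This is only an illustration: the variable names and values are made up, and a list of dictionaries is just one possible way to hold such a table.

```python
# Each dictionary is one unit (a row); each key is one variable (a column).
# All names and values below are invented for illustration.
dataset = [
    {"name": "Anna", "age": 23, "employed": True},
    {"name": "Bram", "age": 31, "employed": False},
    {"name": "Chris", "age": 27, "employed": True},
]

variables = list(dataset[0].keys())      # the column names
ages = [row["age"] for row in dataset]   # one variable across all units

print(variables)
print(ages)
```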
Levels of measurement
See separate document ‘Levels of measurement’.
Data collection
With qualitative data, you need to justify and document the way you collected the data. There are
three important questions:
1. Is the sample representative?
This is influenced by the population, random sampling and the generalizability of your
research.
2. Is the data valid?
o Do the data reflect what they should reflect?
o Have the data been checked for errors and mistakes (face validity check)?
o If there are multiple people involved in the measurement, did everybody know the
measurement procedure?
o Were there other problems/irregularities during measurement?
It is important that you justify every deletion from your sample.
3. Is there measurement error?
A measurement error is the discrepancy between the actual value we are trying to measure,
and the number we use to represent the value. There are two types of measurement errors,
which may happen at the same time:
o Systematic measurement error (also called bias): the difference between the average
measurement result and the true value. This is a consistent, often explainable error,
which can be corrected by calibration.
o Random measurement error: unsystematic deviations due to imprecision of the
measurement system.
We are more worried about the random error than the systematic error, because we can
calibrate for the systematic error.
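The difference between the two error types can be illustrated with a small simulation. This is a minimal sketch and not part of the lecture: the true value (100), the bias (+3) and the noise level (standard deviation 2) are assumed numbers, and `measure` is a hypothetical helper.

```python
import random
import statistics

TRUE_VALUE = 100.0   # assumed true value (illustrative)
BIAS = 3.0           # assumed systematic error of the instrument
NOISE_SD = 2.0       # assumed spread of the random error

def measure(rng):
    """One simulated measurement: true value + bias + random noise."""
    return TRUE_VALUE + BIAS + rng.gauss(0.0, NOISE_SD)

rng = random.Random(42)  # fixed seed so the run is repeatable
readings = [measure(rng) for _ in range(10_000)]

# Systematic error: average reading minus the true value.
estimated_bias = statistics.mean(readings) - TRUE_VALUE

# Calibration: subtract the estimated bias from every reading.
calibrated = [r - estimated_bias for r in readings]

# The random error is still present after calibration.
remaining_spread = statistics.stdev(calibrated)

print(round(estimated_bias, 1), round(remaining_spread, 1))
```

In this sketch, calibration moves the average reading back to the true value, but the spread of the individual readings stays roughly the same, which is why the random error is the bigger worry.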
Lecture clip 1.3 – Data analysis
Describing data
You usually do not recite an entire dataset when someone asks you what is in it; you summarize it. You
describe the location, dispersion and other properties.
Victor Roos, Management Research Methods 1 for BA
Location
- Median: the middle score when the data are ordered. Field, 5th edition: the median is the score
at position (n + 1)/2. If you have an odd number of scores, this is a single score (e.g. n = 11
gives position 6). If you have an even number of scores, the position falls between two scores
(e.g. n = 10 gives position 5.5); in this example, the median is (score 5 + score 6)/2.
- Mean: the sum of the data divided by the number of data points
Formula: $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$
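The position rule for the median and the formula for the mean can be checked with a short worked example. This is a sketch with made-up scores; `median_by_position` is a hypothetical helper written for this illustration, not a library function.

```python
import statistics

def median_by_position(scores):
    """Median via the (n + 1) / 2 position rule (positions start at 1)."""
    ordered = sorted(scores)
    n = len(ordered)
    position = (n + 1) / 2
    if position.is_integer():
        # Odd n: the position points at a single score.
        return ordered[int(position) - 1]
    # Even n: the position ends in .5, so average the two neighbours.
    lower = ordered[int(position) - 1]
    upper = ordered[int(position)]
    return (lower + upper) / 2

odd_scores = [7, 3, 9, 5, 1]                          # n = 5, position 3
even_scores = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]    # n = 10, position 5.5

print(median_by_position(odd_scores))         # 5
print(median_by_position(even_scores))        # 11.0
print(sum(even_scores) / len(even_scores))    # mean: 11.0
```

The results agree with `statistics.median` and `statistics.mean` from the Python standard library.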
Dispersion
- Range: the smallest value subtracted from the largest (very sensitive to outliers)
- Interquartile range: the range of the middle 50% of the data: the upper quartile minus the
lower quartile (less sensitive to outliers)
o Median of the first half/lower quartile = 25%
o Median of the whole sample/second quartile = 50%
o Median of the second half/upper quartile = 75%
- Variance: the average squared distance between each point and the mean of the data
- Standard deviation: the square root of the variance. This is in the same unit of measurement
as the original data. A large standard deviation means a wider dispersion and more spread. A
small standard deviation means a narrower dispersion and less spread. Field, 5th edition: the
standard deviation is preferred because it is expressed in the same units as the data.
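The dispersion measures above can be computed by hand on a small, made-up set of scores. This is a sketch under two assumptions that the notes do not spell out: the quartiles are taken as the medians of the lower and upper half (leaving out the middle score when n is odd), and the variance divides by n - 1 (the sample variance, as `statistics.variance` does). `quartiles_by_halves` is a hypothetical helper.

```python
import math
import statistics

def quartiles_by_halves(scores):
    """Lower and upper quartile as the medians of the lower and upper half.

    For an odd number of scores the middle score is left out of both halves
    (one common convention; software packages may use other rules).
    """
    ordered = sorted(scores)
    half = len(ordered) // 2
    lower_half = ordered[:half]
    upper_half = ordered[-half:]
    return statistics.median(lower_half), statistics.median(upper_half)

scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 50]  # 50 is an outlier

value_range = max(scores) - min(scores)
q1, q3 = quartiles_by_halves(scores)
iqr = q3 - q1

mean = sum(scores) / len(scores)
# Sample variance: squared distances to the mean, divided by n - 1.
variance = sum((x - mean) ** 2 for x in scores) / (len(scores) - 1)
std_dev = math.sqrt(variance)

print(value_range, iqr)   # 49 6.0
print(variance, std_dev)
```

The single outlier stretches the range to 49, while the interquartile range stays at 6, which shows why the interquartile range is less sensitive to outliers.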
SPSS: location/dispersion: Analyze – Descriptive statistics – Explore.
SPSS: chart builder: Graphs – Chart Builder – Choose the form – Drag variable(s) in the graph.
SPSS: determine percentiles: Explore – Statistics – Percentiles.
Other properties
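- Confidence interval: how certain am I of my estimate of the mean? The sample mean is usually
not equal to the population mean. In about 95% of the cases the population mean μ lies within:
$\bar{x} - 2\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + 2\frac{s}{\sqrt{n}}$
where $\bar{x}$ is the sample mean, s the sample standard deviation and n the sample size.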
If n is higher: more certainty
If s is lower: more certainty
- Skew/skewness: the asymmetry of the distribution. In a positively (right-)skewed distribution,
the scores are bunched at low values with the tail pointing to the high values. In a negatively
(left-)skewed distribution, the scores are bunched at high values with the tail pointing to the
low values. Field, 5th edition: a distribution can also have positive kurtosis (leptokurtic),
with many scores in the tails, or negative kurtosis (platykurtic), which tends to be flatter
than normal.
- Mode: the most frequent score. A bimodal distribution has two modes; a multimodal distribution
has multiple (≥ 2) modes. With a multimodal distribution you usually have to split the
population into homogeneous groups.
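The three properties above can be tried out on small, made-up samples. This is a sketch, not the method from the lecture: `approx_95_ci` and `skewness` are hypothetical helpers, the interval uses the rule of thumb of 2 standard errors around the sample mean, and the skewness uses the average cubed z-score, a common moment-based formula that the notes do not give.

```python
import math
import statistics

def approx_95_ci(sample):
    """Approximate 95% interval: sample mean plus/minus 2 * s / sqrt(n)."""
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)          # sample standard deviation
    margin = 2 * s / math.sqrt(n)
    return mean - margin, mean + margin

def skewness(scores):
    """Moment-based skewness: average cubed z-score (assumed formula)."""
    mean = statistics.mean(scores)
    sd = statistics.pstdev(scores)
    return sum(((x - mean) / sd) ** 3 for x in scores) / len(scores)

sample = [4, 8, 6, 5, 3, 7, 9, 6]         # made-up scores
low, high = approx_95_ci(sample)
# Same scores repeated 4 times: similar spread, four times the n.
low_big, high_big = approx_95_ci(sample * 4)
print(low, high)
print(high_big - low_big < high - low)    # True: higher n, narrower interval

right_skewed = [1, 1, 2, 2, 3, 10]        # tail points to the high values
left_skewed = [10, 10, 9, 9, 8, 1]        # tail points to the low values
print(skewness(right_skewed) > 0, skewness(left_skewed) < 0)   # True True

print(statistics.multimode([1, 2, 2, 3, 4]))          # [2]
print(statistics.multimode([1, 2, 2, 3, 5, 5, 6]))    # [2, 5]
```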