What is data?
Data has a fixed structure:
o Variables (properties), in a column
o Units (things/people), in a row. A unit can be experimental or observational.
Levels of measurement
See separate document ‘Levels of measurement’.
Data collection
For qualitative data, you need to justify and document the way you collected the data. There are 3
important questions:
1. Is the sample representative?
This is influenced by the population, random sampling and the generalizability of your
research.
2. Is the data valid?
o Do the data reflect what they should reflect?
o Is the data checked for errors and mistakes (face validity check)?
o If there are multiple people involved in the measurement, did everybody know the
measurement procedure?
o Were there other problems/irregularities during measurement?
It is important that you justify every case you delete from your sample.
3. Is there measurement error?
A measurement error is the discrepancy between the actual value we are trying to measure,
and the number we use to represent the value. There are two types of measurement errors,
which may happen at the same time:
o Systematic measurement error (also called bias): the difference between the average
measurement result and the true value. This is a consistent, often explainable error,
which can be corrected by calibration.
o Random measurement error: unsystematic deviations due to imprecision of the
measurement system.
We are more worried about the random error than the systematic error, because we can
calibrate for the systematic error.
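The two error types can be illustrated with a small Python sketch. The reference weight and the five readings below are made-up numbers, chosen only to show how bias and random error are separated; they do not come from the lecture.

```python
import statistics

# Hypothetical example: a scale weighs a 100.0 g reference weight five times.
true_value = 100.0
measurements = [101.8, 102.3, 101.9, 102.1, 101.9]

# Systematic error (bias): average measurement result minus the true value.
bias = statistics.mean(measurements) - true_value

# Random error: spread of the measurements around their own average.
random_error = statistics.stdev(measurements)

# Calibration: subtract the estimated bias from every measurement.
calibrated = [m - bias for m in measurements]

print(f"bias = {bias:.2f}")
print(f"random error (SD) = {random_error:.2f}")
print(f"mean after calibration = {statistics.mean(calibrated):.2f}")
print(f"random error after calibration = {statistics.stdev(calibrated):.2f}")
```

Calibration shifts the average back to the true value but leaves the spread unchanged, which is why the random error is the bigger worry.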
Lecture clip 1.3 – Data analysis
Describing data
You usually do not recite an entire dataset when someone asks you what is in it; you summarize it. You
describe the location, dispersion and other properties.
Victor Roos, Management Research Methods 1 for BA
Location
- Median: the middle score when the data are ordered. Field, 5th edition: if you have an odd number
of scores, the median is the score at position (n + 1)/2. If you have an even number of scores,
(n + 1)/2 is not a whole number (e.g. 5.5 for n = 10), so the median is the average of the two
scores around that position. In this example, the median is (score 5 + score 6)/2.
- Mean: the sum of the scores divided by the number of scores
Formula: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
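As a minimal sketch, the position rule for the median and the formula for the mean can be written out in Python. The function names and the example scores are illustrative assumptions, not part of the lecture material.

```python
def median(scores):
    """Median via the (n + 1)/2 position rule (positions are 1-based)."""
    ordered = sorted(scores)
    n = len(ordered)
    position = (n + 1) / 2
    if n % 2 == 1:
        # Odd n: the position is a whole number, so take that score.
        return ordered[int(position) - 1]
    # Even n: the position ends in .5, so average the two neighbouring scores.
    lower = ordered[int(position) - 1]
    upper = ordered[int(position)]
    return (lower + upper) / 2


def mean(scores):
    """Sum of the scores divided by the number of scores."""
    return sum(scores) / len(scores)


odd_scores = [7, 3, 9, 1, 5]                          # n = 5, position 3
even_scores = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]    # n = 10, position 5.5

print(median(odd_scores))    # score at position 3 of the ordered data
print(median(even_scores))   # (score 5 + score 6) / 2
print(mean(even_scores))
```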
Dispersion
- Range: the smallest value subtracted from the largest (very sensitive to outliers)
- Interquartile range: the range of the middle 50% of the data: the upper quartile minus the
lower quartile (less sensitive to outliers)
o Lower (first) quartile: median of the first half of the data = 25%
o Second quartile: median of the whole sample = 50%
o Upper (third) quartile: median of the second half of the data = 75%
- Variance: the average squared distance between each point and the mean of the data
- Standard deviation: the square root of the variance. It is expressed in the same unit of
measurement as the original data. A large standard deviation means a wider dispersion and more
spread. A small standard deviation means a narrower dispersion and less spread. Field, 5th edition:
the standard deviation is preferred over the variance because it is in the same unit as the data.
SPSS: location/dispersion: Analyze – Descriptive statistics – Explore.
SPSS: chart builder: Graphs – Chart Builder – Choose the form – Drag variable(s) in the graph.
SPSS: determine percentiles: Explore – Statistics – Percentiles.
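Outside SPSS, the dispersion measures above can be sketched in Python with the standard library. The scores are invented for illustration. Two choices are assumptions: for an odd number of scores, the median is left out of both halves, and the variance is shown both divided by n (the literal "average" squared distance) and divided by n - 1 (the usual sample estimate).

```python
import math
import statistics


def quartiles(scores):
    """Quartiles as medians of the lower half, whole sample and upper half.

    Assumption: with an odd number of scores, the middle score is excluded
    from both halves. Needs at least two scores.
    """
    ordered = sorted(scores)
    half = len(ordered) // 2
    lower_half = ordered[:half]
    upper_half = ordered[-half:]
    return (
        statistics.median(lower_half),
        statistics.median(ordered),
        statistics.median(upper_half),
    )


scores = [2, 4, 4, 5, 7, 9, 10, 15]

data_range = max(scores) - min(scores)
q1, q2, q3 = quartiles(scores)
iqr = q3 - q1

population_variance = statistics.pvariance(scores)  # divides by n
sample_variance = statistics.variance(scores)       # divides by n - 1
sample_sd = math.sqrt(sample_variance)

print(f"range = {data_range}, IQR = {iqr}")
print(f"variance (n) = {population_variance}, variance (n - 1) = {sample_variance:.2f}")
print(f"standard deviation = {sample_sd:.2f}")
```

Changing the largest score from 15 to 150 would inflate the range but leave the interquartile range untouched, which is the sense in which the IQR is less sensitive to outliers.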
Other properties
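- Confidence interval: how certain am I of my estimate of the mean? The sample mean is usually
not equal to the population mean. For about 95% of samples, the population mean μ lies within
roughly two standard errors of the sample mean:
$\bar{x} - 2\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + 2\frac{s}{\sqrt{n}}$
where $\bar{x}$ is the sample mean, s the sample standard deviation and n the sample size.
If n is higher: more certainty
If s is lower: more certainty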
- Skew/skewness: the asymmetry of the distribution. In a positively (right) skewed distribution,
the scores are bunched at low values with the tail pointing to the high values. In a negatively
skewed distribution, the scores are bunched at high values with the tail pointing to the low
values. Field, 5th edition: a distribution can also have positive kurtosis (leptokurtic), with
many scores in the tails, or negative kurtosis (platykurtic), which tends to be flatter than normal.
- Mode: the most frequent score. A bimodal distribution has two modes. A multimodal distribution
has multiple (≥ 2) modes. With a multimodal distribution, you usually have to split the
population into homogeneous groups.
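The confidence interval and the mode from the list above can be sketched in Python as follows. The function name `confidence_interval_95` and the scores are hypothetical; the factor 2 is the rule of thumb used in these notes, not an exact critical value.

```python
import math
import statistics


def confidence_interval_95(scores):
    """Approximate 95% interval: sample mean plus or minus 2 * s / sqrt(n)."""
    n = len(scores)
    sample_mean = statistics.mean(scores)
    s = statistics.stdev(scores)       # sample standard deviation
    margin = 2 * s / math.sqrt(n)      # two standard errors
    return sample_mean - margin, sample_mean + margin


scores = [2, 3, 3, 3, 4, 7, 8, 8, 8, 9]

low, high = confidence_interval_95(scores)
print(f"mean = {statistics.mean(scores)}, interval = ({low:.2f}, {high:.2f})")

# multimode returns every score that shares the highest frequency,
# so a bimodal sample yields two values.
modes = statistics.multimode(scores)
print(f"modes = {modes}")
```

A larger n or a smaller s shrinks the margin, which is the "more certainty" the notes refer to.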