Chapter 1: Introduction
Chapter 2: Summarising data
2.1 What is data?
2.2 Summarising data
2.2.1 Summarising univariate data
2.2.2 Summarising bivariate data
2.2.3 Summarising multivariate data
Chapter 3: Exploring distributions
3.1 The quantile function and location-scale families
3.2 QQ-plots
3.3 Symplots
3.4 Goodness-of-fit tests
3.4.1 Shapiro-Wilk test
3.4.2 Kolmogorov-Smirnov test
3.4.3 Chi-square tests (χ²)
Chapter 4: Density estimation
4.1 Kernel density estimators
4.2 Choice of kernel and bandwidth
4.3 Cross-validation
4.4 Other density estimators
4.5 Multivariate density estimation
Chapter 5: The bootstrap
5.1 Simulation
5.2 Bootstrap estimators for a distribution
5.2.1 Empirical and parametric bootstrap estimators
5.2.2 Bootstrap in practice
5.3 Bootstrap confidence intervals
5.4 Bootstrap tests
5.5 Limitations of the bootstrap
CHAPTER 1: INTRODUCTION
Statistics is the discipline of collecting, analysing and interpreting data. It is applied in many
areas, such as industry, polling, medical studies, scientific research, the study of terrorism and
ice forecasting.
A statistical study consists of the following steps:
1. Research question
2. Experimental design
3. Data collection
4. Data analysis
5. Interpretation of results
6. Presentation of results and conclusion
This course gives theoretical and practical insight into the last three steps.
Every statistical study requires a statistical model. The last three steps involve the following:
1. Data analysis
a. Get an impression of the data.
b. Validate the statistical model.
c. Summarise the data (descriptive statistics).
d. Analyse the data (e.g., estimate or test parameters in the model).
2. Interpretation of results
a. This is not always straightforward.
3. Presentation of results and conclusion
a. Translate the results back to the experimental context.
Interpreting the results and presenting the results and conclusion are practised weekly
in the assignments. Write a neat and concise report. Reports need not be visually elaborate;
a front page and similar extras are unnecessary.
CHAPTER 2: SUMMARISING DATA
2.1 WHAT IS DATA?
The term ‘data’ is often used without being properly defined. In the most general
sense, ‘data’ are the quantified results of a study. Let us look more closely at the different
kinds of data and at the measurement scales on which they live.
Definition 2.1.1. (Measurement scales). There exist three different measurement scales
(different types of data):
(1) Nominal Scale (or nominal level): Results are qualitative, i.e., they live on a qualitative
scale. More simply, the results can be divided into categories that have no natural order. For
example, if we research whether there are more males or females in a certain area, a data point
is either ‘male’ or ‘female’.
• Note: Location, spread, mean and median have no meaning on the
nominal scale.
(2) Ordinal Scale: Results fall into categories that can be ordered.
• Note: Measures of spread and distances between categories have no meaning
on this scale.
(3) Quantitative Scale: Measurements whose meaning goes beyond falling into a
category. These results are typically represented as real numbers (or as vectors of real
numbers in higher dimensions).
Example 2.1.2. In a data set on military expenditure as a percentage of GDP, the following
characteristics are given for every country:
• Entity: Afghanistan – Nominal Scale
• Human Development Index ranking (UN): 170 – Ordinal Scale
• military_expenditure_share_gdp (rounded) – Quantitative Scale
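The measurement scale determines which summaries are meaningful. The following is a minimal sketch in Python, using only the standard library. All values and variable names (sexes, grades, measurements) are made up for illustration and do not come from the course data. It computes a mode for nominal data, a median category for ordinal data, and a mean and standard deviation for quantitative data.

```python
from statistics import mean, median_low, mode, stdev

# Hypothetical data, invented for illustration only.
sexes = ["female", "male", "female", "female", "male"]   # nominal scale
grades = ["low", "high", "medium", "medium", "low"]      # ordinal scale
measurements = [2.0, 4.0, 6.0, 8.0]                      # quantitative scale

# Nominal: categories can only be counted, so the most frequent
# category (the mode) is a valid summary; a mean or median is not.
nominal_summary = mode(sexes)

# Ordinal: categories can be ordered, so a median category exists.
# The integer ranks are only used for ordering; the distances
# between them carry no meaning.
order = ["low", "medium", "high"]
ranks = [order.index(g) for g in grades]
ordinal_summary = order[median_low(ranks)]

# Quantitative: differences between values are meaningful, so
# location (mean) and spread (standard deviation) can be computed.
quantitative_summary = (mean(measurements), stdev(measurements))

print(nominal_summary)
print(ordinal_summary)
print(quantitative_summary)
```

The sketch uses median_low rather than median so that the result is always one of the observed ranks and can be mapped back to a category, even for an even number of observations.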