Chapter 1: Introduction
Chapter 2: Summarising data
2.1 What is data?
2.2 Summarizing data
2.2.1 Summarizing univariate data
2.2.2 Summarizing bivariate data
2.2.3 Summarizing multivariate data
Chapter 3: Exploring distributions
3.1 The quantile function and location-scale families
3.2 QQ-plots
3.3 Symplots
3.4 Goodness-of-fit tests
3.4.1 Shapiro-Wilk test
3.4.2 Kolmogorov-Smirnov test
3.4.3 Chi-square tests (χ²)
Chapter 4: Density estimation
4.1 Kernel density estimators
4.2 Choice of kernel and bandwidth
4.3 Cross-validation
4.4 Other density estimators
4.5 Multivariate density estimation
Chapter 5: The bootstrap
5.1 Simulation
5.2 Bootstrap estimators for a distribution
5.2.1 Empirical and parametric bootstrap estimators
5.2.2 Bootstrap in practice
5.3 Bootstrap confidence intervals
5.4 Bootstrap tests
5.5 Limitations of the bootstrap
CHAPTER 1: INTRODUCTION
Statistics is the discipline of collecting, analysing and interpreting data. It appears in many areas,
such as industry, polls, medical studies, scientific research, the analysis of terrorism and ice forecasting.
A statistical study consists of the following steps:
1. Research question
2. Experimental design
3. Data collection
4. Data analysis
5. Interpretation of results
6. Presentation of results and conclusion
This course gives theoretical and practical insight into the last three stages.
Every statistical study requires a statistical model. The last three stages break down as follows:
1. Data analysis
a. Get an impression of the data.
b. Validate the statistical model.
c. Summarize the data (descriptive statistics).
d. Analyse (e.g., estimate or test parameters in the model).
2. Interpretation of results
a. This is not always straightforward.
3. Presentation of results and conclusion
a. Translate back to the experimental context.
Interpretation of results and presentation of results and conclusions are practised weekly
in the assignments. Write a neat and concise report. Reports do not have to be very graphical;
leave out a front page and similar extras, because they are unnecessary.
CHAPTER 2: SUMMARISING DATA
2.1 WHAT IS DATA?
The term ‘data’ is often used without being properly defined. In the most general
sense, ‘data’ are the quantified results of a study. Let us look more closely at the
different kinds of data and at the measurement scale on which each lives.
Definition 2.1.1. (Measurement scales). There exist three different measurement scales
(different types of data):
(1) Nominal Scale (or nominal level): Results are qualitative, i.e., they live on a qualitative
scale. More simply, the results can be divided into categories that have no natural order. For
example, if we research whether there are more males or females in a certain area, a data point
could be either ‘male’ or ‘female’.
• Note: Location, spread, mean and median have no meaning on the
nominal scale.
(2) Ordinal Scale: The categories can be ordered.
• Note: Measures of spread and distances between categories have no meaning
on this scale.
(3) Quantitative Scale: Measurements that carry more meaning than merely falling into a
category. These results are typically represented as real numbers (or higher-dimensional
variations).
Example 2.1.2. In a dataset on military expenditure as a percentage of GDP, the following
characteristics are given for every country:
• Entity: Afghanistan – Nominal Scale
• Human development index ranking (UN): 170 – Ordinal Scale
• military_expenditure_share_gdp (rounded) – Quantitative Scale
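The three scales can be made concrete with a small Python sketch that computes only the summaries that are meaningful on each scale. This is an illustration and not part of the course material: the function name `summarise`, the lookup table `SUMMARIES` and all data values are hypothetical, and the choice of the mode for nominal data and of the lower median for ordinal data (which avoids averaging two ranks) is an assumption consistent with the notes in Definition 2.1.1.

```python
from statistics import mean, median, median_low, mode

# Summaries assumed to be meaningful on each measurement scale.
# Nominal: only counting categories makes sense, so only the mode.
# Ordinal: ordering is available, so a median that is an observed value.
# Quantitative: distances are meaningful, so the mean is allowed as well.
SUMMARIES = {
    "nominal": {"mode": mode},
    "ordinal": {"mode": mode, "median": median_low},
    "quantitative": {"mode": mode, "median": median, "mean": mean},
}


def summarise(values, scale):
    """Return only the summaries that make sense for the given scale."""
    return {name: func(values) for name, func in SUMMARIES[scale].items()}


# Hypothetical toy data, invented for illustration only.
entities = ["A", "B", "B", "C"]            # nominal: labels without order
hdi_ranks = [1, 2, 2, 4, 5]                # ordinal: ranks can be ordered
expenditure_share = [1.0, 2.0, 2.0, 3.0]   # quantitative: real numbers

print(summarise(entities, "nominal"))
print(summarise(hdi_ranks, "ordinal"))
print(summarise(expenditure_share, "quantitative"))
```

Asking for a mean of the nominal column is deliberately impossible here, because the table offers no such entry for that scale.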