Lecture notes

Applied Data Analysis (ADA) lectures written out / lecture summary (English), 85 p.

Author: claire225

Course
Applied Data Analysis

Institution
Universiteit Leiden (UL)

This 85-page file contains all 8 ADA lectures, written out in full. The document is in English.

Uploaded on 19 June 2021
Number of pages: 85
Written in 2020/2021
Type: Lecture notes
Lecturer(s): Peter de Heus
Contains: All lectures

Applied Data Analysis
Exam 16 June 2021

Lecture 1: EXPLORING DATA

Literature: Field, A. (2018). Discovering statistics using IBM SPSS Statistics (5th edition).

Chapter 2 (§ 2.1 – 2.10)
Chapter 3 (§ 3.1 – 3.7)
Chapter 5 (§ 5.1 – 5.9)
Chapter 6 (p. 243-252, 268-276)
Chapter 19 (§ 19.1 – 19.3.6, 19.7)

In this lecture we will learn how we can explore our data.

Why explore?

Generally, research is (and should be) hypothetico-deductive.
This means we should:
- First formulate a hypothesis (on theoretical grounds) and deduce which pattern of
results should follow from it.
- Then collect data to test whether the hypothesis holds (hypotheses are always
about the population!).

Usually, this leads to a focused prediction (e.g., females have higher social skills scores than
males: µf > µm; on a social skills test, females should score higher than males).

Two reasons to explore our data:

1. We do not want to limit ourselves to our main prediction! Sometimes, unexpected
results are the most interesting ones (isn’t science about finding out new things?).
2. Almost always, we need to check the assumptions of hypothesis tests.

Main steps in data analysis

1. Explore. Look at what is in your data.
2. Check assumptions. Significance tests make assumptions about the data, but do they
apply in your case? (And if they are violated, what has to be done?)
3. Hypothesis testing. Determine if a predicted relationship exists in the sample (e.g. a
correlation between two variables) and if it can be generalized from sample to
population.
4. Interpretation. Analyze the nature of the relationships between variables.
5. Write. Report your results (following APA rules).

Preliminary step: Decide which technique is most suitable for your research question.

Exploring frequency distributions

Two basic ways of exploration:
1. Make pictures (histograms, boxplots).
2. Compute statistics (mean, median, mode, variance, standard deviation, skewness,
kurtosis, Kolmogorov-Smirnov test).
We will do both, with emphasis on normality.
Remark. Very often the normality assumption is not as important as suggested by Field,
because many tests are robust against violation of this assumption.
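The statistics listed above can be computed with a few lines of code. The following is a minimal sketch using only Python's standard library; the scores are invented for illustration, and the function names `describe` and `ks_statistic_normal` are hypothetical. The second function only computes the Kolmogorov-Smirnov distance D between the data and a normal distribution with the sample mean and standard deviation. It does not produce a p-value, which a statistical package would add.

```python
import statistics

# Hypothetical scores, invented for illustration only.
scores = [2, 3, 3, 4, 4, 4, 5, 5, 6, 9]


def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "variance": statistics.variance(values),  # sample variance (divides by n - 1)
        "sd": statistics.stdev(values),           # sample standard deviation
    }


def ks_statistic_normal(values):
    """Largest gap between the empirical CDF and a fitted normal CDF."""
    ordered = sorted(values)
    n = len(ordered)
    normal = statistics.NormalDist(statistics.mean(ordered), statistics.stdev(ordered))
    d = 0.0
    for i, x in enumerate(ordered, start=1):
        cdf = normal.cdf(x)
        # Compare the normal CDF with the empirical CDF just after and just before x.
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    return d


summary = describe(scores)
for name, value in summary.items():
    print(f"{name}: {value}")
print("KS distance D:", round(ks_statistic_normal(scores), 3))
```

A larger D means the observed distribution lies further from the fitted normal curve. Whether that distance is statistically significant depends on the sample size, which this sketch does not evaluate.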

Histogram

Histogram. Picture of a frequency distribution (categories on the X-axis, numbers of individuals
on the Y-axis).
Normality at first sight: in the three example histograms, the deviation from normality
increases from left to right. The middle and right histograms are positively skewed. Most clinical
scores are positively skewed, because most people have low scores on, for example, depression.
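A frequency distribution like the one a histogram displays can be built by counting how often each score occurs. The sketch below is a minimal illustration with invented scores (most people scoring low, as in the depression example); the names `frequency_table` and `text_histogram` are hypothetical, and the text bars only stand in for a real plot.

```python
from collections import Counter

# Hypothetical depression scores (invented): most people score low.
scores = [0, 0, 0, 0, 1, 1, 1, 2, 2, 3, 5, 8]


def frequency_table(values):
    """Map every score in the observed range to the number of individuals with it."""
    counts = Counter(values)
    # A Counter returns 0 for scores that never occur, so gaps stay visible.
    return {score: counts[score] for score in range(min(values), max(values) + 1)}


def text_histogram(values):
    """One line per score: the score (X-axis category) and a bar of '#' marks."""
    table = frequency_table(values)
    return "\n".join(f"{score:>2} | {'#' * count}" for score, count in table.items())


print(text_histogram(scores))
```

The long run of short or empty bars towards the high scores is the right-hand tail that makes such a distribution positively skewed.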

Boxplot

A boxplot is defined exclusively in terms of percentiles, not in terms of means and standard deviations!

Boxplot explanation:

Median: the middle score
75th percentile: 75% of the scores lie below this value
25th percentile: 25% of the scores lie below this value
Sticks: extend either 1.5 times the box height or to the lowest/highest score. In this picture the lower stick ends at the
lowest score and the upper stick is 1.5 x the box height.
Outliers: - Scores further away than 1.5 x the box height (beyond the sticks) are marked with circles
- Scores further away than 3 x the box height are marked with asterisks

Warning. Boxplots are based on percentiles (median is 50th percentile). They do not
necessarily give the same results as measures based on means and variances.
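The percentile-based definitions of the box, the sticks, and the outliers can be sketched as a small function. This is a minimal sketch, assuming the `inclusive` quartile method of Python's `statistics.quantiles`; statistical packages may compute quartiles slightly differently, so results on small samples can differ. The data and the name `boxplot_summary` are invented for illustration.

```python
import statistics

# Hypothetical scores, invented for illustration only.
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 20, 60]


def boxplot_summary(values):
    """Compute the ingredients of a boxplot from percentiles only."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    box_height = q3 - q1  # distance between the 25th and 75th percentile

    def distance_from_box(v):
        """How far a score lies outside the box (0 if inside the box)."""
        return max(q1 - v, v - q3, 0)

    # Scores within 1.5 box heights of the box are covered by the sticks.
    covered = [v for v in values if distance_from_box(v) <= 1.5 * box_height]
    outliers = [v for v in values
                if 1.5 * box_height < distance_from_box(v) <= 3 * box_height]
    extremes = [v for v in values if distance_from_box(v) > 3 * box_height]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "sticks": (min(covered), max(covered)),  # ends of the lower and upper stick
        "outliers": outliers,                    # drawn as circles
        "extremes": extremes,                    # drawn as asterisks
    }


summary = boxplot_summary(scores)
print(summary)
```

Note that the mean and the standard deviation never appear in the computation, which is why a boxplot can tell a different story than statistics based on means and variances.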

There is no perfect normality or symmetry in the previous boxplot, but it can be much worse. Look at
the anxiety boxplot.

- Very positively skewed distribution.
- Most people are low on anxiety: at least 25% have the lowest possible score
(→ 25th percentile = lowest score → no “stick” under the box).
- A lot of outliers and extreme scores.

No lower stick means that at least 25% of the scores are exactly equal to the lowest score.
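The missing lower stick can be checked numerically: when the 25th percentile equals the lowest score, there is nothing left below the box to draw. The sketch below is a minimal illustration with invented anxiety scores; `has_lower_stick` is a hypothetical helper, and the `inclusive` quartile method of `statistics.quantiles` is an assumption that other software may not share.

```python
import statistics

# Hypothetical anxiety scores (invented): half the sample has the lowest score.
anxiety = [0, 0, 0, 0, 0, 1, 1, 2, 3, 7]
# Hypothetical evenly spread scores for comparison.
spread = [1, 2, 3, 4, 5, 6, 7, 8, 9]


def has_lower_stick(values):
    """True if at least one score lies below the 25th percentile."""
    q1, _, _ = statistics.quantiles(values, n=4, method="inclusive")
    return min(values) < q1


def share_at_minimum(values):
    """Proportion of scores that are exactly equal to the lowest score."""
    return values.count(min(values)) / len(values)


print(has_lower_stick(anxiety), share_at_minimum(anxiety))
print(has_lower_stick(spread), share_at_minimum(spread))
```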

Use boxplots to compare different variables, or to compare different groups on the same variable (here:
occupation).

Boxplots for different variables are only useful when the variables have comparable measuring
scales. DON’T DO THIS! These boxplots are very ugly together because the variables have
different scales (0-3 and 0-15).

Skewness and kurtosis

Skewness: measure of asymmetry of the distribution.
- perfect symmetry → skewness = 0;
- long tail of distribution to the right → skewness > 0;
- long tail of distribution to the left → skewness < 0.

A normal distribution always has skewness 0, but skewness 0 does not by itself
imply a normal distribution.
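The sign rules for skewness can be verified on small examples. The following is a minimal sketch, assuming the simple moment-based definition (third central moment divided by the cubed standard deviation, both computed with n in the denominator). Statistical packages often report a bias-adjusted variant, so their values can differ slightly. All data are invented for illustration.

```python
import statistics


def skewness(values):
    """Moment-based skewness: third central moment / (second central moment)^1.5."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


symmetric = [1, 2, 3, 4, 5]        # perfectly symmetric
right_tail = [1, 1, 1, 2, 10]      # long tail to the right
left_tail = [1, 9, 10, 10, 10]     # long tail to the left

print(skewness(symmetric))   # zero
print(skewness(right_tail))  # positive
print(skewness(left_tail))   # negative
```

The symmetric example also illustrates the last remark: its skewness is 0, yet five equally frequent scores form a flat distribution, not a normal one.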

Kurtosis: measure of the “peakedness” of a distribution (actually whether a distribution is more or
less “peaked” than you would expect on the basis of the standard deviation and the normality
assumption).
- Perfectly normal distribution → kurtosis = 0 (but kurtosis = 0 does not necessarily
imply a normal distribution).
- Peak higher than normal → kurtosis > 0 (red: more scores in the middle and in the
tails).
- Peak lower than normal (i.e. distribution too flat) → kurtosis < 0 (green: more
scores between the middle and the tails).

Attention! Positive kurtosis does not only mean a higher peak! It also means more scores
in the tails. A higher peak on its own does not have to mean positive kurtosis; it could also
mean a low standard deviation.
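The warning about peaks and standard deviations can be made concrete. The sketch below is a minimal illustration, assuming the moment-based excess kurtosis (fourth central moment divided by the squared variance, minus 3, with n in the denominators). Statistical packages often report a bias-adjusted variant. The data are invented, and `excess_kurtosis` is a hypothetical helper name.

```python
def excess_kurtosis(values):
    """Moment-based kurtosis minus 3, so that a normal distribution scores 0."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / m2 ** 2 - 3


# Many scores in the middle AND a few far out in the tails.
heavy_tails = [-10, -1, 0, 0, 0, 0, 0, 0, 1, 10]
# Same shape squeezed together: smaller standard deviation, narrower peak.
squeezed = [v * 0.1 for v in heavy_tails]
# Flat distribution: every score equally frequent.
flat = [1, 2, 3, 4, 5, 6]

print(excess_kurtosis(heavy_tails))  # positive
print(excess_kurtosis(squeezed))     # same value as heavy_tails
print(excess_kurtosis(flat))         # negative
```

Squeezing the scores together makes the histogram look more peaked on the same axis, but the kurtosis stays the same, because kurtosis is measured relative to the standard deviation. Only a change in shape (more scores in the middle and the tails) raises it.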
