Notes and Study Guide for Statistics for Premasters DSS


A detailed, well-structured summary covering all the course materials: class slides, final exam example questions, and quiz questions posted by the professors, with clear charts and a neat layout. This is for the course "Statistics for Premasters DSS" at Tilburg University which is part o...


  • 5 December 2024
  • 19 December 2024
  • 22
  • 2024/2025
  • Lecture notes
  • Eriko Fukuda, Sasha Kenjeeva
  • All lectures
Statistics Pre DSS (24fall)



Notes & Study Guide




Green Text: code in R

Red Text: differences worth noting

With sign: very important key points
(referring to the given quiz & exam sample questions)

To pass or get a good score in the final exam, it is strongly
recommended to engage thoroughly with the material
and gain a deep understanding of the concepts and
terms, rather than simply memorizing key points.

For any questions, please email: aliceouterspace@gmail.com

Version: 202412190011
By: Alice

Content
Research Methods/Terms
Different Plot in R Part 1
Different Plot in R Part 2
Measures of Data
Population and Sample
Hypothesis Testing
Z Test / P value / Confidence Intervals
Categorical Variable and Pearson Chi-squared test
Continuous Variable and T test
Paired T test and One-Sided T-Test
One-Way ANOVA
F distribution & run ANOVA in R
Effect Size & Further test of One-Way ANOVA
Assumptions of One-Way ANOVA & Factorial ANOVA
Two-Way / Factorial ANOVA
Two-Way / Factorial ANOVA in R and Effect Size
Appendix 1: Basics of R
Appendix 2: Basic Operation of R
Appendix 3: Data Graphing in R
Appendix 4: Tests Function in R

Research Methods/Terms

Types of research
Correlational: observing what naturally goes on in the world without directly interfering with it
Cross-sectional: data from people of different ages
  => (quasi-experimental, case study, naturalistic observation)
Experimental: one or more variables are systematically manipulated to see their effect
  => (cause-and-effect statement, random sampling)

Types of reliability (the ability of a measure to produce the same results under the same conditions)
Test-retest: same entities + two different points in time = consistent result
Inter-rater: across people = same answer
Parallel forms: different measures for the same thing; the result should be the same
  (e.g. four different bathroom scales to measure participants' weight)
Internal consistency: whether the items of a measure consistently measure the same thing

Types of validity
Internal: the extent to which correct conclusions can be drawn about the causal relationship between variables
  (In an experiment testing a new drug, internal validity ensures that changes in
  health outcomes are due to the drug and not other factors like diet or exercise.)
External: the extent to which the same pattern holds in real life
Construct: whether you are actually measuring what you want to measure
Face: whether a measure "looks like" it is doing what it is supposed to do
  (a math exam with questions about arithmetic has high face validity; if it
  has history-related questions, it has low face validity)
Ecological: whether the set-up of the study matches a real-world scenario; it often comes with practical,
  actionable insight outside the research setting
  (a memory study in a quiet, controlled lab will lack ecological validity)

Confounds: unmeasured variables related to the variables of interest; they threaten internal validity
Artefacts: threaten the external validity or construct validity of results
  => (movement noise in an EEG signal)

Dependent variable (DV): the variable "to be explained" / outcome
  (in a study testing the effect of sunlight on plant growth: the plant growth, measured
  in height, number of leaves, or weight)
Independent variable (IV): the variable "to do the explaining" / predictor
  (in the same study: the amount of sunlight)
  see the Two-Way ANOVA section





Different Plot in R Part 1

What are the different types of plots?

Histograms
- identify the shape of the distribution
- show skew and kurtosis
(e.g. visualize the shape of the distribution of weight for people in a weight loss program)

Scatter & Line
- display the relationship between two variables
(e.g. visualize the relation between amount of sleep and the level of grumpiness)





Different Plot in R Part 2

Box
- depicts the median, IQR and range
- used to detect outliers

Bar
- shows the mean score
- the error bar displays one of the following:
  1. confidence interval (usually 95%)
  2. standard deviation
  3. standard error of the mean
- used to compare discrete categories, so especially suited to categorical (ordinal/nominal/binary) data
(e.g. visualize the relative frequency of various ethnicities represented at an IT company)
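
The three error-bar options listed for bar plots can be computed by hand. The guide's own code is in R; this is only a small Python sketch using the standard library. The function name `error_bar_options` and the scores are made up for illustration, and the 95% interval uses the normal approximation (about 1.96 standard errors), which is an assumption; a t-based interval would be wider for small samples.

```python
import math
import statistics

def error_bar_options(scores):
    """Return the mean plus three possible error-bar half-lengths."""
    n = len(scores)
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)               # sample SD (divides by n - 1)
    se = sd / math.sqrt(n)                      # standard error of the mean
    z = statistics.NormalDist().inv_cdf(0.975)  # about 1.96 (normal approximation)
    return {"mean": mean, "sd": sd, "se": se, "ci95": z * se}

# Hypothetical scores for two groups (made-up numbers)
groups = {"control": [4, 5, 6, 5, 5], "treatment": [6, 8, 7, 9, 10]}
summary = {name: error_bar_options(scores) for name, scores in groups.items()}
for name, stats in summary.items():
    print(name, stats)
```

Whichever option is plotted, the figure caption should say which one it is, because an SD bar is always longer than the SE bar for the same data.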





Measures of Data

mean: centre of gravity of the data
  (for interval and ratio scale data, but sensitive to extreme values)

median: middle value, at position $(n + 1)/2$ in the sorted data
  (for ordinal, interval and ratio scale data; less affected by outliers)

mode: most frequent value (for nominal scale data)
range: $\max - \min$
percentiles: Q2 = 50% = median (Q1 = 25% | Q3 = 75%)
Interquartile range (IQR): $IQR = Q_3 - Q_1$
  => excludes extreme values/outliers (resistant to outliers)
How to identify outliers? Values $< Q_1 - 1.5 \times IQR$ or $> Q_3 + 1.5 \times IQR$
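
The outlier rule above can be tried on a small data set. This is a Python sketch with the standard library only (the guide itself uses R); the function name `iqr_outliers` and the weights are made up. Quartile conventions differ between tools, so the choice of `method="inclusive"` is an assumption, and other conventions can give slightly different fences.

```python
import statistics

def iqr_outliers(values):
    """Return Q1, median, Q3, IQR and the values outside the 1.5 * IQR fences."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower = q1 - 1.5 * iqr   # lower fence
    upper = q3 + 1.5 * iqr   # upper fence
    outliers = [v for v in values if v < lower or v > upper]
    return q1, q2, q3, iqr, outliers

# Hypothetical body weights in kg (made-up numbers)
weights = [61, 64, 66, 68, 70, 71, 73, 75, 120]
q1, median, q3, iqr, outliers = iqr_outliers(weights)
print(q1, median, q3, iqr, outliers)
```

For these numbers the fences are 55.5 and 83.5, so only 120 is flagged.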
Skew (-1, 1)
  left/negative-skewed: mean < median
  right/positive-skewed: mean > median
  (named after the direction of the tail: negative numbers are located to
  the left of zero on the number line, and positive numbers to the right)

Kurtosis (-2, 2)
  < 0: too flat (platykurtic) => fewer extreme values, lighter tails
  = 0: normal distribution (mesokurtic)
  > 0: too pointy (leptokurtic) => more extreme values, heavier tails

Deviation: $deviation = X_i - \bar{X}$

Sum of squared errors (SS): $SS = \sum_{i=1}^{N} (X_i - \bar{X})^2$

Variance ($s^2$): $s^2 = \frac{SS}{N - 1}$
  (dividing by $N$ instead of $N - 1$ gives a biased estimate that tends to underestimate the true variance)

Standard deviation ($s$ or sd): $s = \sqrt{s^2}$
  how well the mean represents the data => large sd: data more spread out; small sd: data closer to the mean

Standard error: $SE = \frac{s}{\sqrt{N}}$
  => quantifies how reliable the sample mean is as an estimate
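
The chain from deviations to the standard error can be followed step by step on a tiny data set. This is a Python sketch with made-up scores (the guide's own code is in R); each line corresponds to one formula in this section.

```python
import math

scores = [2, 4, 4, 5, 10]                 # hypothetical scores (made-up numbers)
n = len(scores)
mean = sum(scores) / n                    # sample mean
deviations = [x - mean for x in scores]   # X_i minus the mean; these sum to zero
ss = sum(d ** 2 for d in deviations)      # sum of squared errors (SS)
variance = ss / (n - 1)                   # s^2 = SS / (N - 1)
sd = math.sqrt(variance)                  # s
se = sd / math.sqrt(n)                    # standard error = s / sqrt(N)
print(mean, ss, variance, sd, round(se, 3))
```

Here the mean is 5, SS is 36, the variance is 9 and the standard deviation is 3, so the standard error is 3 divided by the square root of 5.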


The purpose of descriptive statistics is to characterize the data we collected without attempting to understand a population.

Report descriptives
mean (M), SD, sample size, distribution characteristics (skewness, kurtosis and SE)

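Population and Sample
Key Idea: there is always a discrepancy between the sample mean and the population mean
  => a test based on a sample is not always reliable and may lead to a wrong conclusion

Statistical Model: $outcome_i = (model) + error_i$, where the model can be as simple as the mean $\bar{X}$

Population (almost never known)
µ: true population mean
σ: population standard deviation
population variance: $\sigma^2 = \frac{\sum (X_i - \mu)^2}{N}$
unbiased estimate of variance: $\hat{\sigma}^2 = \frac{\sum (X_i - \bar{X})^2}{N - 1}$

Sample
n: sample size
$\bar{x}$: mean of sample
s: standard deviation of sample
sample variance: $s^2 = \frac{\sum (x_i - \bar{x})^2}{n}$
unbiased sample variance: $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}{n - 1}$

R provides estimates of the population parameters, not the sample statistics (for example, var() and sd() divide by n - 1).
$\hat{\mu}$: estimate of the population mean = sample mean $\bar{x}$; $\mu_0$: hypothesized population mean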
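
The difference between dividing by n and by n - 1 is easy to see numerically. This is a Python sketch with made-up data (the guide's own code is in R). `statistics.pvariance` divides by n and `statistics.variance` divides by n - 1; the last line checks the computational shortcut that uses the sum of squares and the squared sum.

```python
import statistics

sample = [3, 5, 7, 9]                               # hypothetical sample (made-up numbers)
n = len(sample)

divide_by_n = statistics.pvariance(sample)          # sum of squares / n
divide_by_n_minus_1 = statistics.variance(sample)   # sum of squares / (n - 1)

# Computational shortcut: (sum of x^2 - (sum of x)^2 / n) / (n - 1)
shortcut = (sum(x * x for x in sample) - sum(sample) ** 2 / n) / (n - 1)

print(divide_by_n, divide_by_n_minus_1, shortcut)
```

The n - 1 version is always the larger of the two, which is what corrects the tendency of the n version to underestimate the population variance.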

Central Limit Theorem
1. the mean of the sampling distribution of $\bar{x}$ equals the mean of the population (µ)
2. the standard error ($SE_{\bar{x}}$, the variability of the sampling distribution) gets smaller as the sample size (N) increases
3. the shape of the sampling distribution becomes normal as the sample size increases
=> larger samples are more reliable
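
A small simulation can make the first two points concrete. This is a Python sketch using only the standard library (the guide's own code is in R). The skewed "population", the seed, the sample sizes and the helper name `sampling_distribution` are all arbitrary choices for illustration.

```python
import random
import statistics

random.seed(1)

# Hypothetical skewed population: exponential values with a mean of about 10
population = [random.expovariate(1 / 10) for _ in range(50_000)]
mu = statistics.mean(population)
sigma = statistics.pstdev(population)

def sampling_distribution(sample_size, n_samples=1000):
    """Draw many samples and return the list of their means."""
    return [
        statistics.mean(random.sample(population, sample_size))
        for _ in range(n_samples)
    ]

results = {}
for size in (5, 30, 100):
    means = sampling_distribution(size)
    results[size] = means
    # columns: sample size, mean of sample means, observed SE, sigma / sqrt(N)
    print(size,
          round(statistics.mean(means), 2),
          round(statistics.stdev(means), 2),
          round(sigma / size ** 0.5, 2))
```

The mean of the sample means should stay close to `mu` for every sample size, while the observed standard error should shrink roughly in line with sigma divided by the square root of N.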





Hypothesis Testing

What is the goal of hypothesis testing? To rule out chance (sampling error) as a plausible explanation for the result.

What are the steps?
1. State the hypotheses (MUST be done before the experiment)
   Null hypothesis H0: a claim of no difference in the population (or that an effect is zero)
   Alternative hypothesis Ha: the actual research aim: H0 is false
   1.1 Select an α level (usually 0.05)
       1. the "cut-off" for the decision on the null
       2. the probability of a Type I error we are willing to accept
       3. how certain we want to be when rejecting a hypothesis
       also the threshold used for significance

2. Locate the critical region
   1. outcomes that are very unlikely to occur if the null hypothesis is true
   2. sample means that are not likely to occur if the variable actually has no effect

3. Compute the test statistic
   a ratio: compare the obtained difference between the sample mean and the
   hypothesized population mean with the amount of difference we would
   expect without any treatment effect (the standard error)

   $\text{test statistic} = \frac{\text{estimate} - \text{value we hypothesize}}{\text{standard error}}$

4. Decide whether to reject the null hypothesis
   if the test statistic is large => the obtained mean difference is more than expected
   if it is large enough to fall in the critical region => the difference is significant => reject the null
   if the test statistic is relatively small => the difference is not sufficient => fail to reject the null


