CM1005 INTRODUCTION TO STATISTICAL ANALYSIS SUMMARY

  • 19 October 2023
  • 64 pages
  • 2020/2021
  • Summary
Seller: isjaterng
INTRODUCTION TO STATISTICAL ANALYSIS

LECTURE WEEK 1
Statistics
Univariate – one variable → What was the average grade of the ISA last year? (grade)
Bivariate – two variables → Did males and females differ in their grades?
Multivariate – more than 2 variables → Was the grade dependent on initial motivation, the time spent on reading and gender?

Statistics – the study of how we describe and make inferences from data

- An inference is “a conclusion reached on the basis of evidence and reasoning” → make a statement, from the data
- Descriptive statistics – taking direct measurements of your data, describing your data
- Inferential statistics – applying special techniques to the sample in order to make statements/conclusions about the entire population → take a sample (a smaller bit of data from your overall population) and make a larger statement (data: today and yesterday were cold; inference: tomorrow and the day after are going to be cold too)
- Distinction between descriptive & inferential
→ you have to give a descriptive answer and an inferential answer on the exam

Population = N
Sample = n

Units of analysis – the what or who that is being studied (one thing)
- the unit that you will be able to draw conclusions about
- typically, all units are the same type of “thing” on a single data set
- e.g. individuals, families, countries, corporations
- every row can be a different person/country
→ rows

Variable – a measured property of each of the units of analysis
- directly what your data will be
- e.g. age/household income
→ columns

“what is my unit of analysis and what kind of variables would be applicable for this kind of
analysis?”

Recap: level of measurement (ISSR)
Nominal level of measurement:
- group classifications
- no meaningful ranking possible (i.e. 3 is not ‘more’ than 2)
- Numerical coding is arbitrary (but necessary in SPSS)

e.g. what religion do you belong to:
- Atheist (1)
- Protestant (2)
- Catholic (3)
- Muslim (4)
- other (5)

Name Religion
Claire 1
Sebastian 2
Geert 3


Ordinal level of measurement (order)
- Meaningful ranking/ordering (3 is ‘more’ than 2)
- BUT: distance between categories unknown/not equal (e.g. difference between 1 and
2 not equal to difference between 2 and 3)
→ “a horserace without a stopwatch” → you know the winner, but you don’t know the exact time difference

E.g. during an average week, how often do you watch television?
- never (1)
- once a week (2)
- a few days a week (3)
- daily (4)

Name TV
Claire 2
Sebastian 1
Geert 3

Interval level of measurement
- meaningful ranking (17 is more than 16)
- Distances are equal (e.g. difference between 15 and 17 is equal to difference
between 20 and 22)

e.g. temperature in degrees Celsius

Day Degrees Celsius (at noon)
Monday 17
Tuesday 16
Wednesday 14

Ratio level of measurement
- All properties of interval (ranking & equal distances)
- Absolute & meaningful zero point

e.g. a person’s age
Name Age
Claire 28
Sebastian 22
Geert 34

Qualitative – nominal and ordinal
Quantitative – interval and ratio
→ we always first need to know the level of measurement in order to know which statistical techniques we may use for the given variable(s)
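
To make the link between rows, columns and levels of measurement concrete, here is a minimal sketch in Python. The values are copied from the three example tables above (religion code, TV code, age); the names `rows`, `LEVELS` and `kind` are made up for this illustration and are not part of the course material.

```python
# Each dict is one row = one unit of analysis (a person).
# Each key is one column = one variable.
rows = [
    {"name": "Claire", "religion": 1, "tv": 2, "age": 28},
    {"name": "Sebastian", "religion": 2, "tv": 1, "age": 22},
    {"name": "Geert", "religion": 3, "tv": 3, "age": 34},
]

# Level of measurement of each variable, as classified in the notes.
LEVELS = {"religion": "nominal", "tv": "ordinal", "age": "ratio"}


def kind(level):
    """Hypothetical helper: nominal/ordinal -> qualitative, interval/ratio -> quantitative."""
    return "qualitative" if level in ("nominal", "ordinal") else "quantitative"


for variable, level in LEVELS.items():
    values = [row[variable] for row in rows]  # one column of the data set
    print(variable, level, kind(level), values)
```

The religion codes are only labels: swapping the codes 1 and 2 would change nothing about the data, whereas the ages are real quantities.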

Continuous variable – measured along a continuum → the number can have a decimal point (5.457) → e.g. a person’s height

Discrete variable – measured in whole units or categories → whole numbers, no decimal point → e.g. a person’s number of children

Measures of central tendency – to describe (univariately, using one measure) the distribution (bar chart in which the differences are visible) of variables at several different levels of measurement (not all of them)

MEASURE OF CENTRAL TENDENCY: THE MEAN (INTERVAL/RATIO)

Measuring trust in the news media (on an 11-point scale, 0 = no trust, 10 = complete trust)
10 respondents in our sample (n = 10)
→ what is the average (mean) trust in the news media in this sample?

Mean = M → formula for the mean: $M = \frac{\sum x}{n}$
Σ = sigma (sum)
x = variable (the different values a variable can have)
Σx = the sum of all values of the variable
Thus, for the values 2, 3 and 5: $M = \frac{2 + 3 + 5}{3} = 3\tfrac{1}{3}$ → all values are added up and divided by n, i.e. the number of observations in the sample
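
The same calculation can be sketched in Python. This is only an illustration of the formula for the mean, using the three values 2, 3 and 5 from the notes; the function name `mean` is chosen here and `Fraction` is used so that the result stays exact.

```python
from fractions import Fraction


def mean(values):
    """Mean M = (sum of all x) / n, returned as an exact fraction."""
    return Fraction(sum(values), len(values))


values = [2, 3, 5]  # the three values used in the notes
M = mean(values)

print(M)         # exact value: 10/3, i.e. 3 1/3
print(float(M))  # decimal approximation
```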

Characteristics of the mean:
- Changing any score will change the mean
- Adding or removing a score will change the mean (unless that score is already equal
to the mean)
- Adding, subtracting, multiplying, dividing each score by a given value causes the
mean to change accordingly
- Sum of differences from the mean is zero → $\sum (x - M) = 0$ → subtract the mean from all values and then add them up = 0
- Sum of squared differences from the mean is minimal → $\sum (x - M)^2 = \text{minimum}$
→ subtract the mean from each value, square the answers and add them up
→ this value is also called the sum of squares (SS)
→ a larger SS means that scores deviate more from the mean
WHY ‘MINIMAL’? – if we had used any other value than the mean to calculate the SS, it would have been larger than 42, the SS of the example; the SS around the mean is the minimum → the lowest possible value
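
These characteristics can be checked numerically. The sketch below uses four hypothetical scores (not the lecture data, which are not reproduced in these notes) and a helper `sum_of_squares` that is named here only for illustration.

```python
scores = [1, 3, 5, 7]  # hypothetical scores, NOT the lecture data
n = len(scores)
M = sum(scores) / n    # mean of the hypothetical scores


def sum_of_squares(values, center):
    """Sum of squared differences between each value and `center`."""
    return sum((x - center) ** 2 for x in values)


# Characteristic: the differences from the mean add up to zero.
deviation_sum = sum(x - M for x in scores)

# Characteristic: the SS around the mean is smaller than around any other value.
ss_mean = sum_of_squares(scores, M)
ss_other = {c: sum_of_squares(scores, c) for c in (3, 3.5, 4.5, 5)}

# Characteristic: adding 10 to each score (or doubling it) changes the mean accordingly.
mean_shifted = sum(x + 10 for x in scores) / n
mean_doubled = sum(2 * x for x in scores) / n

print(deviation_sum, ss_mean, ss_other, mean_shifted, mean_doubled)
```

Trying other candidate centres in `ss_other` always gives a larger value than `ss_mean`, which is what ‘minimal’ means here.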



Population mean: $\mu = \frac{\sum x}{N}$


MEASURE OF CENTRAL TENDENCY: THE MEDIAN (ORDINAL & INTERVAL/RATIO)
To find the median:
1. Sort all cases based on their value on x
2. The value of the “middle case” equals the median (equal number of cases below and above)

- Whenever n is an even number, the median is the mean value of the two middle cases (e.g. if the two middle values are 3 and 4, the median is (3 + 4) / 2 = 7 / 2 = 3.5)

Note that the median is not as sensitive to outliers as the mean

Outlier – a value that is far away from the rest of our distribution, a lot higher or lower → it sticks out
- Can ruin the interpretive value of the mean
- Use the median → the median is more useful here (not sensitive to outliers)
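
A small sketch of the median rules and of the effect of an outlier, using Python's standard `statistics` module. All three samples are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import statistics

odd_sample = [1, 2, 3, 4, 5]      # hypothetical scores, n is odd
even_sample = [1, 2, 3, 4, 5, 6]  # hypothetical scores, n is even
with_outlier = [1, 2, 3, 4, 100]  # odd_sample with the highest score replaced by an outlier

# Odd n: the value of the middle case.
median_odd = statistics.median(odd_sample)

# Even n: the mean of the two middle cases, here (3 + 4) / 2.
median_even = statistics.median(even_sample)

# The outlier pulls the mean far upwards, while the median stays where it was.
mean_without = statistics.mean(odd_sample)
mean_with = statistics.mean(with_outlier)
median_with = statistics.median(with_outlier)

print(median_odd, median_even)
print(mean_without, mean_with, median_with)
```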

Frequency – how many of each thing

To determine the median from a frequency table, we need to identify the first category that exceeds 50% in the ‘cumulative percent’ column (ORDINAL)

Percent = frequency : total × 100
Cumulative percent = the sum of the percentages of a category and of all categories before it
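
The frequency-table rule can be sketched as follows. The categories come from the TV example in these notes, but the frequencies are hypothetical, and the function `median_category` is a made-up name for this illustration. It follows the rule exactly as stated above (first category that exceeds 50%), so a cumulative percent of exactly 50% is not treated as a special case.

```python
# Ordered categories from the TV example; the frequencies are hypothetical.
frequency_table = [
    ("never", 2),
    ("once a week", 5),
    ("a few days a week", 8),
    ("daily", 5),
]


def cumulative_percents(table):
    """Return (category, cumulative percent) pairs for an ordered frequency table."""
    total = sum(count for _, count in table)
    running = 0
    result = []
    for category, count in table:
        running += count  # cases in this category and all categories before it
        result.append((category, 100 * running / total))
    return result


def median_category(table):
    """First category whose cumulative percent exceeds 50%."""
    for category, cumulative in cumulative_percents(table):
        if cumulative > 50:
            return category
    raise ValueError("empty frequency table")


print(cumulative_percents(frequency_table))
print(median_category(frequency_table))
```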

MEASURE OF CENTRAL TENDENCY: THE MODE (NOMINAL, ORDINAL, INTERVAL/RATIO)
Mode – the category with the largest number of cases → the one with the highest frequency; look for the highest percentage in the frequency table
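
Because the mode only counts cases, it also works for nominal variables. A minimal sketch with hypothetical answers to the religion question (the category labels are from the notes, the answers themselves are invented):

```python
from collections import Counter
import statistics

# Hypothetical answers to "what religion do you belong to?"
answers = ["Catholic", "Atheist", "Catholic", "Muslim",
           "Protestant", "Catholic", "Atheist"]

counts = Counter(answers)              # frequency of each category
mode_value = statistics.mode(answers)  # category with the highest frequency

print(counts.most_common())
print(mode_value)
```

Taking a mean or a median of these answers would be meaningless, since the categories have no order and no distances.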

Distribution – shows you the counts of units of analysis within each of the values that they
can take on


Normal distribution:
- Symmetric  looks the same on both sides
- Mean, median and mode are the same value




Skewed distribution
- Left: the mean is shifted to the left in a negatively skewed distribution
- Right: the mean is shifted to the right in a positively skewed distribution
→ look at the way it stretches out (left/right)
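
The shift of the mean can be seen by comparing it with the median. The sketch below uses hypothetical scores: one sample with a long tail to the right and its mirror image with a long tail to the left. Comparing mean and median is used here as a rule of thumb for the direction of the skew, not as a formal skewness statistic.

```python
import statistics

# Hypothetical scores with a long tail to the right (one very high value).
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]

# Mirror image of the same scores: long tail to the left (one very low value).
left_skewed = [21 - x for x in right_skewed]

for label, data in (("right", right_skewed), ("left", left_skewed)):
    mean = statistics.mean(data)
    median = statistics.median(data)
    # Positive skew: mean pulled above the median; negative skew: pulled below it.
    direction = "positive" if mean > median else "negative"
    print(label, mean, median, direction)
```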
