Summary: Statistics II for IB (International Business)


  • 21 November 2022
  • 54 pages
  • 2022/2023
  • Summary
minjonhoogvliets
Statistics II - IB
Lecture 1 – Introduction and hypothesis testing
Objectives of the course:
- Knowledge about multivariate statistics: hypothesis testing,
multivariate regression analysis, analysis of variance, time-series
analysis
- Skills for performing multivariate statistical analysis: use of SPSS

Introduction to multivariate statistical analysis
Types of data
Nonmetric or qualitative data
- Presence of a feature, male/female, vegetarian yes/no?

Metric or quantitative
- Quantifying an attribute, how tall is the individual/how satisfied?




Measurement scales
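- Nominal scale: numbers used as labels, without any order: male/female
- Ordinal scale: ranking, the order is meaningful but the distances are not
- Interval scale: equal distances, but no true zero reference point: Celsius
- Ratio scale: equal distances and a true zero reference point: height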

Missing value analysis
What are missing data? For an individual we have only partial information
(we know the values of only some of its characteristics).
The goal of the analysis is to identify the true patterns and relationships
among variables even when some data are missing. Impact:
- Reduces the sample size
- Can distort results: is the data deficiency systematic or random?

Types of missing data:
Missing completely at random (MCAR): for any respondent, the probability
that the value of a variable is missing does not depend on any variable,
observed or unobserved. The missingness is unsystematic.

Missing at random (MAR): for any respondent, the probability that the value
of a variable is missing depends on other, observed variables, but not on
the missing value itself. E.g., the probability of missing data is related
to age.

https://iriseekhout.shinyapps.io/MissingMechanisms/
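The difference between the two mechanisms can be made visible with a small simulation. The sketch below is only an illustration: the variables (age, income), the cut-off at age 50 and the missingness probabilities are all invented for the example.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical respondents: age is always observed, income may go missing.
ages = [random.randint(18, 80) for _ in range(10_000)]

def mcar_missing(age):
    # MCAR: the same 20% chance of a missing value for everyone.
    return random.random() < 0.20

def mar_missing(age):
    # MAR: the chance depends on an observed variable (age), here 5% vs 35%.
    return random.random() < (0.35 if age >= 50 else 0.05)

def missing_rate_by_group(mechanism):
    # Share of missing income values for younger and older respondents.
    young = [mechanism(a) for a in ages if a < 50]
    old = [mechanism(a) for a in ages if a >= 50]
    return sum(young) / len(young), sum(old) / len(old)

mcar_young, mcar_old = missing_rate_by_group(mcar_missing)
mar_young, mar_old = missing_rate_by_group(mar_missing)
print(f"MCAR: young {mcar_young:.2f}, old {mcar_old:.2f}")
print(f"MAR:  young {mar_young:.2f}, old {mar_old:.2f}")
```

Under MCAR the two age groups show roughly the same missing rate. Under MAR the rates differ clearly between the groups, which is the kind of pattern to look for in real data.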

How to analyse the missing values?
- Check in each variable the percentage of missing values and the
number of extremes and outliers.
- Check in each observation the percentage of missing values and
how often it is an extreme or outlier (also, to what extent)
- Check how often the missing patterns occur: frequent patterns might
indicate causality. Which cases present these missing patterns?
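The missing-value checks listed above can be sketched in a few lines of plain Python. The data set and its variable names are hypothetical, `None` stands for a missing value, and the check for extremes and outliers is left out. In the course itself this analysis would be done in SPSS.

```python
from collections import Counter

# Hypothetical data set: None marks a missing value.
rows = [
    {"age": 25, "income": 3000, "satisfaction": 4},
    {"age": 41, "income": None, "satisfaction": 5},
    {"age": None, "income": None, "satisfaction": 3},
    {"age": 33, "income": 2800, "satisfaction": None},
    {"age": 58, "income": None, "satisfaction": 2},
]
variables = ["age", "income", "satisfaction"]

# Percentage of missing values per variable.
pct_missing_per_variable = {
    v: 100 * sum(r[v] is None for r in rows) / len(rows) for v in variables
}

# Percentage of missing values per observation.
pct_missing_per_row = [
    100 * sum(r[v] is None for v in variables) / len(variables) for r in rows
]

# Missing patterns: which combination of variables is missing, and how often.
patterns = Counter(
    tuple(v for v in variables if r[v] is None) for r in rows
)

print(pct_missing_per_variable)
print(patterns.most_common())
```

A pattern that occurs often, such as income missing on its own, is a reason to look at which cases show it.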

How to handle the missing values?
- Ignore: if fewer than 10% of the cases/variables are affected
- Deletion: pairwise or listwise
- Imputation: mean, hot deck imputation, cold deck imputation

Deletion
Listwise: delete the entire observation. The advantage is that the remaining
dataset is complete. The disadvantages are the reduced sample size due to
the loss of the incomplete cases, and a biased dataset if the data are not
MCAR.
Pairwise: delete incomplete cases on an analysis-by-analysis basis (leave
them out of the calculation only). The sample size remains the same for
some analyses and is reduced for others. The disadvantage is the
inconsistency of the sample size across analyses.
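A minimal sketch of the two deletion strategies, using a hypothetical four-row data set in which `None` marks a missing value:

```python
from statistics import mean

# Hypothetical data: None marks a missing value.
rows = [
    {"age": 25, "income": 3000},
    {"age": 41, "income": None},
    {"age": None, "income": 2600},
    {"age": 33, "income": 2800},
]

# Listwise: keep only fully observed rows, then run every analysis on them.
complete = [r for r in rows if None not in r.values()]
listwise_mean_age = mean(r["age"] for r in complete)        # uses 2 rows
listwise_mean_income = mean(r["income"] for r in complete)  # uses 2 rows

# Pairwise: each analysis uses every row that has the values it needs.
ages = [r["age"] for r in rows if r["age"] is not None]
incomes = [r["income"] for r in rows if r["income"] is not None]
pairwise_mean_age = mean(ages)        # uses 3 rows
pairwise_mean_income = mean(incomes)  # uses 3 rows

print(len(complete), len(ages), len(incomes))
```

Listwise deletion leaves two complete rows for every analysis. Pairwise deletion uses three rows for each mean here, but an analysis that needs both variables at once would again have only two rows.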

Imputation
Mean imputation: replace the missing value with the mean of the entire
data set or of a group. This reduces the variability of the variable.
Hot deck imputation: use an observation from the sample that is
considered similar.
Cold deck imputation: use an observation from an external data source
that is considered similar.
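The sketch below illustrates mean imputation and hot deck imputation on hypothetical data. The rule for "similar" (the donor closest in age) is an assumption made for the example and does not come from the lecture. Cold deck imputation would work the same way with donors from an external data source.

```python
from statistics import mean, stdev

# Hypothetical respondents; None marks a missing income.
rows = [
    {"age": 25, "income": 2000},
    {"age": 30, "income": 2400},
    {"age": 45, "income": None},
    {"age": 50, "income": 4000},
    {"age": 60, "income": 3600},
]
observed = [r["income"] for r in rows if r["income"] is not None]

# Mean imputation: fill every gap with the mean of the observed values.
mean_imputed = [
    r["income"] if r["income"] is not None else mean(observed) for r in rows
]

def hot_deck(target):
    # Hot deck: borrow the income of the most similar donor in the same
    # sample. "Similar" is an assumption here: the donor closest in age.
    donors = [r for r in rows if r["income"] is not None]
    donor = min(donors, key=lambda r: abs(r["age"] - target["age"]))
    return donor["income"]

hot_deck_imputed = [
    r["income"] if r["income"] is not None else hot_deck(r) for r in rows
]

# The standard deviation shrinks after mean imputation.
print(stdev(observed), stdev(mean_imputed))
```

The imputed mean sits exactly in the centre of the distribution, so it adds an observation without adding any spread. That is why the standard deviation drops.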

Rules of thumb to handle missing data:
< 10%: ignore or use any imputation method
10-20%: hot deck imputation (assuming MCAR)
>20%: delete
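The rules of thumb can be written as a small decision function. The function name is hypothetical, and treating exactly 10% and exactly 20% as part of the middle band is one reading of the ranges above.

```python
def missing_data_rule(pct_missing):
    """Return the rule-of-thumb action for a percentage of missing values."""
    if pct_missing < 10:
        return "ignore or use any imputation method"
    if pct_missing <= 20:
        return "hot deck imputation (assuming MCAR)"
    return "delete"

for pct in (4, 15, 35):
    print(pct, "->", missing_data_rule(pct))
```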

Examining data
Why should we examine data carefully? To avoid jumping to wrong
conclusions. We need to understand the type of data to answer the
following questions:
- What are the characteristics of the data?
- Do all the data show a common behaviour?
- Are there any missing data?
- Are there any outliers?
- Which analysis methods can we use?

We should detect the major features of the probability distribution of the
variables. But first of all: identify the type of data.

Examining qualitative data
What it makes sense to calculate: frequency table and mode; minimum,
maximum and range only if the categories are ordered (ordinal data).
Which graphical techniques can be applied: bar chart, pie chart.

Examining quantitative data
What it makes sense to calculate: mean, mode, median, range,
interquartile range, standard deviation, variance, skewness, kurtosis.
Which graphical techniques can be applied: scatterplot, histogram, boxplot.
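As an illustration, the measures listed for quantitative data can be computed with Python's standard `statistics` module. The sample is hypothetical. The skewness and kurtosis use plain moment-based formulas, which is an assumption: statistical packages typically apply small-sample corrections, so their output can differ slightly.

```python
import statistics as st

# Hypothetical quantitative sample (e.g. heights in cm).
data = [160, 165, 170, 170, 175, 180, 200]

n = len(data)
mean = st.mean(data)
median = st.median(data)
mode = st.mode(data)
value_range = max(data) - min(data)
q1, q2, q3 = st.quantiles(data, n=4)  # quartiles, default "exclusive" method
iqr = q3 - q1                         # interquartile range
variance = st.variance(data)          # sample variance (divides by n - 1)
std_dev = st.stdev(data)              # sample standard deviation

# Moment-based skewness and excess kurtosis (uncorrected formulas).
m2 = sum((x - mean) ** 2 for x in data) / n
m3 = sum((x - mean) ** 3 for x in data) / n
m4 = sum((x - mean) ** 4 for x in data) / n
skewness = m3 / m2 ** 1.5
excess_kurtosis = m4 / m2 ** 2 - 3    # 0 for a normal distribution

print(mean, median, mode, value_range, iqr, std_dev, skewness, excess_kurtosis)
```

The single large value (200) pulls the mean above the median and gives a positive skewness, which is the kind of feature to detect before choosing an analysis technique.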




The normal distribution is always the reference for comparison. The shape
of the probability distribution is important for the measures of
centrality and dispersion of the data.

What can we do with the characteristics of the data?
Design a correct model that reproduces the features of the data. Choose an
adequate technique for the analysis:
- Is the sample size large enough?
- Are the assumptions required by the chosen analysis technique
satisfied by the data?
- Do we have all the data necessary to apply the chosen analysis
technique correctly?
Transform the data before studying them, if necessary.

Types of samples
Independent samples: the groups in the data do not correspond to each
other. The number of observations in each group can be different.
Matched pairs: the groups in the data correspond to each other, e.g., the
same individuals measured before and after a treatment. The number of
observations in each group is always the same.




Lecture 2 – Hypothesis Testing
Statistical inference and testing
Statistical inference means drawing conclusions about the population based
on a sample. When we analyse statistical data, we try to infer some
characteristics of the process that has generated the data.

Observing a sample and applying statistical inference does not provide
‘definitive’ conclusions; it just sizes up the different ‘maybes’.
Using a sample, we can:
- Construct a confidence interval
- Test a hypothesis
- Estimate model parameters

Expected results come from probability theory. Observed results come
from experiments. Statistics links these two.

We can test if the unknown value of a parameter is equal to a chosen
value (or set of values): this is a hypothesis. Example:
We roll a die 10 times, write down the result and see that the sample
mean is 4.6. The standard deviation of the sample is s=1.35. Can we
infer that the die is a fair die?

A statistical test is a function of the observed data which gives just two
answers: reject / do not reject the null hypothesis. A common case is
testing whether the population mean equals a theoretical mean. Example:
H0: the population mean = 3.5
H1: the population mean does not equal 3.5
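The die example can be worked through as a one-sample t-test. This is a sketch under assumptions not stated in the text: a two-sided test at a significance level of 5%, with the critical value 2.262 (9 degrees of freedom) taken from a t table. With only 10 rolls of a die, the t-test is itself an approximation.

```python
import math

# Values taken from the die example in the text.
n = 10             # number of rolls
sample_mean = 4.6
s = 1.35           # sample standard deviation

# Theoretical mean of a fair six-sided die: (1 + 2 + ... + 6) / 6 = 3.5.
mu_0 = sum(range(1, 7)) / 6

# One-sample t statistic: distance between the sample mean and mu_0,
# measured in standard errors.
standard_error = s / math.sqrt(n)
t_stat = (sample_mean - mu_0) / standard_error

# Two-sided critical value for alpha = 0.05 and n - 1 = 9 degrees of
# freedom, read from a t table (alpha is an assumption, not from the text).
t_critical = 2.262

reject_h0 = abs(t_stat) > t_critical
print(f"t = {t_stat:.2f}, reject H0: {reject_h0}")
```

The statistic is about 2.58, which exceeds 2.262. Under these assumptions H0 is rejected: the sample gives evidence that the die is not fair.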
