Business Sciences
Notes
Lecturer: Patrick Wessa
Academic year: 2021 - 2022
TABLE OF CONTENTS
CHAPTER 1: Getting Started ........................................................................................................ 1
1.1.2.0.6 Compendium.................................................................................................................. 1
1.4.1 How the R Framework Works ............................................................................................... 1
1.4.2 Univariate ............................................................................................................................ 1
1.4.3 Bivariate............................................................................................................................... 1
1.4.4 Trivariate .............................................................................................................................. 1
1.4.5 Multivariate .......................................................................................................................... 1
1.4.7 Reproducing ........................................................................................................................ 1
1.5 Collaborative Compendium Writing ......................................................................................... 1
1.7 Instant Messaging ................................................................................................................... 2
CHAPTER 2: Introduction to Probability........................................................................................ 3
2.1 Definitions of Probabilities ...................................................................................................... 3
2.1.0.0.2 Intersection .................................................................................................................... 3
2.1.0.0.3 Union ............................................................................................................................. 3
2.1.0.0.6 Exclusiveness ................................................................................................................ 3
2.3. Bayes' Theorem ..................................................................................................................... 4
2.3.0.0.2 Example ......................................................................................................................... 6
2.3.0.0.3 Sensitivity and Specificity ............................................................................................... 7
2.4 Multinomial Naive Bayes Classification Model ........................................................................ 9
2.4.2 Example ............................................................................................................................... 9
2.4.3 Interaction Effects .............................................................................................................. 10
2.4.4 Zero Probabilities ............................................................................................................... 11
2.4.5 Types of Naive Bayes Classifiers ....................................................................................... 12
2.5 Law of Large Numbers .......................................................................................................... 12
2.5.1 Weak Law of Large Numbers ............................................................................................. 16
CHAPTER 3: Probability Distributions ........................................................................................ 19
3.1. Statistical Measures of the Probability Distribution ............................................................... 19
3.2 Discrete Distributions ............................................................................................................ 19
3.2.1 Bernoulli Distribution .......................................................................................................... 19
3.2.1.6 Purpose .......................................................................................................................... 19
3.2.2 Binomial Distribution........................................................................................................... 20
3.2.2.6 R Module ........................................................................................................................ 20
3.2.2.7 Example .......................................................................................................................... 22
3.2.3 Multinomial Distribution ...................................................................................................... 22
3.2.3.4. Purpose ......................................................................................................................... 22
3.3 Continuous Distributions ....................................................................................................... 22
3.3.1 Uniform Distribution ............................................................................................................ 22
3.3.1.1 Density Function.............................................................................................................. 23
3.3.1.2 Distribution Function ....................................................................................................... 23
3.3.1.11 Purpose ........................................................................................................................ 23
3.3.1.12 Example ........................................................................................................................ 23
3.3.2 Normal Distribution (or Gaussian Curve) ............................................................................ 24
3.3.2.1 Density Function.............................................................................................................. 24
3.3.2.2 Distribution Function ....................................................................................................... 24
3.3.2.19 Parameter Estimation.................................................................................................... 25
3.3.2.19.1 R Module ................................................................................................................... 25
3.3.2.19.2 Example ..................................................................................................................... 26
3.3.2.20 Random number generator ........................................................................................... 27
3.3.2.20.1 R Module ................................................................................................................... 28
3.3.2.20.2 Example ..................................................................................................................... 30
3.3.2.34 Purpose ........................................................................................................................ 30
3.3.2.35 Gaussian (or: Normal) Naive Bayes Classifier ............................................................... 30
3.3.2.35.2 R Module ................................................................................................................... 30
3.3.2.35.3 Example ..................................................................................................................... 34
3.3.3 Chi Distribution .................................................................................................................. 34
3.3.3.4 Random number generator ............................................................................................. 34
3.3.4 Chi-Squared Distribution (with 1 Parameter) ...................................................................... 34
3.3.4.11 R Module ...................................................................................................................... 35
3.3.4.12 Example ........................................................................................................................ 35
3.3.4.13 Random number generator ........................................................................................... 36
3.3.4.16 Relations to Other Functions ........................................................................................ 36
3.3.6 Student t Distribution ......................................................................................................... 37
3.3.6.9 Random number generator ............................................................................................. 37
3.3.7 Fisher F Distribution ........................................................................................................... 38
3.3.7.9 Random generator .......................................................................................................... 38
Conclusions ................................................................................................................................ 38
CHAPTER 4: Descriptive Statistics and Exploratory Data Analysis.............................................. 41
4.1 Types of data ........................................................................................................................ 41
4.1.1 Qualitative Data ................................................................................................................. 41
4.1.2 Quantitative Data............................................................................................................... 41
4.2 Qualitative Data.................................................................................................................... 42
4.2.2 Frequency Plot................................................................................................................... 42
4.2.2.2 R Module ........................................................................................................................ 42
4.2.2.3 Purpose .......................................................................................................................... 43
4.2.3 Frequency Table ................................................................................................................ 43
4.2.3.2 R Module ........................................................................................................................ 43
4.2.4 Contingency Table ............................................................................................................. 43
4.2.4.2 Example .......................................................................................................................... 44
4.2.5 Binomial Classification Metrics .......................................................................................... 44
4.2.5.2 Example .......................................................................................................................... 44
4.2.5.3 Confusion Matrix ............................................................................................................. 45
4.3 Quantitative Data .................................................................................................................. 46
4.3.1 Stem-and-Leaf Plot (NL: stam en blad) .............................................................................. 46
4.3.1.2 R Module ........................................................................................................................ 46
4.3.1.4.1 Advantage ................................................................................................................... 46
4.3.1.5 Example .......................................................................................................................... 47
4.3.2 Histogram .......................................................................................................................... 47
4.3.2.2 R Module ........................................................................................................................ 47
4.3.2.3 Purpose .......................................................................................................................... 49
4.3.2.4.2 Disadvantage ............................................................................................................... 49
4.3.2.5 Example .......................................................................................................................... 49
4.3.3 Quantiles ........................................................................................................................... 50
4.3.3.1 Quantiles Based on Weighted Averages at Xnq ............................................................. 50
4.3.3.1.2 Example ....................................................................................................................... 50
4.3.3.9 Harrell-Davis Quantiles ................................................................................................... 52
4.3.4 Central Tendency............................................................................................................... 53
4.3.4.2 Arithmetic Mean .............................................................................................................. 53
4.3.4.2.9 Disadvantages ............................................................................................................. 53
4.3.4.3 Weighted Mean ............................................................................................................... 53
4.3.4.4 Geometric Mean .............................................................................................................. 53
4.3.4.4.2 Purpose ....................................................................................................................... 53
4.3.4.4.4 Example ....................................................................................................................... 53
4.3.4.5 Harmonic Mean ............................................................................................................... 54
4.3.4.5.4 Example ....................................................................................................................... 54
4.3.4.5.6 Disadvantages ............................................................................................................. 55
4.3.4.6 Quadratic Mean ............................................................................................................... 55
4.3.4.7 Root Mean Square .......................................................................................................... 55
4.3.4.12 Median .......................................................................................................................... 55
4.3.4.12.2 Purpose ..................................................................................................................... 55
4.3.4.12.3 Example ..................................................................................................................... 56
4.3.4.13 Midrange or Midextreme ............................................................................................... 56
4.3.4.13.3 Example ..................................................................................................................... 56
4.3.4.15 Tukey’s Trimean ........................................................................................................... 56
4.3.4.17 Trimmed Mean .............................................................................................................. 57
4.3.4.20 Purpose of Central Tendency ........................................................................................ 58
4.3.5 Variability ........................................................................................................................... 58
4.3.5.1 Range ............................................................................................................................. 58
4.3.5.4 Variance (biased) ............................................................................................................ 58
4.3.5.5 Variance (unbiased) ........................................................................................................ 59
4.3.5.6 Standard Deviation (biased) ........................................................................................... 59
4.3.5.12 Mean Absolute Deviation (MAD) ................................................................................... 59
4.3.5.17 Interquartile Range........................................................................................................ 59
4.3.5.31 R Module ...................................................................................................................... 59
4.3.6.6 D’Agostino Skewness Test ............................................................................................. 60
4.3.6.8 Definition of Kurtosis ....................................................................................................... 60
4.3.6.12 Testing Skewness & Kurtosis Simultaneously ............................................................... 60
4.3.6.14 R Module ...................................................................................................................... 61
4.3.8 Notched Boxplot ................................................................................................................ 63
4.3.8.16 Advantages ................................................................................................................... 65
4.3.8.17 Example ........................................................................................................................ 65
4.3.9 Scatterplot ......................................................................................................................... 66
4.3.9.2 R Module ........................................................................................................................ 66
4.3.9.5 Example .......................................................................................................................... 68
4.3.10 Pearson Correlation ......................................................................................................... 68
4.3.10.3 Coefficient of Determination.......................................................................................... 68
4.3.10.5 R Module ...................................................................................................................... 69
4.3.10.7 Phi Coefficient ............................................................................................................... 70
4.3.10.8.2 Disadvantages ........................................................................................................... 71
4.3.10.10 Task............................................................................................................................ 71
4.3.11 Rank Correlation .............................................................................................................. 72
4.3.11.1 Spearman Rank Order Correlation ................................................................................ 72
4.3.11.2 Kendall's Rank Order Correlation .................................................................................. 72
4.3.11.3 R Module ...................................................................................................................... 72
4.3.11.5.1 Advantages ................................................................................................................ 73
4.3.11.6 Example 1 ..................................................................................................................... 73
4.3.11.7 Example 2 ..................................................................................................................... 73
4.3.12 Partial Pearson Correlation .............................................................................................. 74
4.3.12.2 R Module ...................................................................................................................... 74
4.3.12.5 Example ........................................................................................................................ 75
4.3.13 Simple Linear Regression ................................................................................................ 76
4.3.13.1.1 Model Assumption 1 .................................................................................................. 76
4.3.13.1.2 Model Assumption 2 .................................................................................................. 77
4.3.13.1.3 Model Assumption 3 .................................................................................................. 77
4.3.13.2 R Module ...................................................................................................................... 77
4.3.15 Quantile-Quantile Plot (QQ Plot) ...................................................................................... 78
4.3.15.2 R Module ...................................................................................................................... 79
4.3.15.3 Purpose ........................................................................................................................ 80
4.3.17 Probability Plot Correlation Coefficient Plot (PPCC Plot) .................................................. 80
4.3.17.2 R Module ...................................................................................................................... 81
4.3.17.5 Example ........................................................................................................................ 83
4.3.18 Kernel Density Estimation ................................................................................................ 83
4.3.18.5 Gaussian Kernel ........................................................................................................... 84
4.3.18.7 R Module ...................................................................................................................... 84
4.3.18.10 Example ...................................................................................................................... 85
4.3.19 Bivariate Kernel Density Plot ............................................................................................ 85
4.3.19.2 R Module ...................................................................................................................... 85
4.3.19.5 Example ........................................................................................................................ 86
4.3.20 Bootstrap Plot (for Central Tendency) .............................................................................. 87
4.3.20.2 R Module ...................................................................................................................... 87
4.3.20.5 Example ........................................................................................................................ 91
4.3.21.5 Example ........................................................................................................................ 91
4.3.22 Cronbach Alpha ............................................................................................................... 92
4.3.22.2 R Module ...................................................................................................................... 93
4.3.22.5 Example ........................................................................................................................ 93
4.4 Quantitative Data with a Time Dimension (Time Series) ....................................................... 94
4.4.1 Equidistant Time Series ..................................................................................................... 94
4.4.2 Time Series Plot ................................................................................................................ 94
4.4.2.2 R Module ........................................................................................................................ 95
4.4.3. Mean Plot ......................................................................................................................... 95
4.4.3.2. R Module ....................................................................................................................... 96
4.4.4 Blocked Bootstrap Plot (Central Tendency)........................................................................ 99
4.4.4.2 R Module ........................................................................................................................ 99
4.4.4.5 Example .......................................................................................................................... 99
4.4.5 Standard Deviation-Mean Plot ........................................................................................... 99
4.4.5.5 Example ........................................................................................................................ 100
4.4.6 Variance Reduction Matrix ............................................................................................... 101
4.4.6.5 Example ........................................................................................................................ 101
4.4.7 Partial Autocorrelation Function ....................................................................................... 103
4.4.7.5 Example ........................................................................................................................ 103
4.4.8 Periodogram .................................................................................................................... 106
4.4.8.5 Example ........................................................................................................................ 107
CHAPTER 5: HYPOTHESIS TESTING .................................................................................... 109
5.1.2.1 Graph of the Normal Distribution ................................................................................... 109
5.1.2.2 Interpretation of the Standard Deviation ........................................................................ 109
5.2 Population........................................................................................................................... 110
5.9 Statistical Test for a Population Mean with Known Variance ............................................... 110
R Module .................................................................................................................................. 118
5.17 Hypothesis Testing for Research ...................................................................................... 120
5.17.1 One Sample t-Test ......................................................................................................... 120
5.17.1.2 Analysis Based on Critical Values ............................................................................... 120
5.17.1.3 Analysis Based on p-Values ....................................................................................... 123
5.17.1.5 Alternatives ................................................................................................................. 124
5.17.2 Skewness & Kurtosis tests ............................................................................................. 125
5.17.2.1.1 D’Agostino skewness test ........................................................................................ 125
5.17.2.1.2 Kurtosis test ............................................................................................................. 125
5.17.2.4 Alternatives ................................................................................................................. 127
5.17.3 Paired Two Sample t-Test .............................................................................................. 127
5.17.5 Unpaired Two Sample t-Test.......................................................................................... 129
5.17.5.1 Hypotheses - examples............................................................................................... 129
5.17.5.2 Analysis Based on p-Values ....................................................................................... 130
5.17.5.3 Assumptions ............................................................................................................... 132
5.17.5.4 Alternatives ................................................................................................................. 132
5.17.6 Unpaired Two Sample Welch Test ................................................................................. 133
5.17.6.2 Analysis Based on p-Values ....................................................................................... 133
5.17.7 Mann-Whitney U test ..................................................................................................... 133
5.17.7.1 Classical model ........................................................................................................... 134
5.17.7.1.2 Randomization model .............................................................................................. 134
5.17.7.2 Analysis Based on p-Values ....................................................................................... 134
5.17.8 Bayesian Two Sample Test ........................................................................................... 135
5.17.9 Median Test Based on Notched Boxplots ....................................................................... 135
5.17.10 Chi-Squared Test for Count Data ................................................................................. 135
5.17.10.1 Pearson Chi-Squared Test ....................................................................................... 135
5.17.10.1.4 Analysis Based on p-Values – Output ................................................................... 136
5.17.10.1.5 Assumption ............................................................................................................ 137
5.17.10.2 Exact Pearson Chi-Squared Test with Simulation ..................................................... 137
5.17.11 One way analysis of Variance (1-way ANOVA) ............................................................ 138
5.17.11.2 Analysis Based on p-Values ..................................................................................... 138
5.17.12 Two Way Analysis of Variance (2-way ANOVA) ........................................................... 142
5.17.12.1 Analysis Based on p-Values ..................................................................................... 142
5.17.13 Testing Correlations ..................................................................................................... 147
5.17.14 Note on Causality ......................................................................................................... 147
CHAPTER 6: Regression Models ............................................................................................. 149
6.1 Simple Linear Regression Model (SLRM) ........................................................................... 149
6.1.2 Least Squares Criterion ................................................................................................... 149
6.1.3 Ordinary Least Squares for Simple Linear Regression ..................................................... 150
6.1.4 Assumptions for Building a Regression Model ................................................................. 151
6.1.5 Statistical Properties of 𝛼 and 𝛽 ...................................................................................... 151
6.1.5.2 Confidence Intervals for Simple Linear Regression Parameters .................................... 153
6.2 Multiple Linear Regression Model (MLRM) ......................................................................... 154
6.2.1.3 Unbiasedness of b ........................................................................................................ 157
6.2.1.4 Minimum Variance (Gauss-Markov Theorem) ............................................................... 157
6.2.1.7 Coefficient of Determination R² ..................................................................................... 158
6.2.1.8 Relationship Between the SLRM and the MLRM ........................................................... 158
6.2.2 Maximum Likelihood Estimation for Multiple Linear Regression ....................................... 159
Building Your Own Regression Model with Excel and RFC ...................................................... 169
RFC: Multiple Regression (Fully Explained) ............................................................................. 175
CHAPTER 7: Introduction to Time Series Analysis ................................................................... 193
7.2 Case: the Market of Health and Personal Care Products .................................................... 193
7.3. Decomposition of Time Series ........................................................................................... 193
7.3.1. Classical Decomposition of Time Series with “Moving Averages” ................................... 193
7.3.2 Seasonal Decomposition by Loess.................................................................................. 196
7.3.3. Decomposition Using Structural Time Series Models ...................................................... 197
7.4 Ad Hoc Forecasting of Time Series ..................................................................................... 199
7.4.1 Regression Analysis of Time Series................................................................................. 199
7.4.2 Smoothing Models ........................................................................................................... 203
7.4.2.4 Single Exponential Smoothing ...................................................................................... 203
7.4.2.5 Double Exponential Smoothing ..................................................................................... 204
7.4.2.6 Triple Exponential Smoothing (Holt-Winters model) ...................................................... 205
CHAPTER 8: Univariate Box-Jenkins Analysis ......................................................................... 211
8.2 Data .................................................................................................................................... 211
8.3 Theoretical Concepts .......................................................................................................... 212
8.3.0.1 Stationary Processes .................................................................................................... 212