Written for
Universiteit van Amsterdam (UvA)
Business Administration
Quantitative Data Analysis 2
Seller: bastudent
QDA 2
LECTURES
WEEK 1
OV = Outcome Variable (Field)
- DV = Dependent Variable: test variable, variable to be explained
PV = Predictor Variable (Field)
- IV = Independent Variable: variable that explains
We are interested in the effect of a predictor variable on an outcome variable.
The p-value
- The probability of obtaining a result (or test-statistic value) equal to or more extreme than the one actually observed, assuming that the null hypothesis is true.
- P ≤ 0.05
o Reject the null hypothesis and support the alternative hypothesis.
o Example: given the sample and a significance level of 5%, there is sufficient support that the mean weight differs from 12g.
o A low p-value indicates that the observed result would be unlikely if the null hypothesis were true.
- P > 0.05
o Do not reject the null hypothesis and do not support the alternative hypothesis.
o Example: given the sample and a significance level of 5%, there is not sufficient support that the mean weight differs from 12g.
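The decision rule above can be sketched in a few lines of Python. This is only an illustration: the function name `decide` and the example p-values are hypothetical, and the 5% significance level is taken from the notes.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Return the conclusion about H0 for a p-value and a significance level."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie between 0 and 1")
    # p <= alpha: reject H0 and support H1; otherwise do not reject H0.
    if p_value <= alpha:
        return "reject H0"
    return "do not reject H0"


# Hypothetical p-values, for illustration only.
print(decide(0.03))
print(decide(0.20))
```

Note that a p-value exactly equal to 0.05 falls in the "reject" branch, matching the P ≤ 0.05 rule.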
What is a conceptual model?
- Visual representations of relations between theoretical constructs and variables of interest.
- Model: simplified description of reality.
- The boxes represent variables.
- Arrows represent relationships between variables.
- Arrows go from predictor variables to outcome variables.
- Hypotheses refer to specific arrows e.g. relationships/effects/differences.
Levels of measurement of variables
- Categorical: made up of categories that name distinct entities; subgroups are indicated by numbers.
o Nominal: two or more categories, in no particular order e.g. male and female.
o Ordinal: ordered categories e.g. small, medium, large.
- Quantitative: use numerical scales, with equal distances between values.
o Discrete: can take only certain values e.g. 1, 2, 3.
o Interval: equal intervals on the scale.
o Ratio: true and meaningful zero point e.g. time, income.
- In social sciences, we often treat ordinal scales as interval (pseudo) scales e.g. Likert scales (1 – 5 disagree to agree).
Moderation
- If the proposed effect is stronger in certain settings.
- Also called interaction.
- A moderator is a variable that affects the strength of the relation between
the predictor and outcome variable.
Mediation
- If the proposed relationship goes via another variable.
- A mediating variable explains the relation between the predictor and the
outcome variable.
Hypotheses
- H0: null hypothesis (rejected or not)
- H1: alternative/research hypothesis (supported or not)
- Hypotheses are developed prior to research. They are based on theory and previous research.
- Not all potential relationships need to be hypothesized:
o Every hypothesis refers to an arrow in the conceptual model.
o But not every potential arrow refers to a hypothesis.
- A hypothesis is a verbalized expression of an expected relationship between variables.
One vs. two-sided testing
- If the hypothesis is one-sided, check whether the direction of the hypothesis is in line with the results (e.g. mean plots).
- If they are in line (e.g. a positive effect is hypothesized and observed), the one-sided p-value is the two-tailed p-value divided by 2.
- If they are not in line, the one-sided p-value is 1 – (two-tailed p-value / 2).
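As a rough sketch, the conversion from a two-tailed to a one-sided p-value could look as follows. The function name `one_sided_p` and the example value 0.08 are hypothetical; the rule itself is the one stated in the notes.

```python
def one_sided_p(two_tailed_p: float, in_line_with_hypothesis: bool) -> float:
    """Convert a two-tailed p-value into a one-sided p-value."""
    half = two_tailed_p / 2
    # Results in the hypothesized direction: p / 2.
    # Results in the opposite direction: 1 - p / 2.
    return half if in_line_with_hypothesis else 1 - half


# Hypothetical two-tailed p-value of 0.08.
print(one_sided_p(0.08, in_line_with_hypothesis=True))
print(one_sided_p(0.08, in_line_with_hypothesis=False))
```

With results in the hypothesized direction, a two-tailed p of 0.08 becomes significant at the 5% level; with results in the opposite direction it is nowhere near significant.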
Test Hypotheses
- Appropriate way to test hypotheses depends on:
o Nature of the relationship: derived from conceptual model.
• Main effects, moderation/interaction, mediation.
• Total, direct, and indirect effects.
o Nature of the data: not all of this is derived from conceptual model.
• Number of PVs, number of OVs
• How are variables operationalized?
• Data type PVs, data type OVs
• If there are multiple groups: number of groups, relationship between them (dependent/independent).
Independent and Paired Samples T-test
- Paired-samples t tests compare scores on two different variables but for the same group of cases.
- Independent-samples t tests compare scores on the same variable but for two different groups of cases.
o Use when there is one quantitative outcome variable and one categorical predictor variable with two mutually exclusive categories.
Analysis of Variance – ANOVA
- With ANOVA, we are examining how much of the variance in our data can be explained by our predictor variable.
- Ideally 40 observations per group.
One-way independent ANOVA
- One-way independent ANOVA: when the participants are different (independent groups) and there is only one predictor
variable.
- Conditions:
o One quantitative outcome variable (when the OV is quantitative – test on the mean)
o One categorical predictor variable
o Two or more mutually exclusive categories/groups (independent groups)
- Assumptions: need to adhere to these assumptions, in order to prevent invalid outcomes.
o Variance is homogeneous across groups.
o Residuals are normally distributed.
o Groups are roughly equal in size.
- Distinguish between:
o Number of categories within one categorical predictor variable.
o Number of predictor variables.
- Hypotheses:
o H0: μ1 = μ2 = … = μi
• i = number of categories
• No difference in OV mean across the different categories in PV.
o H1: at least one μ differs
• There is at least one difference in OV mean score between PV categories.
- Based on an F-Test
o Test statistic: F-test
o The F-distribution looks different from the t-distribution.
o The F-value compares explained variability to unexplained variability.
- ANOVA decomposes total variability observed in OV into variation explained by the model and residual variation.
o Explained variability: how much is caused by differences between groups?
o Unexplained variability: how much is caused by differences within groups?
o Prefer a larger proportion of the variability to be explained than unexplained.
Variability measures
- Variance: the average of the squared differences from the mean.
- Sum of squares: the sum of the squared differences from the mean.
o Used for ANOVA analysis.
o Use squared deviations because we want positive outcomes.
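To make the two measures concrete, here is a minimal Python sketch on made-up scores. The data and function names are hypothetical. The variance below divides by n, following the "average" definition above; sample variance as reported by most statistical software divides by n − 1 instead, which is noted here as an addition to the notes.

```python
def sum_of_squares(values):
    """Sum of the squared differences from the mean."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values)


def variance(values):
    """Average of the squared differences from the mean (divides by n)."""
    return sum_of_squares(values) / len(values)


scores = [4, 6, 8, 10]  # hypothetical scores, mean = 7
print(sum_of_squares(scores))  # (-3)^2 + (-1)^2 + 1^2 + 3^2 = 20.0
print(variance(scores))        # 20 / 4 = 5.0
```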
Sums of squares
SStotal = SSmodel + SSresidual
- Total sum of squares
o Squared deviations of all observations from the grand (overall) mean.
o Total variability to be explained.
- Model sum of squares
o Between SS: explained variability.
o Squared deviations of the group means from the grand (overall) mean.
o How much variability can be explained by differences between groups?
- Residual sum of squares
o Within SS: unexplained variability.
o Squared deviations of the observations from their group means.
o How much variation is there within groups?
o This variation is not explained by the groups we compare.
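The decomposition SStotal = SSmodel + SSresidual can be checked numerically. The sketch below uses made-up satisfaction scores for three groups; the data, group labels, and function name are hypothetical and only serve to show the arithmetic.

```python
def sums_of_squares(groups):
    """Return (SS_total, SS_model, SS_residual) for a dict of group -> scores."""
    all_scores = [x for scores in groups.values() for x in scores]
    grand_mean = sum(all_scores) / len(all_scores)

    # Total: every observation against the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in all_scores)

    ss_model = 0.0
    ss_residual = 0.0
    for scores in groups.values():
        group_mean = sum(scores) / len(scores)
        # Model (between): group mean against grand mean, once per observation.
        ss_model += len(scores) * (group_mean - grand_mean) ** 2
        # Residual (within): each observation against its own group mean.
        ss_residual += sum((x - group_mean) ** 2 for x in scores)
    return ss_total, ss_model, ss_residual


# Hypothetical data: three groups of three observations each.
data = {"store": [1, 2, 3], "web": [4, 5, 6], "reseller": [7, 8, 9]}
ss_total, ss_model, ss_residual = sums_of_squares(data)
print(ss_total, ss_model, ss_residual)  # 60.0 54.0 6.0
```

In this made-up example most of the variability sits between the groups (54 out of 60), which is the situation the notes describe as preferable.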
How to use the sums of squares?
1. R2: proportion of total variance in our data that is “explained” by our model.
o $R^2 = \frac{SS_M}{SS_T}$ (M = model, T = total)
- Explained variability / total variability
- Model Sum of Squares / Total Sum of Squares
- An important and valuable indication but not a formal statistical test.
2. F-Test
- To investigate if the group means differ with an ANOVA, we do an F-test.
- This is a statistical test that checks the ratio of explained variability to unexplained variability.
o $F = \frac{\text{explained variability}}{\text{unexplained variability}} = \frac{\text{between-group variability}}{\text{within-group variability}}$
- We cannot divide the model sum of squares by the residual sum of squares directly, because they are not based on the same number of observations/df.
- We therefore divide by the degrees of freedom to get Mean Squares (MS)
o $F = \frac{MS_M}{MS_R} = \frac{SS_M / df_M}{SS_R / df_R} = \frac{SS_M / (k-1)}{SS_R / (n-k)}$
- We want a large F value because this means that a larger proportion of the variability is explained.
- Degrees of freedom (df) one-way independent ANOVA:
o dfM = k-1
o dfR = n-k
o dfT = n-1
*k = number of categories
*n = number of observations
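The two uses of the sums of squares can be sketched together. The function name and the input values (SS model = 54, SS residual = 6, k = 3, n = 9) are hypothetical; the formulas are the ones given above.

```python
def anova_summary(ss_model: float, ss_residual: float, k: int, n: int) -> dict:
    """Compute R^2, the mean squares, and the F-ratio for a one-way independent ANOVA."""
    ss_total = ss_model + ss_residual
    df_model = k - 1      # dfM = k - 1
    df_residual = n - k   # dfR = n - k
    ms_model = ss_model / df_model
    ms_residual = ss_residual / df_residual
    return {
        "r_squared": ss_model / ss_total,
        "ms_model": ms_model,
        "ms_residual": ms_residual,
        "f_ratio": ms_model / ms_residual,
    }


# Hypothetical sums of squares for 3 groups and 9 observations.
print(anova_summary(ss_model=54.0, ss_residual=6.0, k=3, n=9))
```

Here R2 is 54 / 60 = 0.9, the mean squares are 54 / 2 = 27 and 6 / 6 = 1, and the F-ratio is 27 / 1 = 27.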
From F to p to conclusion H0
- F is a test statistic, so it is used to test a null hypothesis against an alternative hypothesis.
- From test statistics to p-value:
o From F-ratio to p-value (depends on df)
o Look in F-table for critical value: dfR and dfM
- From critical value to conclusion about H0
o If F-ratio > critical F-value (equivalently, p-value ≤ 0.05): reject H0
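A minimal sketch of this last step, assuming the critical value has already been looked up in an F-table for the relevant dfM and dfR. The function name and the F-ratio of 27.0 are hypothetical. The critical value 5.14 is the commonly tabulated value for a 5% significance level with dfM = 2 and dfR = 6; check it against your own table.

```python
def reject_h0(f_ratio: float, f_critical: float) -> bool:
    """Reject H0 when the observed F-ratio exceeds the critical F-value."""
    return f_ratio > f_critical


# Hypothetical F-ratio; critical value taken from an F-table (alpha = 0.05, dfM = 2, dfR = 6).
print(reject_h0(f_ratio=27.0, f_critical=5.14))
print(reject_h0(f_ratio=1.8, f_critical=5.14))
```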
One-way independent ANOVA calculations example
Research question: is there a relation between shopping platform and customer satisfaction?
- PV = shopping platform (categorical) with 3 levels/categories:
o 1 Brick-and-mortar store
o 2 Web shop
o 3 Reseller
- OV = customer satisfaction (quantitative)
o Score from 1-50
- 10 observations – not realistic
- A 1-way independent ANOVA is appropriate because there is one quantitative outcome variable and one categorical
predictor variable with more than two mutually exclusive categories.
H0: μ1 = μ2 = μ3
H1: at least one μ differs
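For this example the degrees of freedom follow directly from k = 3 categories and n = 10 observations. The helper below is a hypothetical illustration of the df formulas for a one-way independent ANOVA given earlier.

```python
def degrees_of_freedom(k: int, n: int) -> tuple:
    """Return (dfM, dfR, dfT) for k categories and n observations."""
    df_model = k - 1
    df_residual = n - k
    df_total = n - 1
    return df_model, df_residual, df_total


# Shopping platform example: 3 categories, 10 observations.
print(degrees_of_freedom(k=3, n=10))  # (2, 7, 9)
```

The model and residual df add up to the total df (2 + 7 = 9), which mirrors the decomposition of the sums of squares.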