These are all the Grasple lessons you need to know for the ARMS midterm. In this document you will find the answers to every question in Grasple, highlights of the important information, and everything that was mentioned in Grasple. Good luck with studying!
Contents:
Week 1:
  The Bayesian approach
  Assumptions I
  Assumptions II
  Multiple linear regression, including hierarchical MLR
  Creating dummy variables
  Multiple regression with dummy variables (interpretation)
Week 2:
  Factorial ANOVA: visually assessing main and interaction effects
  Factorial ANOVA
  Follow-up testing (frequentist only)
  About multiple testing and error rates
  Informative hypotheses (Bayes only)
  Creating a JASP file
  Publishing your data and analyses
Week 3:
  Averages and corrected averages
  ANCOVA (Frequentist)
  ANCOVA as regression
  ANCOVA (Bayesian)
  Supporting the null hypothesis
Week 4:
  Within factors and between factors
  The sphericity assumption
  Repeated measures ANOVA with one factor
  Two within factors: interpretation
  Mixed design RMA
Week 5:
  Moderation vs. mediation
  Bootstrapping
  Mediation analysis
Week 1:
The Bayesian approach:
The Bayesian framework is based on the posterior distribution of one or more
parameters. Let us be interested in estimating a mean μ representing a grade (scale
0-10).
o The information in our dataset provides information about what reasonable
values for μ could be (through what is called the likelihood function).
o But also the prior distribution provides information, that is, the knowledge or
belief about μ before we examine our data.
The posterior is a compromise (combination) of the prior and the likelihood. Let's
examine this visually (the formulas for this are beyond this course's goals).
The resulting mean in the sample (observed data set) turns out to be 8 (M=8.0).
Which prior could provide a posterior mean of 4?
Middle: Small values for μ in this prior distribution are much more likely than larger
values. The compromise between data (M=8) and prior will therefore result in a value
(substantially) lower than 8.
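This compromise can also be shown numerically with a conjugate normal–normal model. The sketch below goes beyond Grasple's visual treatment, and all numbers in it are illustrative:

```python
# Illustrative sketch: the posterior mean as a precision-weighted compromise
# between a prior mean and a sample mean (normal-normal model, known variance).
# All numbers are made up for illustration.

def posterior_mean(prior_mean, prior_var, sample_mean, sample_var, n):
    """Precision-weighted average of the prior mean and the sample mean."""
    prior_precision = 1 / prior_var
    data_precision = n / sample_var
    return (prior_precision * prior_mean + data_precision * sample_mean) / (
        prior_precision + data_precision
    )

# A prior that puts most mass on small values of mu pulls the estimate
# well below the observed M = 8:
print(posterior_mean(prior_mean=0.0, prior_var=1.0,
                     sample_mean=8.0, sample_var=1.0, n=1))  # -> 4.0

# A flat (very wide) prior barely moves the estimate away from 8:
print(posterior_mean(prior_mean=0.0, prior_var=1e6,
                     sample_mean=8.0, sample_var=1.0, n=1))
```

The stronger (narrower) the prior, the more it pulls the posterior away from the sample mean; a nearly flat prior leaves the data in charge.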
So, we have seen in these questions that priors can affect the posterior estimates.
But the first prior distribution (on the left) also showed an example of an ignorant
(uninformative, flat) prior.
Priors are sometimes seen as the bottleneck of the Bayesian approach because you
have to specify something, and it can affect the results.
Others consider priors an advantage of the Bayesian approach because we do not
start our research from scratch. We often build on earlier research or on existing
knowledge. This can then be incorporated in the prior and allows science to be
accumulative.
So:
Bayesian statistics assumes that we know more than just the frequency of an event in
a data set. We have some prior (= existing) knowledge (or beliefs) before we look at
our own data. In a Bayesian analysis, we add this prior knowledge or belief to the
analysis.
When you are using a Bayesian approach for your own research question, you will be
confronted with one very important issue:
What prior (previous knowledge or beliefs) do you want to add to your own data
analysis? (Spoiler: There are many views on this.)
o I would ask an expert in the field. Experts might already know more about the
things that interest you. Ultimately, it is up to you as a researcher to decide
whether this is an option for your research - and how much weight you want
to give it!
o I would look at previous studies that are close to my own research question. It
is up to you as a researcher to decide whether you want to include subjective
knowledge or beliefs in your analysis.
o I would not want to include previous studies or beliefs in my own research. It
is up to you as a researcher to decide whether you want to include subjective
knowledge or beliefs in your analysis.
Another important aspect of the Bayesian framework is the definition of probability.
In classical / frequentist statistics there is one underlying simple definition: The
probability of an event is assumed to be the frequency with which it occurs.
For example, if 150 out of 1000 people smoke, we could say that the probability that
a randomly picked person in that group of 1000 people smokes is 0.15 (or 15%). This is
the understanding of probabilities that is applied in the frequentist tests you know.
In Bayesian statistics, we use a different way of looking at probabilities.
The foundation of Bayesian statistics is Bayes' theorem.
Central to Bayes' theorem are conditional probabilities.
E.g. P(A given B) : What is the probability that A will happen or is true given that we
know B has happened or is true?
If we fill in that A stands for a hypothesis of interest and B for data we collected,
then P(A given B) represents the probability of our hypothesis given the data we
observed in our study. Is that not exactly what we are interested in?
Note the difference with the definition of the p-value: "the probability of observing
these (or more extreme) data assuming that the null hypothesis is true".
To obtain P(A|B) (A|B is another way of writing A given B) an ingredient is needed
that is not part of the frequentist approach: P(A), the prior probability of the
hypothesis.
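Bayes' theorem itself is just P(A|B) = P(B|A) · P(A) / P(B). A minimal sketch with made-up numbers:

```python
# Sketch of Bayes' theorem with made-up numbers:
# P(A|B) = P(B|A) * P(A) / P(B),
# where P(B) = P(B|A) * P(A) + P(B|not A) * P(not A).

def posterior(p_a, p_b_given_a, p_b_given_not_a):
    """Probability of hypothesis A given observed data B."""
    p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)
    return p_b_given_a * p_a / p_b

# Prior P(hypothesis) = 0.5; the data are three times as likely under the
# hypothesis (0.9) as under its complement (0.3):
print(posterior(0.5, 0.9, 0.3))  # -> 0.75
```

Note that the answer depends on the prior P(A), the ingredient that has no counterpart in the frequentist approach.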
The Bayesian use of conditional probabilities means that we approach an analysis in a
different way.
We integrate previous knowledge and beliefs about the thing we are interested in
and then update our knowledge and beliefs based on the evidence we find in our
data.
The different definition of probability used in the Bayesian framework also implies
that the interpretation of results is somewhat different. And according to Bayesians:
the Bayesian interpretation is more intuitive.
Let us first look at estimation using 95% estimation intervals.
A frequentist interval is called a confidence interval. A Bayesian interval is called a
credible interval. Read the following definition:
"If we were to repeat this experiment many times and calculate an interval each
time, 95% of the intervals will include the true parameter value (and 5% will not)"
Does this definition of an estimation interval belong to the confidence or the credible
interval?
Confidence: We cannot talk about the probability that the true value is in the interval
because it either is or is not. There are no probabilities connected to parameters
because they do not represent something that can frequently be repeated (see the
definition of probability in the frequentist framework).
A frequentist interval is called a confidence interval. A Bayesian interval is called a
credible interval. Read the following definition:
"There is 95% probability that the true value is in the interval."
Does this definition of an estimation interval belong to the confidence or the credible
interval?
Credible
As stated before: The different definition of probability used in the Bayesian
framework also implies that the interpretation of results is somewhat different. And
according to Bayesians: the Bayesian interpretation is more intuitive.
And after discussing estimation intervals we will now turn to hypothesis testing.
The definition of a (frequentist) p-value is perhaps not exactly what we are looking
for. It is the probability of observing the same or more extreme data given that the
null hypothesis is true. But this does not provide information on how likely it is that
the null is true given the data.
A Bayesian probability can provide information about this: How likely is the null, or
any other hypothesis, given the data we observed?
It is however important to know that Bayesians measure the relative support for
hypotheses. Two hypotheses are compared, or tested against one another, using the
Bayes factor (BF).
A BF12 of 10 means that the support for H1 is 10 times stronger than the support for
H2.
This does not imply that H1 is an excellent or perfect or true hypothesis; there can
exist an H3 that receives much more support than H1.
Thinking about useful, reasonable and informative hypotheses is thus step 1.
Because only the formulated hypotheses are tested (against one another).
Bayes factors and their interpretations will return later in the course.
A BF is not a probability but BFs can be transformed into (relative) probabilities.
First we have to define prior model probabilities: i.e., how likely is each hypothesis
before seeing the data.
The most common choice is that before seeing data each hypothesis is considered
equally likely. This provides:
o for interest in 2 hypotheses H1 and H2: P(H1) = P(H2) = 0.5
o for interest in 3 hypotheses H1, H2 and H3: P(H1) = P(H2) = P(H3) = 0.333
o for interest in 10 hypotheses H1, ..., H10: P(H1) = ... = P(H10) = 0.1
The prior probabilities add up to one because they are relative probabilities divided
over the hypotheses of interest. (note this is also the case for unequal prior
probabilities that could be defined just as well)
The posterior model probabilities (PMP) also add up to one (and they are also
relative probabilities).
Consider a set of just 2 hypotheses, H1 and H2. The relative probability before
collecting data is chosen to be equal. That is P(H1) = P(H2) = 0.5. The data show
that H1 receives 3 times more support than H2, that is BF12 = 3. Given the equal prior
probabilities, the resulting PMPs for H1 and H2 will represent that same relative
support. Can you formulate the PMPs without needing a formal equation? Note that
they need to add up to 1.
For equal prior probabilities and BF12=3 , what is the PMP of H1?
0.75: PMP(H1) = 0.75 and PMP(H2) = 0.25, which also shows that H1 receives 3x stronger
support.
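Turning a Bayes factor into posterior model probabilities only requires the prior model probabilities. A sketch of the two-hypothesis case:

```python
# Sketch: converting a Bayes factor BF12 into posterior model
# probabilities (PMPs) for two hypotheses.

def pmp_from_bf(bf12, prior_h1=0.5, prior_h2=0.5):
    """Return (PMP(H1), PMP(H2)); the PMPs add up to 1."""
    w1 = bf12 * prior_h1  # relative support for H1, weighted by its prior
    w2 = 1.0 * prior_h2   # H2 is the reference hypothesis in BF12
    total = w1 + w2
    return w1 / total, w2 / total

# Equal prior probabilities and BF12 = 3, as in the question above:
print(pmp_from_bf(3))  # -> (0.75, 0.25)
```

With unequal priors the same function applies; the priors simply reweight the relative support before renormalizing.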
Assumptions I:
For this lesson, all output is provided. How to obtain the output is explained in a
lesson, with a title that starts with JASP.
Consider that we are interested in predicting how satisfied young people are with
their lives, with several predictor variables.
For this research, data was collected from 98 randomly selected young people
through questionnaires.
Within this datafile are the following variables:
Satisfaction: measured with the Life Satisfaction Scale (1-100)
Age: measured in years
Gender: (0 = male, 1 = female)
Sports: sport participation measured in number of hours per week
Parents: support from parents (scale of 1-10)
Teachers: support from teachers (scale of 1-10)
SES: socio-economic status (1 = low, 2 = medium, 3 = high)
Within this lesson, we will examine the assumptions we should check.
Assumptions about the measurement level of variables in MLR:
Assumption: the dependent variable is a continuous measure (Interval or Ratio).
Assumption: the independent variables are continuous or dichotomous.
Is Satisfaction a continuous variable?
Yes: Satisfaction is measured on a scale of 1-100. Composite scales can be used as if
they were continuous. If the dependent variable is nominal or ordinal, it is not
possible to use linear regression. There are other regression methods which are
suitable for dependent variables with other measurement levels, however these are
beyond the scope of this course.
The independent variable(s) must be continuous or dichotomous (nominal with two
categories).
Are all independent variables in this study either continuous or dichotomous? (Age,
sports and gender)
Yes: age and sports participation are of interval measurement. Gender is a
dichotomous variable in this data set, that is, there are two categories present: male
and female.
Another assumption in MLR is linearity of relations (the L in MLR).
Assumption: there are linear relationships between the dependent variable and each
of the continuous independent variables.
This can be checked using scatterplots. A scatterplot has the (continuous) predictor
on the x-axis and the outcome on the y-axis and uses dots to represent the
combination of x-y scores for each case in the data.
A linear relation means that the scatterplot of scores has an oval shape that can be
described reasonably well by a straight line (i.e., not a curved or s-shaped
relationship).
Examples of a curved relation and an s-shaped relation:
Examine the 4 plots below. The first is a histogram showing the distribution of the
Satisfaction scores. This can be informative (e.g., for spotting outliers in Satisfaction),
but not for the investigation of linear relations. The other 3 are scatterplots although
only the 2nd and 4th are interesting for investigating linearity between x and y
(because only Age and Sports are continuous variables; Gender is not).
Is the relationship between age and satisfaction linear?
Yes, as data points form an oval, the relationship could be described using a straight
line. Therefore it is possible to include this variable (age) as an independent variable
within the analysis.
Is the relationship between sports participation and satisfaction linear?
Yes, as data points are oval, the relationship could be described using a straight line.
Therefore it is possible to include this variable (sports) as an independent variable
within the analysis.
Non-linear relations
When a relation between a continuous predictor (x) and the outcome (y) is not linear,
you can add additional terms to the regression model to accommodate the non-
linearity. We will only discuss one example.
Assume the relation has one curve (see plot at bottom). Then a quadratic relation
may better present the observed relation between x and y than the linear relation.
This is achieved by computing a new variable, the squared version of the original X,
and running the regression with both X and X² as predictors. You then get 2
parameter estimates, B1 and B2, where:
o B1 informs you about the steepness of the overall slope (the linear trend in the
curved relation). The p-value when testing B1 informs you whether the linear
trend is zero (horizontal) or not (when p<.05).
o B2 informs you about how curved the relation is, or stated differently, it
measures the change in slope with increasing X. In the plot below, for instance,
we see that the line is steeper for larger values of X. The p-value when testing B2
informs you whether the change in slope is significantly non-zero. It basically tells
you if the quadratic relation is a better model for your data than the linear
relation.
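Fitting such a quadratic model amounts to regressing y on both X and X². A sketch with synthetic data (the data and numbers are made up; in the course itself this analysis would be run in JASP):

```python
# Sketch: adding a squared predictor to capture a curved relation.
# Synthetic data generated from a known quadratic; in practice you would
# compute the squared X as a new column and run an ordinary MLR.
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0, 10, 50)
y = 1.0 + 0.5 * x + 0.2 * x ** 2 + rng.normal(0, 0.1, size=x.size)

# Design matrix with an intercept, X and X^2, fitted by least squares:
X = np.column_stack([np.ones_like(x), x, x ** 2])
b0, b1, b2 = np.linalg.lstsq(X, y, rcond=None)[0]

print(round(b1, 2), round(b2, 2))  # near the true 0.5 and 0.2
```

Here B1 (the coefficient on X) captures the overall linear trend and B2 (the coefficient on X²) the change in slope, matching the interpretation above.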
Assumption: there are no outliers.
An outlier is a case that deviates strongly from other cases in the data set. This can be
on one variable (e.g. everybody in the data has values between 20-25 on this variable
but one person scored 35), on 2 variables (e.g., one dot in the scatterplot is far
outside the oval cloud that contains the other dots), or on a combination of even
more variables (then numerical instead of visual inspection is easier).
For now, we focus on the x-y relation (for each x separately), so we will start by
looking at scatterplots.
When looking at these figures, do you think that the condition is met that no outliers
are present?
No: The scatterplot for satisfaction and age shows that there is a respondent with a
markedly younger age (8 years) than the other respondents. Since this study was
conducted amongst young people, this age is unlikely. Based on this, it is logical to
decide not to include this respondent in the rest of the analyses, as this respondent
does not belong to the target group of the study. There are no outliers in the scatter
plot for satisfaction and sports participation.
It is not always an easy decision how to deal with outliers. Sometimes it is clear that it
must be a typo (data entry error) and then you can either correct (if information is
available to do that) or delete the value (because you know it is wrong).
Often it is not clear why the outlier exists. If it has large impact on results you can still
decide to remove it. Or sometimes it is changed to a less extreme value. E.g., if a case
scored much higher than all others, you can change the score to, for instance, the
mean+2*SD. This way this case still has a large score but not so extreme that it will
completely dominate the results of the analysis.
Transparency about any alterations to the data (and the motivation for doing so) is
very important.
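The mean + 2·SD adjustment mentioned above can be sketched as follows (illustrative numbers only):

```python
# Sketch: capping (winsorizing) a single extreme score at mean + 2*SD.
# Illustrative data; always report such alterations transparently.
import statistics

scores = [21, 22, 23, 24, 22, 21, 25, 23, 35]  # one suspiciously high score

m = statistics.mean(scores)
sd = statistics.stdev(scores)
cap = m + 2 * sd

# Scores above the cap are replaced by the cap; the rest are unchanged:
capped = [min(s, cap) for s in scores]
print(round(cap, 1), capped[-1] < 35)
```

The capped case still has the largest score in the data, but it can no longer dominate the results of the analysis.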
For our example, where one case had an unexpected low age, we will compare the
results after removing this case with the 'before removing the outlier' results.
Plots are provided on the next slide.
Is the relationship between age and satisfaction after removing the outlier stronger
or weaker?
Before removing the outlier:
After removing the outlier:
Stronger: The line in the age-satisfaction plot tracks the data more closely than
before, meaning the linear relationship is now stronger.
The influence of a violated model assumption on the results can be severe. Therefore
it is important to visualize your data. This is also shown by the Anscombe Quartet
(Anscombe, 1973), describing four data sets that have several equal statistical
properties. The variables X and Y have the same average and the same variance
across all data sets, with the correlation and regression line also being exactly the
same.
The figure below shows how the scatter plots for X and Y look for each data set.
Consider which of the four data sets meets the assumptions of a linear regression.
Which of the four data sets meets the assumptions of a linear regression?