Summary video transcript and assignments D. Lakens
Course
0HV110 Advanced research methods and ethics
Institution
Technische Universiteit Eindhoven (TUE)
Summary of the video transcripts and the assignment texts of Daniel Lakens' module. Part of the course 0HV110: Advanced research methods and research ethics. Only the text of week 7 is missing.
Week 1
Introduction
The goal of data collection is to draw statistical inferences and, from those, theoretical inferences. The statistical inference needs to be good and valid, because otherwise the next step doesn't really work.
There are some problems in the scientific literature at the moment:
- Many of the sample sizes are too small, so we have underpowered studies.
- People are flexible in how they analyze data, which leads to flukes: random findings.
- Strong publication bias: people share findings that show an effect, but they don’t share
findings that do not show an effect.
Lecture 1.1
Three different ways in which you can draw statistical inferences from data:
1. Path of action: uses rules to govern our behavior such that, in the long run, we won’t make a
fool out of ourselves too often. It uses p-values and alpha levels to make a decision to
reject/accept the H0 or to remain in doubt. It does not tell you anything about one single
test that you’re performing. Neyman-Pearson statistics.
2. Path of knowledge: focuses on likelihoods. What is the likelihood of different hypotheses given the data that you have collected? It uses the relative evidence that is present in the data, but it does not rely on a subjective prior.
3. Path of devotion: Bayesian statistics, which allows you to express evidence in terms of degrees of belief. It uses the relative evidence that is present in the data, and it relies on a subjective prior.
Lecture 1.2
P-values are a way to prevent you from fooling yourself. They tell you how surprising the data is,
assuming that there is no effect.
When there is a mean difference between two groups, you can interpret this in two ways:
1. Random noise.
2. Real difference.
You can use the p-value to differentiate between the two options. From the data that we have, we
can calculate means, standard deviation and we know the sample size. With these we can calculate a
test statistic and compare it against a distribution.
With an alpha level of 0.05, the critical values are 1.96 and -1.96. Assuming that H0 is true, most of
the data will fall between these two values.
P-value: the probability of getting the observed, or more extreme, data, assuming H0 is true. When it is larger than 0.05 (alpha), the data is not surprising. This does not mean that there is no true effect; there might very well be one, but you may not have enough participants to detect it.
If you want to know the probability that the theory is true, you need Bayesian statistics.
P-values vary, and even if you have examined a true effect, every now and then you'll observe a non-significant result. Whenever you find a non-significant effect, you need to think about how to interpret and explain it.
When there is no effect, p-values are uniformly distributed: every p-value is equally likely.
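The uniform distribution under H0 can be checked with a small simulation (a sketch, assuming z tests; the simulation size is an arbitrary choice):

```python
import math
import random

random.seed(1)

def two_sided_p(z):
    """Two-sided p-value for a z statistic under the standard normal."""
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# Simulate 10,000 z tests in which H0 is true (z ~ N(0, 1)):
# the resulting p-values should spread evenly over [0, 1].
pvals = [two_sided_p(random.gauss(0, 1)) for _ in range(10_000)]

# Count the p-values in each of ten equal-width bins.
bins = [0] * 10
for p in pvals:
    bins[min(int(p * 10), 9)] += 1
print(bins)  # each bin holds roughly 1,000 p-values
```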
Lecture 1.3
In the Neyman Pearson approach, the main thing is trying to control the errors when you draw
inferences from your data. The H0 can be anything. The alpha that we use is the probability of a
significant result when the H0 is true; this is a type 1 error. Beta is the probability of a non-significant result when the alternative hypothesis is true; this is a type 2 error. 1 − β is the statistical power: the probability of a significant result when the alternative hypothesis is true.
True negative: H0 is true and you have a non-significant result.
True positive: alternative hypothesis is true and you have a significant effect.
When you want to improve the probability of finding a true positive, you should examine hypotheses that are likely to be true.
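Power can be estimated by simulation as the long-run proportion of significant results under a true effect (a minimal sketch, assuming a z test with known variance; the effect size and sample size are hypothetical):

```python
import math
import random

random.seed(2)

def two_sided_p(z):
    """Two-sided p-value for a z statistic under the standard normal."""
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# Hypothetical design: true effect d = 0.5, n = 64 per group, alpha = .05.
# Under H1 the z statistic is approximately N(d * sqrt(n / 2), 1).
d, n, alpha, sims = 0.5, 64, 0.05, 20_000
noncentrality = d * math.sqrt(n / 2)

significant = sum(
    two_sided_p(random.gauss(noncentrality, 1)) < alpha for _ in range(sims)
)
power = significant / sims
print(round(power, 2))  # around 0.80 for this design
```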
Week 1 assignments
Which p values you can expect is completely determined by the statistical power (probability that
you will observe a significant effect, if there is a true effect). When there is no true effect, p-values
are uniformly distributed under the null, so every p-value is equally likely. But when there is a true
effect, the p-value distribution depends on the power.
Lindley’s paradox: a result can be unlikely when the null hypothesis is true, but it can be even more unlikely assuming the alternative hypothesis is true and power is very high.
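The paradox can be demonstrated by simulation (a sketch; the choice of a z statistic with mean 4 under H1, giving very high power, is a hypothetical example):

```python
import math
import random

random.seed(3)

def two_sided_p(z):
    """Two-sided p-value for a z statistic under the standard normal."""
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# Compare how often p lands just below .05 (between .03 and .05) when H0
# is true versus when there is a true effect studied with very high power.
sims = 50_000
p_h0 = [two_sided_p(random.gauss(0, 1)) for _ in range(sims)]
p_h1 = [two_sided_p(random.gauss(4, 1)) for _ in range(sims)]

frac_h0 = sum(0.03 < p < 0.05 for p in p_h0) / sims
frac_h1 = sum(0.03 < p < 0.05 for p in p_h1) / sims

# With very high power, a p-value just below .05 is *rarer* under H1 than
# under H0: such a "significant" result actually favors H0.
print(frac_h0, frac_h1)
```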
Week 2
Lecture 2.1
Likelihoods are a way to express the relative evidence for one hypothesis over another, and they also underlie Bayesian statistics. A likelihood function gives the likelihood of a parameter given the data, and with it you can check how likely each of your hypotheses is.
A likelihood ratio of 8 is considered moderately strong evidence, and a likelihood ratio of 32 is strong evidence.
Likelihood ratios are relative evidence for the alternative hypothesis compared to the null hypothesis, and both hypotheses might still be quite unlikely.
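A likelihood ratio can be computed directly for binomial data (a sketch; the coin-flip numbers and the two candidate hypotheses are hypothetical):

```python
from math import comb

def binom_lik(theta, k, n):
    """Binomial likelihood of observing k successes in n trials."""
    return comb(n, k) * theta**k * (1 - theta)**(n - k)

# Hypothetical data: 8 heads in 10 coin flips. Compare the hypothesis
# that the coin lands heads 80% of the time against a fair coin.
k, n = 8, 10
lr = binom_lik(0.8, k, n) / binom_lik(0.5, k, n)
print(round(lr, 2))  # about 6.87: moderate relative evidence for theta = 0.8
```

Note that the ratio only compares these two hypotheses; a third hypothesis could fit the data better than both.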
Lecture 2.2
Very often you have some prior belief that you want to combine with the evidence, and Bayesian statistics allows this. It is not possible with p-values, which express the probability of the data, or more extreme data, assuming that H0 is true.
Posterior probability: the probability that the H0 is true, given the data that you have collected.
You combine your prior belief and the data into a posterior belief. We can calculate the posterior odds: the probability that the alternative hypothesis is true given the data, compared to the probability that H0 is true given the data. The posterior odds are the likelihood ratio times the prior odds. A beta prior is determined by an alpha and a beta parameter. If we set alpha and beta in the beta distribution to 1, the prior is a perfectly flat line: every value is equally likely, so we don't have any expectations and say that anything is possible (an uninformative prior).
The beta prior distribution and the likelihood function can be combined into the posterior distribution, which also has an alpha and a beta value. These are determined by adding the alpha of the prior to the alpha of the likelihood (and the same for the beta).
The Bayes factor is the relative evidence for one model compared to another model.
Bayesian estimation estimates which values we think are most likely. We also use the posterior
distribution to estimate plausible values, based on the prior and the data.
A credible interval contains 95% of the values that you find most plausible.
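The conjugate update and the credible interval can be sketched for binomial data (the flat prior and the 8-out-of-10 data are hypothetical; the interval is read off a grid approximation of the posterior):

```python
def beta_pdf(x, a, b):
    """Unnormalised Beta(a, b) density (the constant cancels below)."""
    return x**(a - 1) * (1 - x)**(b - 1)

# Hypothetical data: 8 successes, 2 failures, with a flat Beta(1, 1) prior.
# Conjugate update: add the successes to alpha and the failures to beta.
prior_a, prior_b = 1, 1
successes, failures = 8, 2
post_a, post_b = prior_a + successes, prior_b + failures  # Beta(9, 3)

# Approximate the posterior on a grid to read off the 95% credible interval.
grid = [i / 10_000 for i in range(1, 10_000)]
dens = [beta_pdf(x, post_a, post_b) for x in grid]
total = sum(dens)
cum, lo, hi = 0.0, None, None
for x, d in zip(grid, dens):
    cum += d / total
    if lo is None and cum >= 0.025:
        lo = x
    if hi is None and cum >= 0.975:
        hi = x

print(post_a / (post_a + post_b))  # posterior mean: 0.75
print(lo, hi)                      # the 95% credible interval
```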
Lecture 2.3
If something is pretty rare to begin with, a positive test result might not tell you much yet, and follow-up tests are required. Taking prior information into account can lead to better inferences.
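This base-rate effect follows directly from Bayes' theorem (a sketch; the prevalence, sensitivity, and specificity figures below are hypothetical illustration values):

```python
# Hypothetical numbers: a disease with 1% prevalence, a test with 90%
# sensitivity and 95% specificity.
prevalence = 0.01
sensitivity = 0.90       # P(positive | disease)
false_positive = 0.05    # 1 - specificity = P(positive | no disease)

# Bayes' theorem: P(disease | positive test).
p_positive = sensitivity * prevalence + false_positive * (1 - prevalence)
posterior = sensitivity * prevalence / p_positive
print(round(posterior, 3))  # about 0.154: most positives are false positives
```

Even after a positive result, the probability of disease is only about 15%, because the low prior (prevalence) dominates.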