Class notes

College notes ARMS

Course
Advanced Research Methods and Statistics (202200104)

Institution
Universiteit Utrecht (UU)

These are the complete lecture notes for the Advanced Research Methods and Statistics (ARMS) course at UU. Written in English. I passed the course with these notes.

Uploaded on January 9, 2024
Number of pages 28
Written in 2022/2023
Type Class notes
Professor(s) Unknown
Contains All classes

Education
Psychology
clairenooijen

Lecture 1: Multiple Linear Regression

Two different frameworks:
- Frequentist framework: still mainstream (but Bayes is catching up), based on
null hypothesis testing, p-values, confidence intervals, effect sizes, power analysis.
- Bayesian framework: increased attention since replication crisis -> against incorrect
interpretations of test results, p-hacking, underpowered studies, publication bias.

Both frameworks learn from data collected in empirical research. The information in these data
is captured in a likelihood function.
- On x-axis: values for mu.
- On y-axis: likelihood of each value for mu given the observed data.
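As a minimal sketch of what such a likelihood function looks like, the Python snippet below evaluates the likelihood of a range of candidate values for mu. It assumes normally distributed data with a known SD of 1; the sample, the grid, and the function name `likelihood` are made up for illustration and are not part of the lecture.

```python
from statistics import NormalDist

# Hypothetical sample (made up for illustration); the data SD is assumed known.
data = [4.8, 5.1, 5.6, 4.9, 5.3]
sigma = 1.0

def likelihood(mu, observations, sd):
    """Likelihood of mu: product of the normal densities of all observations."""
    result = 1.0
    for x in observations:
        result *= NormalDist(mu, sd).pdf(x)
    return result

# Candidate values for mu (the x-axis of the likelihood plot).
grid = [4.0 + 0.01 * i for i in range(201)]  # 4.00, 4.01, ..., 6.00

# Likelihood of each candidate value (the y-axis of the likelihood plot).
values = [likelihood(mu, data, sigma) for mu in grid]

# Under these assumptions the likelihood peaks at the sample mean.
best_mu = grid[values.index(max(values))]
sample_mean = sum(data) / len(data)
print(round(best_mu, 2), round(sample_mean, 2))
```

Plotting `values` against `grid` would give the curve described above, with its peak at the value of mu that makes the observed data most likely.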

In frequentist approach: all relevant information for inference is contained in the likelihood
function.
In Bayesian approach: in addition to the likelihood function to capture the information in
the data, we may also have prior information about mu.
- Central idea: prior knowledge is updated with information in the data and together
provides posterior distribution for mu.
o Advantage: accumulating knowledge.
o Disadvantage: results depend on choice of prior.
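The updating idea can be sketched for one simple special case: a normal prior for mu combined with normal data whose SD is assumed known. In that case the posterior is again normal and can be computed by hand. All numbers below are hypothetical and only serve to show how prior and data are combined.

```python
# Hypothetical numbers, chosen only for illustration.
prior_mean, prior_sd = 4.0, 1.0      # prior knowledge about mu
data = [4.8, 5.1, 5.6, 4.9, 5.3]     # observed data
sigma = 1.0                          # data SD, assumed known

n = len(data)
sample_mean = sum(data) / n

# Precision = 1 / variance. Precisions add up when prior and data are combined.
prior_precision = 1 / prior_sd**2
data_precision = n / sigma**2
post_precision = prior_precision + data_precision

# The posterior mean is a precision-weighted average of prior mean and sample mean.
post_mean = (prior_precision * prior_mean + data_precision * sample_mean) / post_precision
post_sd = (1 / post_precision) ** 0.5

print(round(post_mean, 2), round(post_sd, 2))
```

The posterior mean lies between the prior mean and the sample mean, and the posterior SD is smaller than the prior SD: knowledge accumulates. Changing `prior_mean` or `prior_sd` changes the result, which is the dependence on the prior noted above.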

Bayesian estimates and probability
The posterior distribution of the parameter(s) of interest provides all desired estimates:
- Posterior mean or mode: the mean or mode of the posterior distribution.
- Posterior SD: SD of posterior distribution (comparable to frequentist standard error).
- Posterior 95% credible interval: providing the bounds of the part of the posterior in
which 95% of the posterior mass is.
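These summaries can be illustrated with a small sketch. It assumes a hypothetical normal posterior for mu (mean 4.95, SD 0.41) and draws simulated values from it, similar to the output Bayesian software typically provides. The interval computed here is the equal-tailed version (2.5th to 97.5th percentile), which is one common choice.

```python
from statistics import NormalDist, mean, quantiles, stdev

# Hypothetical posterior for mu; the numbers are made up for illustration.
posterior = NormalDist(mu=4.95, sigma=0.41)
draws = posterior.samples(10_000, seed=1)

post_mean_est = mean(draws)   # posterior mean
post_sd_est = stdev(draws)    # posterior SD

# Equal-tailed 95% credible interval: 2.5th and 97.5th percentiles of the draws.
cuts = quantiles(draws, n=40)          # 39 cut points in steps of 2.5%
lower, upper = cuts[0], cuts[-1]

# Share of posterior draws inside the interval (should be close to 0.95).
inside = sum(lower <= d <= upper for d in draws) / len(draws)
```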

Bayes conditions on the observed data (probability that hypothesis Hj is supported by the data),
whereas frequentist testing conditions on the null hypothesis (p-value = probability of
observing the same or more extreme data given that the null is true).

Researchers with hypotheses may prefer information on the probability that their hypotheses
are true.
- To what extent do the data support their hypotheses?
o PMP = Posterior Model Probability (the Bayesian probability of the hypothesis
after observing the data).
Bayesian probability of a hypothesis being true depends on 2 criteria:
- How sensible it is, based on current knowledge (the prior).
- How well it fits new evidence (the data).

Furthermore, Bayesian testing is comparative: hypotheses are tested against one another,
not in isolation.
This is also seen in the Bayes factor: BF10 = P(data | H1) / P(data | H0)
- BF10 = 10: support for H1 is 10 times stronger than for H0.
- BF10 = 1: support for H1 is as strong as support for H0.
Posterior probabilities of hypotheses (PMP) are also relative probabilities.
PMPs are an update of prior probabilities (for hypotheses) with the BF.
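The update from prior probabilities to PMPs can be sketched for the case of two hypotheses, H1 and H0, using the rule posterior odds = Bayes factor x prior odds. The function name `posterior_model_probs` and the example values are illustrative assumptions.

```python
def posterior_model_probs(bf10, prior_h1=0.5):
    """Return (PMP of H1, PMP of H0) when only H1 and H0 are compared."""
    prior_h0 = 1 - prior_h1
    # Posterior odds = Bayes factor * prior odds.
    posterior_odds = bf10 * prior_h1 / prior_h0
    pmp_h1 = posterior_odds / (1 + posterior_odds)
    return pmp_h1, 1 - pmp_h1

# Equal prior probabilities and BF10 = 10.
pmp_h1, pmp_h0 = posterior_model_probs(10)

# BF10 = 1: the data do not change the prior probabilities.
same_h1, same_h0 = posterior_model_probs(1)

# Same BF10 = 10, but a sceptical prior for H1.
sceptic_h1, sceptic_h0 = posterior_model_probs(10, prior_h1=0.1)
```

With equal priors, BF10 = 10 gives a PMP of 10/11 for H1. With a prior probability of 0.1 for H1, the same Bayes factor gives a PMP of only 10/19, which shows that PMPs depend on both the prior and the data.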

Definition of probability
Both frameworks use probability theory, but:
- Frequentists: probability is relative frequency (more formal?)
- Bayesians: probability is degree of belief (more intuitive?)
This leads to debate (same word used for different things) and to differences in the correct
interpretation of statistical results. E.g., p-value vs PMP; also:
- Frequentist 95% confidence interval (CI): If we were to repeat this experiment many
times and calculate a CI each time, 95% of the intervals would include the true
parameter value (and 5% would not).
- Bayesian 95% credible interval: There is a 95% probability that the true value lies in the
credible interval.
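The frequentist interpretation can be checked with a small simulation: repeat a hypothetical experiment many times, compute a 95% CI for the mean each time, and count how often the interval contains the true value. The true mean, SD, and sample size below are made up, and the SD is assumed known so that the simple z-interval (1.96 standard errors) can be used.

```python
import random
from statistics import mean

random.seed(2024)

# Hypothetical population and design; sigma is assumed known.
true_mu, sigma, n = 50.0, 10.0, 25
half_width = 1.96 * sigma / n**0.5   # half-width of the 95% z-interval

reps = 2000
hits = 0
for _ in range(reps):
    # One "experiment": draw a sample and build its confidence interval.
    sample = [random.gauss(true_mu, sigma) for _ in range(n)]
    m = mean(sample)
    if m - half_width <= true_mu <= m + half_width:
        hits += 1

# Proportion of intervals that contain the true mean (close to 0.95).
coverage = hits / reps
print(coverage)
```

Each single interval either contains the true value or it does not; the 95% refers to the long-run proportion over repeated experiments.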

Multiple linear regression

Least squares principle: the vertical distance between each observation and the line represents
the error. The blue line is drawn so that the sum of the squared residuals is as small as possible.
- Blue line: the linear regression model; it gives the predicted outcome.
- Intercept (b0): where the blue line crosses the y-axis. Value of y when x is 0.
- Slope (b1): how steep the line is. Change in y when x increases by 1.
- Residual: the vertical distance between a black dot (observation) and the blue line.
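For a single predictor, the least squares line can be computed directly: the slope is the sum of cross-products of the deviations of x and y divided by the sum of squared deviations of x, and the intercept follows from the means. The sketch below uses made-up data and hypothetical variable names.

```python
# Hypothetical data, made up for illustration.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Slope b1 = Sxy / Sxx; intercept b0 = mean(y) - b1 * mean(x).
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

def sum_squared_residuals(intercept, slope):
    """Sum of squared vertical distances between observations and the line."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

# Residuals of the fitted line: observed y minus predicted y.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
best = sum_squared_residuals(b0, b1)
```

Any other line, for example `sum_squared_residuals(b0 + 0.1, b1)` or `sum_squared_residuals(b0, b1 - 0.1)`, gives a larger sum of squared residuals than `best`.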

The variables must be continuous (interval/ratio level); dummy variables are the exception.

Model assumptions
- All results are only reliable if the assumptions made by the model and approach roughly
hold.
o Serious violations lead to incorrect results.
o Sometimes there are easy solutions (e.g. deleting a severe outlier; or adding a
quadratic term) and sometimes not.
o Per model, know what the assumptions are and always check them carefully.
- Multiple linear regression assumes interval/ratio variables (outcome and predictors).
Dummy variables are the exception.

Example RQ: Are gender and age predictors of grade?
- Grade on scale 0-10; numbers have numerical meaning. OK!
- Age in years; numbers have numerical meaning. OK!
- Gender coded as: 1 = male; 2 = female. Categorical; numbers do not have numerical
meaning. Not OK!
Multiple linear regression can handle dummy variables as predictors.
- A dummy variable takes the values 0 and 1 (e.g., Dmale_i = 1 for males; 0 for females)
grade_i = B0 + B1*Age_i + B2*Dmale_i
- Interpretation of B2: difference in mean grade between males and females with the
same age.
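The interpretation of B2 can be made concrete with a small sketch. The data below are hypothetical and generated without noise from grade = 5 + 0.1*Age + 0.5*Dmale, so the fit should recover these values. The helper `fit_ols` is an illustrative implementation that solves the normal equations in plain Python; in practice a statistics package would be used.

```python
def fit_ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Hypothetical, noise-free data (made up for illustration).
ages = [20, 22, 21, 23, 20, 24]
male = [1, 0, 1, 0, 0, 1]            # dummy: 1 = male, 0 = female
grades = [5.0 + 0.1 * a + 0.5 * d for a, d in zip(ages, male)]

# Design matrix: a column of 1s for the intercept, then Age and Dmale.
X = [[1.0, a, d] for a, d in zip(ages, male)]
b0, b1, b2 = fit_ols(X, grades)

# Predicted grades for a male and a female of the same age (21).
pred_male = b0 + b1 * 21 + b2 * 1
pred_female = b0 + b1 * 21 + b2 * 0
difference = pred_male - pred_female   # equals b2
```

Whatever age is filled in, the predicted difference between a male and a female of that age equals B2, because the age term cancels out.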

Evaluating the model
- With frequentist (classical) statistics:
o Estimate parameters of model
o Test with NHST if parameters are significantly non-zero, e.g.
▪ H0: R2 = 0 versus HA: R2 > 0 (R = multiple correlation coefficient =
correlation between Y and the predicted Y; R2 = the proportion of variance in Y
explained by the model).
▪ H0: 𝛽 = 0 versus HA: 𝛽 ≠ 0
- With Bayesian statistics:
o Estimate parameters of model
o Compare support in data for different models/hypotheses using Bayes factors

Example:
Can Life Satisfaction be predicted from Age and Years of education?
- y = Life Satisfaction, x1=Age; x2=Years of education.

Look at the adjusted R2: it estimates the proportion of explained variance in the population,
whereas R2 describes only the sample and tends to overestimate the population value.
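The relation between the two measures can be sketched with the usual formulas: R2 = 1 - SS_residual / SS_total, and adjusted R2 = 1 - (1 - R2)(n - 1)/(n - k - 1), where n is the sample size and k the number of predictors. The function names and example values below are illustrative assumptions.

```python
def r_squared(y, y_hat):
    """Proportion of variance in y explained by the predictions y_hat."""
    y_bar = sum(y) / len(y)
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    ss_residual = sum((yi - pi) ** 2 for yi, pi in zip(y, y_hat))
    return 1 - ss_residual / ss_total

def adjusted_r_squared(r2, n, k):
    """Adjusted R2 for a sample of size n and a model with k predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Hypothetical observed and predicted values.
example_r2 = r_squared([1, 2, 3, 4], [1.1, 1.9, 3.2, 3.8])

# Hypothetical model: R2 = 0.30 with two predictors (e.g. Age and Years of education).
small_sample = adjusted_r_squared(0.30, n=50, k=2)
large_sample = adjusted_r_squared(0.30, n=5000, k=2)
```

The adjustment is larger in small samples and with more predictors; in large samples adjusted R2 and R2 are nearly identical.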
