Very extensive lecture notes Research Methods II (1a through 7a)

Over the past weeks I transcribed all the Research Methods 2 lectures almost literally word for word. The document is therefore a kind of summary of all the lectures in plain, simple language, with many of the examples given by Mr. Lehr.


  • March 19, 2021
  • 132
  • 2020/2021
  • Class notes
  • Alex lehr
  • All classes
Research Methods II – Notes

Table of Contents
Week 1 – lecture A
Week 1 – lecture B (Central tendency, dispersion)
Week 2 – lecture A (Regression analysis, simple linear regression model, model estimation and model fit)
Week 2 – lecture B (Elaboration of OLS estimation of coefficients and standard error, t-test, model fit, from descriptive statistics to linear regression)
Week 3 – lecture A (Regression and correlation, multiple regression, model comparison, coefficients, multicollinearity, sample size)
Week 3 – lecture B (Multiple regression, standardized coefficients, nested models and multicollinearity)
Week 4 – lecture A (Regression analysis, F test as multi-parameter test, modelling non-linearity, Linear OLS regression)
Week 4 – lecture B (Bias and assumptions, outliers, causality: exogeneity and endogeneity)
Week 5 – lecture A (recap multiple regression to estimate causal effects, causal mechanism and causal effects, moderation interaction variables and centering, mediation path analyses)
Week 5 – lecture B (recap moderation and mediation, interpreting interaction effects, mediation, suppression and spurious relationships)
Week 6 – lecture A (linear multi regression, categorical predictors (dummy coding and interpretation), categorical dependent variables)
Week 6 – lecture B (categorical variables, working with dummies, exponential and logarithmic transformations, probabilities, odds and log-odds)
Week 7 – lecture A (odds & log-odds, problems with using linear regression on categorical Y’s, introduction to logistic regression, interpretation of effects in logistic regression)





Week 1 – lecture A
Lecture A of week 1 is mainly an explanation of the course and of how the course manual is structured. I did not take notes on it because that was not necessary.


Week 1 – lecture B (Central tendency, dispersion)
We are going to talk about three basic concepts which are: Central tendency, Dispersion and
Inference: from sample to population.

Inference is the challenge of saying something about the population based on a sample.
Greek letters → population.
Roman letters → sample.

Central tendency (1)
By central tendency we mean that we want to describe the central value of the distribution of a variable. Because we describe, it is descriptive, and we do it with a single number: a descriptive statistic that describes the central value of the distribution of a variable.
A variable is some characteristic that we have measured for our cases (our observations). It can vary: it can take different values for different cases.
Example → the variable is how much students like statistics. Some students love statistics, some not so much, and some actively hate it. So different students take different values on the variable "liking statistics". We get differences, and we can look at the distribution of those values.

The extreme values will not be that common. The group of students who absolutely hate statistics will not be very large, and neither will the group who absolutely love statistics.

Central tendency = measure of location (Dutch: liggingsmaat, centrummaat). A measure of central tendency gives an impression of the centre of a set of data or of a distribution.
Central value = the idea that there is a central value somewhere that indicates what is most likely, where most values are concentrated.
Mean = arithmetic mean. This one is fundamental for the course. You have to understand what a mean is, how to interpret it and how to calculate it.
Median = line up all the values from small to large and cut your data in the middle; the value at the cut is the median.
Mode = the most often observed value.
Both the median and the mode are not very relevant for this course.
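
The three measures defined above can be tried out in a few lines. This is only a sketch: the scores are hypothetical values on a 1 to 10 "liking statistics" scale (they are not from the lecture), and it uses Python's standard `statistics` module.

```python
import statistics

# Hypothetical scores on a 1-10 "liking statistics" scale (not from the lecture).
scores = [2, 4, 5, 5, 5, 6, 6, 7, 9]

mean = statistics.mean(scores)      # sum of the values divided by their count
median = statistics.median(scores)  # middle value after sorting small to large
mode = statistics.mode(scores)      # most often observed value

print("mean:", round(mean, 2))
print("median:", median)
print("mode:", mode)
```

In this made-up example the three measures lie close together, which is what you expect when the values pile up in the middle of the scale.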

The x-axis is horizontal and indicates the values of the variable.
The shape of the curve indicates how many observations we have for each of those values. You see the bell-shaped curve: the extreme values are not so common and the values in between (more towards the middle of the scale) are more common. The idea of the mean is that it indicates the top of the distribution, the central value. In other words → the most likely value that you can find in your data (this holds for a symmetric, bell-shaped distribution like this one). The mean is a measure of central tendency that is supposed to reflect the top of this distribution.

You can calculate the mean by using a formula. If you want to know the population mean, you use Greek letters. If you want to know the sample mean, you use Roman letters.
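
The formulas themselves did not survive in these notes. As a sketch, the standard definitions are given below, assuming N is the population size, n the sample size and x_i the value of case i (the notes only fix mu for the population and the index i; the other symbols are assumptions).

```latex
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i
\qquad\qquad
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i
```

The calculation is the same in both cases; only the notation changes to show whether you describe the population or the sample.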


Central tendency (2)
You are interested in analysing the exam grades for PRSM II of a random sample of five students from that course, so the sample size is five. The first student gets index number i = 1, the second student gets i = 2, and so on; i is the index number.

The first student got an 8. The second student failed the exam and got a 4, and so on.
When you see a question mark (something that needs to be filled in) → do this yourself.

Deviation scores are useful when you do not only want to look at the values that the different cases get, but also at how far those values are removed from the overall mean value in the sample or population. That is what the deviation score measures: it tells you the difference between the observed value for each case and the mean value.

Deviation score → take the observed value of that specific case on your variable X and subtract the mean value from it.
The squared deviation scores are never negative. The raw deviation scores always add up to zero, but the sum of the squared deviation scores is not zero by definition.

The sum of all these squared deviation scores is 23.2. This number tells us how big the deviations are across all observations relative to the sample mean: whether the observations are very close to the sample mean or further removed from it.
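
The deviation scores can be recomputed with a short sketch. Note the assumption: the notes only state the first two grades (8 and 4); the other three grades below are hypothetical, chosen so that the mean (5.4) and the sum of squared deviations (23.2) match the numbers in the notes.

```python
import math

# Only the first two grades (8 and 4) are stated in the notes. The last three
# are hypothetical, chosen to reproduce the stated mean and sum of squares.
grades = [8, 4, 8, 4, 3]

n = len(grades)
mean = sum(grades) / n

# Deviation score: observed value minus the mean.
deviations = [x - mean for x in grades]
squared_deviations = [d ** 2 for d in deviations]
sum_of_squares = sum(squared_deviations)

print(f"mean = {mean:.1f}")
print("deviation scores:", [round(d, 1) for d in deviations])
print("raw deviations sum to zero:",
      math.isclose(sum(deviations), 0.0, abs_tol=1e-9))
print(f"sum of squared deviations = {sum_of_squares:.1f}")
```

The raw deviations cancel each other out, which is why they are squared before they are added up.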

The mean is a measure of central tendency.

Dispersion
Dispersion is also a descriptive statistic → it describes something about our variable. It describes the variability of that variable: how much do the individual observations differ from each other overall, how much variation is there in our data?
It tries to connect a number to the variability (how many different answers are given? If there are many different answers, there is a lot of variability).

There are four measures of dispersion → variance, standard deviation, range and interquartile range.
Range = the difference between the highest and lowest value.
Interquartile range = line up all the observations from small to large and cut the data at 25% and 75% (Q1 and Q3). The interquartile range is the distance between those two cut points (Q3 minus Q1), so it covers the middle 50% of the data.
Neither the range nor the IQR is very important.

Population variance = for every case in the population, take the squared deviation score (Xi minus mu, to the power of 2), add these together for all cases in the population, and divide the sum by the population size.

Sample variance = if you want to calculate the sample variance, you always divide by n minus 1. The minus one has to do with the concept of degrees of freedom: how much unique information you can draw from a given, finite set of observations.
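
The two variance formulas are missing from these notes. Written out from the verbal definitions above, and assuming N for the population size, n for the sample size and x-bar for the sample mean, they would read:

```latex
\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}
\qquad\qquad
s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}
```

The numerators have the same form (a sum of squared deviation scores); the difference is in the denominator.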

WHY n MINUS 1? → When you calculate a variance based on a sample, you limit the number of observations that can vary to those in the sample. That sample tends to show a little less variability than it should if it is supposed to reflect the true variance. To compensate for that, we divide by a number that is a little bit smaller, so we get a slightly larger value for the sample variance and blow it up to the right size. Mathematically it makes sense to use n minus 1.
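
The effect of dividing by n minus 1 can be checked with a small simulation. This is a sketch under stated assumptions: the population (the values 1 to 10, each equally likely), the seed and the number of repetitions are all hypothetical choices, not something from the lecture.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population: the values 1..10, each equally likely.
population = list(range(1, 11))
true_variance = statistics.pvariance(population)  # divides by N

n = 5                # sample size, as in the grades example
replicates = 10_000  # number of samples drawn

divide_by_n = []
divide_by_n_minus_1 = []
for _ in range(replicates):
    sample = rng.choices(population, k=n)  # random draw with replacement
    divide_by_n.append(statistics.pvariance(sample))         # sum of squares / n
    divide_by_n_minus_1.append(statistics.variance(sample))  # sum of squares / (n - 1)

avg_n = statistics.fmean(divide_by_n)
avg_n_minus_1 = statistics.fmean(divide_by_n_minus_1)

print(f"population variance:         {true_variance:.2f}")
print(f"average when dividing by n:   {avg_n:.2f}")
print(f"average when dividing by n-1: {avg_n_minus_1:.2f}")
```

Averaged over many samples, dividing by n gives a value that is too small, while dividing by n minus 1 lands close to the population variance.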

Why the square root? → It is the reverse of squaring. The variance is expressed in squared units, so if you want to go back to the original units you take the square root. The result is the standard deviation.

If there is not much variability, the standard deviation will be small and the curve will be very narrow and peaked.

Calculating the sample variance → for each individual observation of our variable X (in this case the grade), subtract the mean value. Square these differences (the squared deviation scores), take the sum of those, and divide it by the sample size minus 1.
The sum of squared deviation scores here is 23.2, which is divided by the sample size of 5 minus 1, so 23.2 / 4 = 5.8.

The average squared distance to the average grade of 5.4 for this sample is 5.8; that is the variance of the grades.
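
The same calculation as a runnable sketch, extended with the standard deviation. As before, only the grades 8 and 4 come from the notes; the other three are hypothetical values that reproduce the stated mean (5.4) and sum of squares (23.2).

```python
import math
import statistics

# Only the first two grades (8 and 4) come from the notes; the other three are
# hypothetical, chosen to reproduce the stated mean (5.4) and sum of squares (23.2).
grades = [8, 4, 8, 4, 3]

n = len(grades)
mean = sum(grades) / n
sum_of_squares = sum((x - mean) ** 2 for x in grades)

sample_variance = sum_of_squares / (n - 1)  # 23.2 / 4
sample_sd = math.sqrt(sample_variance)      # back to the original grade units

# The standard library's sample variance also divides by n - 1.
library_variance = statistics.variance(grades)

print(f"variance = {sample_variance:.1f}, standard deviation = {sample_sd:.2f}")
print("matches statistics.variance:", math.isclose(sample_variance, library_variance))
```

The standard deviation of about 2.41 is in grade points, which makes it easier to interpret than the variance of 5.8 squared grade points.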


Inference: from sample to population (1)
We observe things but we want to say something about the thing we do not observe.
Inference means taking a sample and using that sample to say something about a larger
population.

The question is whether this is possible: can we really use the sample from the population to make statements about the whole population? For example → as a teacher you grade the exams of five of your thirty students. The mean of those first five is a 5.4, so you give the remaining students a 5.4 as their final grade.

We can use our sample mean as a (point) estimate (statistic) to make inferences about the population parameter.
The sample mean is called an estimate (it is not the true population value, so it is not exactly correct).
Point estimate → an estimate that is a single number. Sometimes also referred to as a statistic.
Population parameter → the population value that we want to say something about, for example the population mean. It is usually unknown.

We can use these estimates to make inferences about the population parameters that we actually want to say something about. The population parameter is not known because you have not observed the whole population.
The sample mean is something that you can observe: you can calculate it (it is known).
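
The difference between the estimate and the parameter can be made concrete with a small sketch. Everything in it is an assumption for illustration: the class of 30 grades is randomly generated, not real data, and the seed is arbitrary.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the example is repeatable

# Hypothetical population: exam grades (1-10) for a class of 30 students.
population = [rng.randint(1, 10) for _ in range(30)]
population_mean = statistics.fmean(population)  # the parameter, normally unknown

# Random sample of five students, drawn without replacement.
sample = rng.sample(population, k=5)
sample_mean = statistics.fmean(sample)          # the point estimate, which we can observe

print(f"sample mean (estimate):      {sample_mean:.2f}")
print(f"population mean (parameter): {population_mean:.2f}")
print(f"estimation error:            {sample_mean - population_mean:+.2f}")
```

In practice only the first printed number is available; the sketch can show the other two because the whole population was made up in advance.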

