
Summary of the lecture(s) of Quantitative Methods


Summary of all lectures in the course Quantitative Methods. The material is summarized per theme, and the literature was also consulted.


  • 11 January 2023
  • 24 pages
  • 2022/2023
  • Summary
evavelthuijs
Summary Quantitative Methods
Lectures

Theme 1: Intro, Variables and Techniques, OLS
Inductive research starts from observations and works toward new theories or methods; it is
often associated with qualitative research. Deductive research starts from current theories
and methods and tests them against data; it is often associated with quantitative research.

Validity is whether an instrument measures what it sets out to measure.
Reliability is whether an instrument can be interpreted consistently across different situations.

There are different variable types and methods of analysis:
1. Response variable (dependent variable) VS explanatory variable (independent
variable)
The dependent variable is a variable thought to be affected by changes in an independent
variable. You can think of this variable as an outcome.
The independent variable is a concept that you use to explain the outcome. For
example: size, age
2. Manifest VS latent variable
A manifest variable is a concept that is directly measurable and is used directly in the
analysis, think of age or gender.
A latent variable is a concept that is not directly observable, like globalization, which
is approached through an observable indicator such as international migration

It is important to know which variable you need and to combine it with a level of measurement.
There are three different levels of measurement:
- Nominal
The categories have no order and no ranking, like the term ‘color’
We use frequencies; there is no order and no mean.
- Ordinal
The categories are ranked, like the terms ‘satisfaction’ or ‘rank’
Ordinal values have an order, but the distances between the categories may not be equal.
- Interval/Ratio (Metric)
Values measured on a numeric scale, like weight or age, for which the mean and the median
are meaningful.

Dependent variable           Method                            Theme
Metric (interval + ratio)    Linear regression (OLS)           Theme 1
Ordinal                      Ordered logit regression
Nominal                      Multinomial logistic regression
- Binary/binomial            Logistic regression               Theme 2
Changes in time (metric)     Time series analysis              Theme 4
Spatial dimension            Spatial analyses                  Theme 4

Linear regression
- The conditional expectation of a continuous variable Y is expressed as a linear function
of the explanatory variables X1, X2
- Specific observations deviate randomly from the expected value, so we add a random
error term to the model:

Y = β0 + β1·X1 + β2·X2 + ε

- The regression line is estimated with the least squares method: take the line for
which the sum of squared residuals is as small as possible. The residual is the difference
between the observed and the predicted value.
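The least squares idea can be illustrated with a small sketch for a single explanatory variable. The data below are made up for illustration and the helper name `fit_ols` is hypothetical; with more explanatory variables you would normally rely on statistical software such as SPSS.

```python
def fit_ols(x, y):
    """Return (intercept, slope) of the least squares line through (x, y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Closed-form least squares solution for one explanatory variable.
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up illustrative data, not taken from the lectures.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

intercept, slope = fit_ols(x, y)
predicted = [intercept + slope * xi for xi in x]
# Residual = observed value minus predicted value.
residuals = [yi - pi for yi, pi in zip(y, predicted)]
sum_sq_res = sum(e ** 2 for e in residuals)

print(f"Y = {intercept:.2f} + {slope:.2f} * X")
print(f"sum of squared residuals = {sum_sq_res:.2f}")
```

Any other line through these points gives a larger sum of squared residuals, which is exactly what "least squares" means.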

R-square (goodness of fit) measures how well the model fits the observations: the share of
the variation of Y that is explained by the model. If the dots lie perfectly on the regression
line, then the R-square is 100%. With the R-square you can answer two questions: can we use
the model, and how well does the model fit?
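As a sketch of how the R-square follows from the residuals: it is one minus the ratio of the unexplained variation (sum of squared residuals) to the total variation of Y around its mean. The data and the function name `r_squared` are made up for illustration.

```python
def r_squared(x, y):
    """R-square of the least squares line of y on a single variable x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / s_xx
    intercept = mean_y - slope * mean_x
    # Unexplained variation: squared residuals around the fitted line.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    # Total variation: squared deviations of Y around its mean.
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Made-up illustrative data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
noisy_y = [2.0, 4.0, 5.0, 4.0, 5.0]
perfect_y = [3.0, 5.0, 7.0, 9.0, 11.0]  # all dots exactly on one line

print(f"noisy data:   R-square = {r_squared(x, noisy_y):.2f}")
print(f"perfect line: R-square = {r_squared(x, perfect_y):.2f}")
```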

Check the model assumptions before you draw conclusions. You need to check whether the model
has a ‘good fit’ before you use it.
1. The sample consists of independent observations. The indicators also need to be
independent of each other. For example, age and size are independent.
2. A linear model is suitable, that is, the relationship between the dependent variable
and the independent variables is linear.
3. The variance of the residuals is equal for all possible values of the independent
variables (constant variance or homoscedasticity → similar variances in the different
groups being compared)
The residuals need to be spread evenly around the zero line across the whole range;
otherwise the tests are unreliable for part of that range
4. The residuals are normally distributed
The residuals follow a bell-shaped distribution. This matters because it is a requirement
for hypothesis testing. In a normal distribution, one standard deviation around the mean
covers about 2/3 (68%) of the observations, 95% of the distribution lies within two
standard deviations, and three standard deviations cover almost 100% (99.7%).
→ Linear regression models that predict non-metric dependent variables fail to meet these
conditions. Therefore we use non-linear regression models: the discrete choice models.
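The shares mentioned under assumption 4 (how much of a normal distribution lies within one, two and three standard deviations of the mean) can be checked with Python's standard library. This is only a small verification sketch; the function name `share_within` is made up.

```python
from statistics import NormalDist


def share_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    standard_normal = NormalDist(mu=0.0, sigma=1.0)
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.1%}")
```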




Outliers are the dots that lie far away from the regression line and from the other dots;
they do not follow the pattern of the other observations. Outliers affect the model when
there are too many of them or when they are extreme: they pull the regression line toward
themselves, so the line can end up lower (or higher) than it would be without them.
How can you detect outliers? Look for observations more than three standard deviations from
the mean, or look at boxplots, histograms, probability plots or scatter plots.
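The three-standard-deviations rule can be sketched as follows. The data are made up, the function name `flag_outliers` is hypothetical, and the rule is applied here to the raw values of one variable (the same check can be applied to residuals).

```python
from statistics import mean, stdev


def flag_outliers(values, threshold=3.0):
    """Return (index, value) pairs lying more than `threshold` SDs from the mean."""
    centre = mean(values)
    spread = stdev(values)  # sample standard deviation
    return [(i, v) for i, v in enumerate(values)
            if abs(v - centre) > threshold * spread]


# Made-up data: 19 ordinary values and one extreme value at the end.
values = [9, 10, 11, 10] * 4 + [9, 10, 11, 30]

print(flag_outliers(values))
```

Note that the extreme value itself inflates the mean and the standard deviation, so in very small samples this rule may not flag anything; plots remain a useful second check.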

Study the impact of influential cases:
- The idea is to compare regression outcomes with and without the influential cases
- SPSS: influence of a case on individual coefficients (DFBETA) and on the model’s predicted values (DFFIT)
- Influential cases have a Cook’s distance > 1: if Cook’s distance in SPSS is higher than 1,
the case may be an influential outlier
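A minimal sketch of both ideas for a regression with one explanatory variable: compute Cook's distance per case with the standard leverage-based formula, then refit the line without the flagged case. The data and function names are made up for illustration; SPSS reports these statistics directly.

```python
def fit_line(x, y):
    """Return (intercept, slope) of the least squares line."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / s_xx
    return mean_y - slope * mean_x, slope


def cooks_distances(x, y):
    """Cook's distance of every case in a one-variable regression."""
    n = len(x)
    p = 2  # estimated parameters: intercept and slope
    mean_x = sum(x) / n
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    intercept, slope = fit_line(x, y)
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    mse = sum(e ** 2 for e in residuals) / (n - p)
    distances = []
    for xi, e in zip(x, residuals):
        leverage = 1 / n + (xi - mean_x) ** 2 / s_xx
        distances.append(e ** 2 / (p * mse) * leverage / (1 - leverage) ** 2)
    return distances


# Made-up data: nine cases on a straight line plus one deviating case.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 20]
y = [1, 2, 3, 4, 5, 6, 7, 8, 9, 5]

distances = cooks_distances(x, y)
influential = [i for i, d in enumerate(distances) if d > 1]
print("cases with Cook's distance > 1:", influential)

# Compare the regression outcome with and without the influential cases.
kept = [i for i in range(len(x)) if i not in influential]
_, slope_with = fit_line(x, y)
_, slope_without = fit_line([x[i] for i in kept], [y[i] for i in kept])
print(f"slope with all cases: {slope_with:.2f}, without: {slope_without:.2f}")
```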

Test for multicollinearity

Multicollinearity means that the correlation between two (or more) explanatory variables is
too high, so you cannot tell which of the explanatory variables causes the effect or is
connected to the outcome. You can check it with the correlation coefficient R between the
explanatory variables (R should stay below 0.8 or 0.9).
- The problem with multicollinearity is that the standard errors of the regression
coefficients increase. The coefficients become untrustworthy.
- It limits the size of R
- The interpretation of the relevance of individual explanatory variables becomes impossible
Rules of thumb for detection:
VIF > 10 (or tolerance < 0.1) → indicates a serious problem of multicollinearity
VIF substantially higher than 1 (or tolerance < 0.2) → multicollinearity may be a problem.
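To show where VIF and tolerance come from, here is a sketch for the simplest case of exactly two explanatory variables. In that case the tolerance equals one minus the squared correlation between them, and the VIF is one divided by the tolerance. With more than two explanatory variables each variable has to be regressed on all the others, which this sketch does not cover. Data and function names are made up.

```python
def correlation(a, b):
    """Pearson correlation coefficient between two lists of numbers."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    s_ab = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    s_aa = sum((ai - mean_a) ** 2 for ai in a)
    s_bb = sum((bi - mean_b) ** 2 for bi in b)
    return s_ab / (s_aa * s_bb) ** 0.5


def vif_two_predictors(x1, x2):
    """Return (VIF, tolerance) when x1 and x2 are the only two predictors."""
    r = correlation(x1, x2)
    tolerance = 1 - r ** 2
    return 1 / tolerance, tolerance


# Made-up data: x2 is almost twice x1, x3 is unrelated to x1.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 4, 6, 8, 10, 13]
x3 = [3, 1, 2, 2, 1, 3]

vif_high, tol_high = vif_two_predictors(x1, x2)
vif_low, tol_low = vif_two_predictors(x1, x3)
print(f"x1 and x2: VIF = {vif_high:.1f}, tolerance = {tol_high:.3f}")
print(f"x1 and x3: VIF = {vif_low:.1f}, tolerance = {tol_low:.3f}")
```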

Linear regression: Model extensions and alternative model specifications
1. Dummy variables
Dummy variables are used to include qualitative variables in a regression. A dummy mostly
produces an upward or downward parallel shift of the regression line. Dummy variables let
you split the results into groups, so you can see in which category each result is placed
and easily see the difference between the categories. With fixed effects you filter the
category differences out of the model. The downside of dummies is that a dummy captures
everything that differs between the groups, so you do not see what it really is: you do not
know what is driving the difference and what the main cause is. The dummy corrects for the
problem but does not tell the story behind it. Dummies are reliable, but you do not know
what they are capturing.
We always use one dummy fewer than the number of categories, because the category without
a dummy is the reference category.

When do you need to use dummy variables?
Independent variable                  Use of dummy variable?
Continuous                            Not necessary
Ordinal                               Not necessary
Dichotomous                           Yes
Nominal (more than two categories)    Create help variables using dummies (number of dummies = number of categories minus 1)
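The rule "number of dummies = number of categories minus 1" can be sketched like this. The variable `regions`, its categories and the function name `make_dummies` are hypothetical examples.

```python
def make_dummies(values, reference):
    """Create 0/1 dummy columns for every category except the reference."""
    categories = sorted(set(values))
    if reference not in categories:
        raise ValueError(f"reference category {reference!r} not found")
    dummy_names = [c for c in categories if c != reference]
    rows = [[1 if value == name else 0 for name in dummy_names]
            for value in values]
    return dummy_names, rows


# Hypothetical nominal variable with three categories.
regions = ["north", "south", "east", "south", "north"]
names, rows = make_dummies(regions, reference="north")

print("dummy columns:", names)
for region, row in zip(regions, rows):
    print(f"{region:<6} {row}")
```

Cases in the reference category get a zero on every dummy, so the coefficients of the dummies are read as differences from that category.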

The fixed effects model
A fixed effects model includes a set of dummy variables, each of which represents a group. It
is useful to distinguish between within-group and between-group effects and to control for
intergroup variation that blurs the interpretation of the effects that are central to the study.
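The difference between pooled and within-group effects can be sketched with made-up data. Subtracting each group's mean from x and y and then fitting a line gives the same slope as a regression that includes a dummy per group; the function names below are hypothetical.

```python
def slope(x, y):
    """Least squares slope of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    return s_xy / s_xx


def demean_within_groups(values, groups):
    """Subtract the group mean from every value (the 'within' transformation)."""
    totals, counts = {}, {}
    for value, group in zip(values, groups):
        totals[group] = totals.get(group, 0.0) + value
        counts[group] = counts.get(group, 0) + 1
    return [value - totals[group] / counts[group]
            for value, group in zip(values, groups)]


# Made-up data: within each group y rises with x, but group B lies lower.
groups = ["A", "A", "A", "B", "B", "B"]
x = [1, 2, 3, 4, 5, 6]
y = [2, 3, 4, 1, 2, 3]

pooled_slope = slope(x, y)
within_slope = slope(demean_within_groups(x, groups),
                     demean_within_groups(y, groups))
print(f"pooled slope (groups ignored): {pooled_slope:.3f}")
print(f"within-group slope (fixed effects): {within_slope:.3f}")
```

In this example the difference between the groups hides the positive within-group relationship until the group level is filtered out.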

2. Interaction variables
We speak of an interaction if the effect of an independent variable on the dependent variable
is influenced by a second independent variable. In the linear model an interaction term (the
product of the two independent variables) is added, so that the effect can differ between
sections of the second variable. For example, you can divide the observations into low prior
education and high prior education. We speak of an interaction when the regression lines of
these groups are not parallel.
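A sketch of the "non-parallel lines" idea, assuming a model Y = b0 + b1·X + b2·D + b3·(X·D) with a dummy D for high prior education. For a two-group dummy, fitting a separate line per group gives the same coefficients as this single model, so b3 is simply the difference between the two slopes. The variables `hours` and `score_*` are made up.

```python
def fit_ols(x, y):
    """Return (intercept, slope) of the least squares line."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / s_xx
    return mean_y - slope * mean_x, slope


# Made-up data: study hours and exam scores for two groups.
hours = [1, 2, 3, 4]
score_low_prior = [2, 3, 4, 5]    # D = 0, reference group
score_high_prior = [2, 4, 6, 8]   # D = 1

b0, b1 = fit_ols(hours, score_low_prior)
high_intercept, high_slope = fit_ols(hours, score_high_prior)

b2 = high_intercept - b0   # coefficient of the dummy D (shift of the line)
b3 = high_slope - b1       # coefficient of the interaction term X * D

print(f"b0 = {b0:.2f}, b1 = {b1:.2f}, b2 = {b2:.2f}, b3 = {b3:.2f}")
print("lines parallel" if abs(b3) < 1e-9 else "lines not parallel: interaction")
```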

3. What to do in case of non-linearity?
1. Add a non-linear term
Quadratic regression model
2. Transformation of variables
Logarithm, square root, reciprocal of a number
3. Other model specifications
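As a sketch of option 2, a logarithmic transformation can turn a curved relationship into a straight line. The made-up data below follow Y = 3·X² exactly; after taking the logarithm of both variables, ordinary least squares recovers the exponent as the slope. All names and numbers are illustrative assumptions.

```python
import math


def fit_ols(x, y):
    """Return (intercept, slope) of the least squares line."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / s_xx
    return mean_y - slope * mean_x, slope


# Made-up non-linear data: Y = 3 * X^2.
x = [1, 2, 4, 8]
y = [3, 12, 48, 192]

# Transform both variables: log(Y) = log(3) + 2 * log(X) is linear.
log_x = [math.log(v) for v in x]
log_y = [math.log(v) for v in y]

intercept, exponent = fit_ols(log_x, log_y)
scale = math.exp(intercept)  # back-transform the intercept

print(f"estimated relationship: Y = {scale:.2f} * X^{exponent:.2f}")
```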
