Lecture notes Quantitative Methods


  • 7 November 2022
  • 18 pages
  • 2022/2023
  • Lecture notes
  • Pascal Beckers, Martin van der Velde
  • Lectures on linear regression (partly), logistic regression, SEM, temporal analysis and spatial analysis (partly)
Lecture LRM

Variables can be split into categorical and continuous, and within these types there are different
levels of measurement:

- Categorical (entities are divided into distinct categories):
- Binary variable: There are only two categories (e.g., dead or alive).
- Nominal variable: There are more than two categories (e.g., whether someone is an omnivore,
vegetarian, vegan, or fruitarian).
- Ordinal variable: The same as a nominal variable but the categories have a logical order (e.g.,
whether people got a fail, a pass, a merit or a distinction in their exam).

- Continuous (entities get a distinct score):
- Interval variable: Equal intervals on the variable represent equal differences in the property being
measured (e.g., the difference between 6 and 8 is equivalent to the difference between 13 and 15).
- Ratio variable: The same as an interval variable, but the ratios of scores on the scale must also
make sense (e.g., a score of 16 on an anxiety scale means that the person is, in reality, twice as
anxious as someone scoring 8). For this to be true, the scale must have a meaningful zero point.

- The mean is the sum of all scores divided by the number of scores. The value of the mean can be
influenced quite heavily by extreme scores.
- The median is the middle score when the scores are placed in ascending order. It is not as
influenced by extreme scores as the mean.
- The mode is the score that occurs most frequently.
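
As a small illustration of the three measures above, the sketch below computes them with Python's standard `statistics` module. The scores are made up for the example (they do not come from the lecture); adding one extreme score shows how the mean shifts much more than the median or the mode.

```python
from statistics import mean, median, mode

# Hypothetical scores, chosen only to illustrate the definitions.
scores = [2, 3, 3, 4, 5]
with_outlier = scores + [40]  # the same scores plus one extreme score

mean_plain, median_plain, mode_plain = mean(scores), median(scores), mode(scores)
mean_out, median_out, mode_out = (
    mean(with_outlier),
    median(with_outlier),
    mode(with_outlier),
)

# The mean moves a lot, the median only slightly, the mode not at all.
print("without extreme score:", mean_plain, median_plain, mode_plain)
print("with extreme score:   ", mean_out, median_out, mode_out)
```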

The variance and standard deviation tell us about the shape of the distribution of scores. If the mean
represents the data well then most of the scores will cluster close to the mean and the resulting
standard deviation is small relative to the mean. When the mean is a worse representation of the
data, the scores cluster more widely around the mean and the standard deviation is larger. Figure
1.11 shows two distributions that have the same mean (50) but different standard deviations. One
has a large standard deviation relative to the mean (SD = 25) and this results in a flatter distribution
that is more spread out, whereas the other has a small standard deviation relative to the mean (SD =
15) resulting in a pointier distribution in which scores close to the mean are very frequent but scores
further from the mean become increasingly infrequent. The message is that as the standard
deviation gets larger, the distribution gets fatter. This can make distributions look platykurtic or
leptokurtic when, in fact, they are not.

- The deviance or error is the distance of each score from the mean.
- The sum of squared errors is the total amount of error in the mean. The errors/deviances are
squared before adding them up.
- The variance is the average squared distance of scores from the mean. It is the sum of squares divided by
the number of scores (or by N − 1 when it is estimated from a sample). It tells us how widely dispersed
scores are around the mean.
- The standard deviation is the square root of the variance. It is the variance converted back to the
original units of measurement of the scores used to compute it. Large standard deviations relative to
the mean suggest data are widely spread around the mean, whereas small standard deviations
suggest data are closely packed around the mean.
- The range is the distance between the highest and lowest score.
- The interquartile range is the range of the middle 50% of the scores.
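
The measures of dispersion listed above can be traced step by step in a short sketch. The scores are hypothetical. The variance here divides by N − 1 (the usual sample estimate), which is an assumption on top of the notes; dividing by N gives the population version. Quartiles can be defined in several ways, so the interquartile range from `statistics.quantiles` (default "exclusive" method) may differ slightly from what SPSS or a textbook reports.

```python
import math
from statistics import quantiles

# Hypothetical scores, used only to trace the definitions.
scores = [1, 3, 4, 3, 2]
n = len(scores)
mean_score = sum(scores) / n

# Deviance (error): distance of each score from the mean.
deviances = [x - mean_score for x in scores]

# Sum of squared errors: square each deviance, then add them up.
sum_of_squares = sum(d ** 2 for d in deviances)

# Variance: sum of squares divided by N - 1 (sample estimate, an assumption here).
variance = sum_of_squares / (n - 1)

# Standard deviation: square root of the variance, back in the original units.
standard_deviation = math.sqrt(variance)

# Range: highest minus lowest score.
value_range = max(scores) - min(scores)

# Interquartile range: distance between the first and third quartile.
q1, _, q3 = quantiles(scores, n=4)
interquartile_range = q3 - q1

print(sum_of_squares, variance, standard_deviation, value_range, interquartile_range)
```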

The standard error of the mean is the standard deviation of sample means. As such, it is a measure
of how representative of the population a sample mean is likely to be. A large standard error (relative
to the sample mean) means that there is a lot of variability between the means of different samples
and so the sample mean we have might not be representative of the population mean. A small
standard error indicates that most sample means are similar to the population mean (i.e., our sample
mean is likely to accurately reflect the population mean).
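
In practice there is usually only one sample, so the standard error is estimated from it. The sketch below uses the common estimate s / √N (sample standard deviation divided by the square root of the sample size); this formula is not spelled out in the notes and the sample values are hypothetical.

```python
import math
from statistics import mean, stdev

# Hypothetical sample of scores.
sample = [4, 6, 8, 10, 12]

sample_mean = mean(sample)
sample_sd = stdev(sample)  # sample standard deviation (divides by N - 1)

# Estimated standard error of the mean: s / sqrt(N).
standard_error = sample_sd / math.sqrt(len(sample))

print(f"mean = {sample_mean}, SD = {sample_sd:.3f}, SE = {standard_error:.3f}")
```

A larger sample with the same spread gives a smaller standard error, because the denominator √N grows.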


Article lecture + Q&A Discrete Choice modelling
Important things to consider when you judge the reliability of a scientific journal article for your
research, answered for the article lecture

The research aim: differences in entrepreneurial intention between northern and southern Europe.

Database: 2004 GEM data at the individual level. It is a survey in which people provide the
information themselves, so it is micro data about individuals.

Data restrictions: regional restrictions (only northern and southern Europe) and only data on
individuals who are already active. They must also be people who intend to become entrepreneurs
within 3 years. (Note: this research uses 3 years because, e.g., 10 years would be too long; the answer
then becomes a vague "maybe I will be an entrepreneur".)

Dependent variable: intention to become an entrepreneur within 3 years

Quantitative analysis method: discrete choice (yes/no: 0 = no, 1 = yes)

Does the model in the journal meet the conditions?: not entirely, some things are missing. The authors
do not look at the residuals (error terms) of the model.

How is the overall model evaluated?

Odds ratio, how is that interpreted?

Why is there a separate regression?: first they run the model with all data, then separately for north
and south only, for the overview. But this makes comparison more difficult because there are now 2
different datasets.

Which model is the best? (model fit): the model is significant, but what is the relevance of various
models? So this is about model relevance. Look for the goodness of fit:

Cox and Snell pseudo R-squared (the higher the better)

Look at the Nagelkerke R-squared (the higher the better)

Percentage correct (classification table): the higher the better
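
To make the three goodness-of-fit measures concrete, the sketch below computes them from the log-likelihood of the empty (null) model, the log-likelihood of the fitted model, and a classification table. The formulas are the standard definitions of the Cox and Snell and Nagelkerke pseudo R-squared; all input numbers are hypothetical and are not taken from the article.

```python
import math

def cox_snell_r2(ll_null, ll_model, n):
    """Cox and Snell pseudo R-squared from two log-likelihoods and sample size n."""
    return 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)

def nagelkerke_r2(ll_null, ll_model, n):
    """Cox and Snell rescaled by its maximum value, so the result can reach 1."""
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell_r2(ll_null, ll_model, n) / max_cox_snell

def percentage_correct(true_neg, false_pos, false_neg, true_pos):
    """Share of correctly classified cases in a 2x2 classification table."""
    total = true_neg + false_pos + false_neg + true_pos
    return 100.0 * (true_neg + true_pos) / total

# Hypothetical values for illustration only.
ll_null, ll_model, n = -100.0, -80.0, 200
cs = cox_snell_r2(ll_null, ll_model, n)
nk = nagelkerke_r2(ll_null, ll_model, n)
pc = percentage_correct(true_neg=50, false_pos=10, false_neg=15, true_pos=25)

print(f"Cox and Snell = {cs:.3f}, Nagelkerke = {nk:.3f}, correct = {pc:.1f}%")
```

Nagelkerke is always at least as large as Cox and Snell for the same model, so the two should be compared across models, not with each other.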

For the test: argue why which model is the best: argue with the pseudo r squared, significance and
relevance

In this journal article, it is either model 4 or model 5



Notes

Importance of comparative data: being able to compare data makes harmonization of data important,
because it is impossible to have 1 dataset with all data. Plus, if you collect data yourself only for your
own region/country, it is less valuable; you can do less with the data.

Significance: is the model useful at all or not

Relevance: is it relevant for this research

Lecture slide: Result analyses 1 (2)

Odds ratio:

Check which Exp(B) differs from 1

Significance of the odds ratio: whether you can reject the null hypothesis that the odds ratio is 1 (that
there is no effect); if p < 0.05 it is significant

The odds ratio for Mediterranean is below 1. You always compare to 1, the value of the reference
category (here Scandinavia). So intentions in Mediterranean countries are lower than in Scandinavia:
the odds are (1 – 0.752) × 100% = (…) % lower than in Scandinavia

The categories (age, gender) are dummies

When looking at age, model 5: when age goes up by 1 year, the odds of entrepreneurial intentions go
down by 3.9% ((1 – 0.961) × 100%)

1 – (..): you have a reference; 1 is your reference and you calculate the difference from it. So if, for
example, your odds ratio is 2.5, compare it to the reference 1 (always with odds ratios) and subtract 1
to see the difference: 2.5 – 1 = 1.5, so the odds are 150% higher than in the reference category.
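
The "compare to 1" rule can be written as a one-line calculation. The sketch below applies it to the three odds ratios mentioned in these notes (0.752, 0.961 and 2.5); the function name is made up for the example. A negative result means lower odds than the reference, a positive result means higher odds.

```python
def percent_change_in_odds(odds_ratio):
    """Percentage difference in odds relative to the reference value 1."""
    return (odds_ratio - 1.0) * 100.0

# Odds ratios as quoted in the notes.
mediterranean = percent_change_in_odds(0.752)   # Mediterranean vs. Scandinavia
age_per_year = percent_change_in_odds(0.961)    # one extra year of age, model 5
example = percent_change_in_odds(2.5)           # the illustrative odds ratio of 2.5

for label, change in [
    ("Mediterranean", mediterranean),
    ("Age (+1 year)", age_per_year),
    ("Odds ratio 2.5", example),
]:
    direction = "higher" if change > 0 else "lower"
    print(f"{label}: odds are {abs(change):.1f}% {direction} than the reference")
```

Note that these percentages describe changes in the odds, not in the probability itself.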

4 steps:

1. Testing model assumptions (the list of conditions the model must meet; are we allowed to use it
at all?)

2. Is the overall model significant? (Is it better than an empty model? Think of the classification table.)

3. What is the model relevance? (How relevant is this model compared to others, e.g.)

4. What are the effects that we're interested in? (In this case: chances of entrepreneurship
compared to education level)

Q&A Part

Natural logarithm (ln): a mathematical function that has to do with powers (exponents). It is used to
arrive at the right number and to obtain a linear relationship.
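
A minimal sketch of why the natural logarithm gives a linear relationship, assuming the note refers to the logit (log-odds) used in logistic regression. The intercept `b0` is hypothetical; the slope `b1` is set to ln(0.961) so that Exp(B) matches the age odds ratio quoted in these notes. Each extra year changes the log-odds by the same amount, while the odds are multiplied by the same factor.

```python
import math

def logit(p):
    """Natural log of the odds p / (1 - p)."""
    return math.log(p / (1.0 - p))

def inverse_logit(z):
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

b0 = -1.0               # hypothetical intercept
b1 = math.log(0.961)    # chosen so that exp(b1) equals the odds ratio 0.961

ages = [30, 31, 32]
log_odds = [b0 + b1 * age for age in ages]       # linear in age
probabilities = [inverse_logit(z) for z in log_odds]
odds = [p / (1.0 - p) for p in probabilities]

# Constant step on the log-odds scale, constant ratio on the odds scale.
log_odds_steps = [log_odds[i + 1] - log_odds[i] for i in range(len(ages) - 1)]
odds_ratios = [odds[i + 1] / odds[i] for i in range(len(ages) - 1)]

print(log_odds_steps)
print(odds_ratios)
```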
