Summary for the course Statistical Methods at VU.
Course
Statistical Methods (XB_0080)
Institution
Vrije Universiteit Amsterdam (VU)
This summary covers all lecture videos of the Statistical Methods course (recorded during the COVID outbreak), given in the second year of the bachelor AI. All videos are covered, but not every example from the lectures is included.
Video 2, R
● 1:2 creates a vector (here with the values 1 and 2).
● x = 1:3 means x gets the values 1 to 3.
● We can plot with plot(x=1:3, foo(1:3)) for example.
● With help("...") you can search the documentation for a given function or feature.
● With type="l" you get a line in the plot.
● main="..." gives the plot a title.
● xlab and ylab specify the labels on the x-axis and y-axis.
● rnorm(10) gives us 10 random numbers from a (standard) normal distribution.
○ To get the same values on every run instead of new ones each time, call
set.seed() with a fixed number first.
● We perform t-tests with t.test().
● par(mfrow=c(2,2)) splits the plot window into a 2-by-2 grid, so up to 4 plots are
shown together, which is handy for comparing things.
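A minimal sketch that puts the commands above into one script. The function foo is not defined in the notes; the version here (squaring its input) is an assumption made only so the example runs, as are the seed value 42 and the plot titles.

```r
# foo is a hypothetical function (assumption): it squares its input.
foo <- function(x) x^2

x <- 1:3                         # x holds the values 1, 2, 3

# Same seed -> same "random" numbers.
set.seed(42)
a <- rnorm(10)                   # 10 draws from the standard normal distribution
set.seed(42)
b <- rnorm(10)
same_draws <- identical(a, b)    # TRUE, because the seed was reset before each call

# One-sample t-test of the hypothesis that the mean behind a equals 0.
result <- t.test(a, mu = 0)

# 2-by-2 grid: four plots in one window.
par(mfrow = c(2, 2))
plot(x = x, y = foo(x), main = "Points", xlab = "x", ylab = "foo(x)")
plot(x = x, y = foo(x), type = "l", main = "Line", xlab = "x", ylab = "foo(x)")
hist(a, main = "Histogram of a", xlab = "value")
plot(a, type = "l", main = "a in order", xlab = "index", ylab = "value")
par(mfrow = c(1, 1))             # reset the layout to a single plot
```

Printing result shows the test statistic, the p-value and a confidence interval; help("t.test") lists the other options.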
Video 3, statistics and critical thinking
What is statistics?
● Statistics is the science of data:
○ The study of collecting, organising, analysing, interpreting and presenting
data.
● We use statistics to gain information about a group of objects (population) and/or to
make decisions and predictions.
○ We collect data from the population.
○ Collecting data from the whole population is called a census.
■ In practice we usually take a subset (a sample), not everything.
● We draw conclusions from the sample.
● The sample has to be a representation of the population.
● A statistical study has 3 parts:
○ Prepare
■ Context
■ Source
■ Sampling method
○ Analyse
■ Graph data
■ Explore data
■ Apply statistical methods
○ Conclude
● Doing statistics requires critical thinking.
○ A common flaw is a bad sampling method.
■ You should choose a method such that the sample from the population
represents the population.
■ A sample is a subcollection of a population, so different samples →
different data.
● Hence possibly different conclusions about the population.
■ A sample should be representative (same characteristics as the
population) and unbiased (no systematic difference with the population).
● Then the sample should lead to the same conclusions as if we
had used the whole population.
○ Another flaw: misleading graphs.
■ A difference in a graph can seem quite large only because the y-axis
does not begin at 0.
○ Another flaw:
■ Correlation does not imply causation.
■ Other variables can influence a correlation.
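Two of the flaws above can be tried out in R. This is only a sketch: the population of heights, the seed and the two bar values are made-up numbers chosen for illustration.

```r
set.seed(7)  # arbitrary seed (assumption) so the run is reproducible

# Hypothetical population: 10,000 heights in cm (made-up numbers).
population <- rnorm(10000, mean = 175, sd = 10)

# Two different samples of size 50 give two different sample means.
sample1 <- sample(population, 50)
sample2 <- sample(population, 50)
means <- c(sample1 = mean(sample1), sample2 = mean(sample2))
print(means)

# The same two made-up values drawn twice: once with a truncated y-axis,
# once with a y-axis that starts at 0.
values <- c(A = 50, B = 52)
par(mfrow = c(1, 2))
barplot(values, ylim = c(49, 53), xpd = FALSE, main = "y-axis starts at 49")
barplot(values, ylim = c(0, 60), main = "y-axis starts at 0")
par(mfrow = c(1, 1))
```

In the left plot B looks several times larger than A, while the right plot shows that the two values are nearly equal.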
Video 4, statistics and critical thinking
Collecting sample data
● Voluntary response sample:
○ Subjects decide themselves to be included in sample.
○ This is biased, because only people who feel like responding are included.
● Random sample:
○ Each member of population has equal probability of being selected.
○ Is unbiased and gives a better representation of the population.
● Simple random sample:
○ Each sample of size n has equal probability of being chosen.
○ Is unbiased.
○ But hard to do in practice, for example when the population is very large.
● Systematic sampling:
○ After starting point, select every k-th member.
○ It is easy to manipulate the outcome.
■ This makes it dangerous because outcomes can be influenced.
● Stratified sampling:
○ Divide the population into subgroups such that subjects within a group share the same
characteristics, then draw a (simple) random sample from each group.
● Cluster sampling:
○ Divide population into clusters, then randomly select some of the clusters.
○ May lead to biased data that do not represent the population.
■ To decrease this risk it is important to have a large dataset.
● Convenience sampling:
○ Use results that are easily available.
○ For example, asking your own family.
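The random, systematic, stratified and cluster methods above can be sketched in base R. The population below (100 ids in four groups A to D) is hypothetical, and the sample sizes and seed are arbitrary choices for illustration.

```r
set.seed(1)  # arbitrary seed (assumption)

# Hypothetical population: 100 subjects in 4 groups of 25.
population <- data.frame(
  id    = 1:100,
  group = rep(c("A", "B", "C", "D"), each = 25)
)

# Simple random sample: every set of 8 subjects is equally likely.
srs <- population[sample(nrow(population), 8), ]

# Systematic sampling: random starting point, then every k-th member.
k <- 10
start <- sample(k, 1)
systematic <- population[seq(start, nrow(population), by = k), ]

# Stratified sampling: a random sample of 2 from each group.
strata <- split(population, population$group)
stratified <- do.call(rbind, lapply(strata, function(g) g[sample(nrow(g), 2), ]))

# Cluster sampling: randomly pick 2 whole groups and keep all their members.
chosen <- sample(unique(population$group), 2)
cluster <- population[population$group %in% chosen, ]
```

Note the difference between the last two: stratified sampling takes a few subjects from every group, cluster sampling takes every subject from a few groups.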
Part 2, important concepts:
● Variable:
○ A quantity that varies.
● In cause and effect studies:
○ Response (dependent) variable:
■ Representing the effect to study
○ Explanatory (independent) variable:
■ Possibly causing that effect
○ Confounding:
■ Mixing of the influence of several explanatory variables on the response.
■ It is very important to investigate the significance of the confounding
variables.
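Confounding can be illustrated with a small simulation. This is a sketch on made-up data: z is a hypothetical confounder that drives both x and y, while x has no direct effect on y at all.

```r
set.seed(123)  # arbitrary seed (assumption)
n <- 1000

z <- rnorm(n)                  # confounding variable
x <- z + rnorm(n, sd = 0.5)    # explanatory variable, influenced by z
y <- z + rnorm(n, sd = 0.5)    # response, influenced by z only (not by x)

# x and y are strongly correlated, although x does not cause y.
r_xy <- cor(x, y)

# Slope of x when z is ignored, and when z is included in the model.
slope_ignoring_z  <- coef(lm(y ~ x))[["x"]]
slope_including_z <- coef(lm(y ~ x + z))[["x"]]

print(c(correlation = r_xy,
        ignoring_z = slope_ignoring_z,
        including_z = slope_including_z))
```

Ignoring z makes x look like it has a clear effect on y; once z is in the model the slope of x drops to roughly zero.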
Video 5, types of data
Part 2 different types of study:
● Observational study:
○ Characteristics of subjects are observed; subjects are not modified.
○ Retrospective (case-control): data from the past.
○ Cross-sectional: data from one point in time.
○ Prospective (longitudinal): data are still to be collected.
● Experiment: some subjects receive a treatment.
○ Sometimes with a control group and a treatment group: single-blind or double-blind,
○ to measure the placebo effect or the experimenter effect.
Types of data
● Parameter:
○ Numerical measurement describing a population’s characteristic.
○ Notation: typically Greek symbols.
● Statistic:
○ Numerical measurement describing a sample’s characteristic.
○ Notation: lowercase letters like x and s.
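The difference between a parameter and a statistic can be made concrete in R. In this sketch the population is simulated (made-up numbers), so its parameters can be computed exactly, which is rarely possible in practice.

```r
set.seed(2024)  # arbitrary seed (assumption)

# Hypothetical population of 5,000 values.
population <- rnorm(5000, mean = 100, sd = 15)

# Parameters: describe the whole population (Greek letters).
mu    <- mean(population)
sigma <- sqrt(mean((population - mu)^2))   # population standard deviation

# Statistics: describe one sample of size 30 (lowercase letters).
smp   <- sample(population, 30)
x_bar <- mean(smp)
s     <- sd(smp)                           # sample standard deviation

print(c(mu = mu, x_bar = x_bar, sigma = sigma, s = s))
```

The statistics change with every new sample, whereas the parameters are fixed numbers for a given population.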