Summary How to do Linguistics with R - Natalia Levshina, Chap. 6, 7, 8, 12, 13
Statistics 2 Notes
short overview statistics
Written for
Rijksuniversiteit Groningen (RuG)
Communicatie- En Informatiewetenschappen
Statistics (LCX046B05)
Seller
hLianne
Natalia Levshina (2015). How to do Linguistics with R. Data exploration and
statistical analysis. John Benjamins:
https://benjamins.com/#catalog/books/z.195/main (EUR 45).
Chapter 1 – what is statistics?
Main statistical notions and principles
1.1 Statistics and statistics
Statistics as a singular noun (like mathematics) is a set of techniques and tools for describing and
analyzing data. Statistics in the plural are measures obtained from samples.
A population is a group that represents all objects of interest. The values obtained from a population
are called parameters. If the population is too big, one will deal with samples, which are meant to be
representative of the population. The difference between a sample statistic and the corresponding
population parameter is called the sampling error: the smaller the error, the more representative the sample.
The most reliable method is random sampling, where every member of the population
has an equal chance of being selected. Other methods are representative sampling, where the
researcher draws a sample in such a way that it matches the population on certain
characteristics, and convenience sampling, where the researcher samples whatever is easiest to obtain.
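The relation between a parameter, a statistic and the sampling error can be illustrated with a small sketch. It assumes a made-up population of 10,000 heights (mean 110 cm, standard deviation 10 cm) and a sample size of 50; none of these numbers come from the book.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical population: 10,000 made-up heights in cm (not real data).
population = [random.gauss(110, 10) for _ in range(10_000)]
population_mean = statistics.mean(population)  # the parameter

# Simple random sample: every member has the same chance of being selected.
sample = random.sample(population, 50)
sample_mean = statistics.mean(sample)  # the statistic

# Sampling error: sample statistic minus the corresponding parameter.
sampling_error = sample_mean - population_mean
print(f"parameter={population_mean:.2f}  statistic={sample_mean:.2f}  "
      f"sampling error={sampling_error:.2f}")
```

Rerunning with a larger sample size will usually give a smaller sampling error, although any single sample can be unlucky.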
Statistics can be subdivided into descriptive statistics, describing the characteristics of a sample, and
inferential statistics, allowing the researcher to use the characteristics of a sample in order to make
conclusions about the population in general (e.g., a statistically significant difference).
1.2 How to formulate and test your hypotheses
1.2.1 Null and alternative hypotheses
Before beginning statistical analysis, the research hypothesis needs to be formulated. The research
hypothesis, i.e. what you expect the outcome of the research to be, is the alternative hypothesis (H1).
It is paired with the null hypothesis (H0), which says there is no difference between, e.g., the different groups.
The alternative hypothesis can be directional, where an assumed direction is expressed (e.g., X is more than
Y), or non-directional, where a difference is assumed but its direction is left open
(e.g., X is not equal to Y).
1.2.2 Those mysterious p-values…
A distribution is a collection of scores, or values, on a variable. When a distribution is normal, it has a
bell-shaped curve. Knowing the shape of a distribution, one can compute the exact probabilities for a
range of x.
The entire area under the curve corresponds to a probability of 1, i.e. 100%.
In a symmetric distribution, the middle value, e.g. 110 cm, corresponds to a cumulative
probability of 0.5 or 50%: 50% of the values are below 110 cm and 50% are above it.
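As a minimal sketch of reading probabilities off a normal curve, the example below uses Python's standard-library NormalDist. The mean of 110 cm is taken from the example above; the standard deviation of 10 cm is an assumption made purely for illustration.

```python
from statistics import NormalDist

# Mean 110 cm as in the example; sd = 10 cm is an assumed value.
heights = NormalDist(mu=110, sigma=10)

# Area under the curve to the left of the middle value.
p_below_mean = heights.cdf(110)

# Area between 100 cm and 120 cm, i.e. within one sd of the mean.
p_between = heights.cdf(120) - heights.cdf(100)

print(f"P(x < 110) = {p_below_mean:.2f}")
print(f"P(100 < x < 120) = {p_between:.4f}")
```

The probability for any range of x is the difference between two cumulative probabilities, which is why knowing the shape of the distribution is enough.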
The p-value shows the probability of obtaining a given test statistic value or more extreme values if
the null hypothesis is true. If the p-value is smaller than some conventional level (usually 0.05 or
0.01), then the null hypothesis is rejected and the result is considered unlikely to be due to chance alone.
p < 0.05: H0 is rejected; the data are taken as evidence of a true difference between, e.g., the groups.
p > 0.05: H0 is not rejected; there is not sufficient evidence that, e.g., the groups are different.
The threshold, e.g. 0.05, is the significance level: the degree of risk you are willing to take
that you will reject a null hypothesis that is actually true. It needs to be decided on before the
statistical analysis.
In order to compute the p-value, one has to know the number of degrees of freedom (df): the
number of values that are free to vary, which is often the sample size minus one.
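The decision rule can be sketched as follows. The example assumes a hypothetical test statistic z = 2.5 that follows the standard normal distribution under H0. A z statistic is used only because its reference distribution does not depend on the degrees of freedom, which keeps the sketch within the standard library.

```python
from statistics import NormalDist

alpha = 0.05  # significance level, fixed before the analysis
z = 2.5       # hypothetical test statistic, standard normal under H0

# Two-tailed p-value: probability of a value at least this extreme
# in either tail if H0 is true.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

reject_h0 = p_value < alpha
print(f"p = {p_value:.4f}, reject H0 at alpha = {alpha}: {reject_h0}")
```

With these assumed numbers H0 is rejected at the 0.05 level but would not be rejected at the stricter 0.01 level.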
1.2.3 Type I and Type II errors
If H0 is rejected when it is in fact true, meaning there is no true difference between the groups,
there is a Type I error: a ‘false alarm’ or ‘false positive’. If the significance level is 0.05, there is a 5%
chance of rejecting H0 when it is in fact true.
If H0 is not rejected while it is in fact false, meaning there is a true difference between the groups,
there is a Type II error: a ‘false negative’.
Decreasing the significance level will decrease the chances of a Type I error and increase the chances
of a Type II error.
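The claim that a significance level of 0.05 gives a 5% chance of a Type I error can be checked with a small simulation. This is a sketch under assumed settings (2,000 repeated experiments, samples of 30 values from a standard normal population, a two-tailed z-test with known standard deviation), not a procedure from the book.

```python
import random
from statistics import NormalDist, mean

random.seed(42)  # fixed seed so the sketch is reproducible
alpha = 0.05
n, trials = 30, 2000  # assumed sample size and number of experiments
std_normal = NormalDist()

false_positives = 0
for _ in range(trials):
    # H0 is true by construction: the data come from N(0, 1), so the mean is 0.
    sample = [random.gauss(0, 1) for _ in range(n)]
    z = mean(sample) * n ** 0.5  # z statistic for H0: mean = 0, known sd = 1
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    if p_value < alpha:
        false_positives += 1  # H0 rejected although it is true: Type I error

type_i_rate = false_positives / trials
print(f"Type I error rate: {type_i_rate:.3f}")
```

The observed rate should be close to alpha. Setting alpha to 0.01 lowers it, at the cost of missing more true differences (Type II errors).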
1.2.4 One-tailed and two-tailed statistical tests
The distinction between a directional and a non-directional H1 is important when one chooses an
appropriate statistical test. Most tests come in two flavors: one-tailed, if H1 is directional, and
two-tailed, if H1 is non-directional.
If H1 is ‘X is greater than Y’, the test statistic should be in the right tail of the distribution
(the blue area in the original figure). If H1 is ‘X is smaller than Y’, the test statistic
should be located in the left tail.
If H1 is ‘X is different from Y’, you can observe
an extreme result in either the left or the right tail.
It is crucial that you formulate your alternative hypothesis and make your choice between one- and
two-tailed tests before you compute any test statistic.
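Why the choice has to be made beforehand can be seen in a short sketch. It assumes a hypothetical z statistic of 1.8 that follows the standard normal distribution under H0. The same value is significant at the 0.05 level in a one-tailed test but not in a two-tailed test.

```python
from statistics import NormalDist

z = 1.8  # hypothetical test statistic, standard normal under H0
std_normal = NormalDist()

# One-tailed (H1: X is greater than Y): only the right tail counts.
p_one_tailed = 1 - std_normal.cdf(z)

# Two-tailed (H1: X is different from Y): both tails count.
p_two_tailed = 2 * (1 - std_normal.cdf(abs(z)))

print(f"one-tailed p = {p_one_tailed:.4f}")
print(f"two-tailed p = {p_two_tailed:.4f}")
```

Picking the one-tailed test only after seeing that it gives p < 0.05 would raise the real chance of a Type I error above the stated significance level.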