This is a dense summary of the Communication Science Bachelor course Statistical
Modelling for Communication Research. The summary comprises dense notes on all 11
chapters: notes and explanations of the micro-lectures, the key terms of each session,
and the content of each book chapter itself. The student still strongly encourages reading
the book and watching the micro-lectures alongside this summary. This summary also
includes simplified SPSS instructions and interpretation notes, in the student’s own words.
Statistical Modelling for Communication Research
Chapters 1-6
Chapter 1 Sampling Distribution: How Different Could My Sample Have Been?
SMCR session 1 video:
● Inferential statistics help us to generalize conclusions from the sample to the
population.
● Sampling distribution - the crucial link between population and sample, without it we
cannot generalize from the sample to the population.
● In the sampling distribution, we focus on samples.
● Expected value = mean of the sampling distribution, which is equal to the true
population value.
● The sample must be a random sample, and the sample statistic must be an unbiased
estimator of the population value.
● A discrete sample statistic has probabilities; a continuous sample statistic has
probability densities.
● Actually drawing many samples is impractical in the case of a survey or an
experiment, which is why we need the sampling distribution as a theoretical construct.
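The idea of a sampling distribution can be made concrete with a simulation. A minimal Python sketch (not the course’s SPSS; the population proportion of 0.2 yellow candies and the sample size of 10 are made-up illustration values):

```python
import random

random.seed(42)

POP_PROP = 0.2   # assumed true proportion of yellow candies in the population
N = 10           # assumed sample size
REPS = 100_000   # number of samples we "could have drawn"

# Each replication: draw one random sample and record its sample proportion.
props = []
for _ in range(REPS):
    sample = [random.random() < POP_PROP for _ in range(N)]
    props.append(sum(sample) / N)

# The mean of the sampling distribution (the expected value) should be close
# to the true population proportion: the sample proportion is an unbiased
# estimator of the population proportion.
expected_value = sum(props) / REPS
print(round(expected_value, 2))
```

In practice we never draw 100,000 samples from the population; the simulation only illustrates what the sampling distribution would look like if we could.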
All key terms of this session:
● Sample statistic: a number describing a characteristic of a sample.
● Sampling space: all possible sample statistic values.
● Sampling distribution: all possible sample statistic values and their probabilities or
probability densities.
● Probability density: a means of getting the probability that a continuous random
variable (like a sample statistic) falls within a particular range.
● Random variable: a variable with values that depend on chance.
● Expected value/expectation: the mean of a probability distribution, such as a
sampling distribution.
● Unbiased estimator: a sample statistic for which the expected value equals the
population value.
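The key terms above can be illustrated with a tiny exact example in Python: the sampling space and sampling distribution of a sample proportion. The sample size of 3 and population proportion of 0.2 are assumptions for illustration, not values from the book.

```python
from math import comb

n, p = 3, 0.2  # assumed sample size and population proportion

# Sampling space of the sample proportion: {0, 1/3, 2/3, 1}.
# Each value gets an exact binomial probability, so the dictionary below
# is the complete sampling distribution for this discrete statistic.
sampling_distribution = {
    k / n: comb(n, k) * p**k * (1 - p)**(n - k)
    for k in range(n + 1)
}

for value, prob in sampling_distribution.items():
    print(f"sample proportion {value:.2f}: probability {prob:.3f}")

# Probabilities always sum to 1.
print(round(sum(sampling_distribution.values()), 10))
```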
From the chapter itself:
● Statistical inference is about estimation and null hypothesis testing.
● To make an informed decision on the confidence interval or null hypothesis, we must
compare the characteristic of the sample that we have drawn to the characteristics of
the samples that we could have drawn, which then constitute a sampling distribution.
● Inferential statistics offers techniques for making statements about a larger set of
observations from data collected for a smaller set of observations.
● Each sample has one outcome score on the sample statistic.
● Discrete probability distribution means that only a limited number of outcomes are
possible.
● Probabilities can be referred to both as proportions (between 0 and 1) and as
percentages (between 0% and 100%).
● Expected value: equal to the proportion of yellow candies in the population, which in
turn is equal to the mean of the sampling distribution.
● The expected value is the average of the sampling distribution of a random variable.
● A sample is representative of a population if variables in the sample are distributed in
the same way as in the population.
● Because we should expect a random sample to be representative, we say it is in
principle representative, or representative in the statistical sense, of the population.
● Weight is an example of a continuous variable, because we can always think of a
new weight between two other weights.
● Within a continuous sample statistic, we look at a range of values instead of a single
value.
● A probability density function can give us the probability of values up to (and
including) a threshold value, which is known as a left-hand probability, or the
probability of values above (and including) a threshold value, which is called a
right-hand probability.
● Probabilities always sum up to 1
● The population and the sample consist of the same type of observations.
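Left- and right-hand probabilities for a continuous statistic come from a cumulative distribution function. A sketch assuming normally distributed candy weights (the mean of 2.8 g, standard deviation of 0.2 g, and threshold of 3.0 g are invented for illustration):

```python
from math import erf, sqrt

def normal_cdf(x, mean, sd):
    """Left-hand probability P(X <= x) under a normal distribution."""
    return 0.5 * (1 + erf((x - mean) / (sd * sqrt(2))))

# Assumed example: candy weights ~ Normal(mean=2.8 g, sd=0.2 g).
mean, sd = 2.8, 0.2

left = normal_cdf(3.0, mean, sd)   # P(weight <= 3.0): left-hand probability
right = 1 - left                   # P(weight >= 3.0): right-hand probability

# For a continuous variable a single point has probability zero, so whether
# the threshold itself is included does not change the probability, and the
# two probabilities at the same threshold always sum to 1.
print(round(left + right, 10))
```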
Chapter 2 Probability Models: How Do I Get a Sampling Distribution?
SMCR session 2
● Three ways of constructing a sampling distribution when drawing one sample:
bootstrapping, the exact approach, and theoretical approximation.
● Bootstrapping - draw one sample, draw bootstrap samples from the original sample,
as large as the original sample, sampling with replacement.
● Bootstrapping is only correct if the original sample is more or less representative of
the population; this is bootstrapping’s main limitation.
● The exact approach - calculating the exact probabilities of all sample results.
Limitation - only possible with categorical variables, computer-intensive. However, it
is the true sampling distribution.
● Theoretical approximation - for sufficiently large samples, the sampling distribution is
well approximated by a theoretical probability distribution, such as the normal
distribution.
● Each theoretical distribution has its own rules of thumb for minimum sample size and
other requirements.
● A theoretical approximation is an approximation, not the true sampling distribution,
and it is a poor approximation if the conditions have not been met.
● Is there a theoretical approximation? Have the conditions been met? Then use the
theoretical approximation.
● If there is no theoretical approximation, and the variables are categorical, use the
exact approach.
● Use bootstrapping if neither a theoretical approximation nor the exact approach is
available.
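The bootstrapping steps above can be sketched in a few lines of Python. The sample values are invented for illustration; the statistic bootstrapped here is the sample mean, but any statistic of interest would work the same way.

```python
import random

random.seed(7)

# Invented original sample (e.g. candy weights in grams).
original_sample = [2.6, 2.9, 3.1, 2.7, 3.0, 2.8, 3.2, 2.5, 2.9, 3.0]
REPS = 10_000

boot_means = []
for _ in range(REPS):
    # random.choices samples WITH replacement, so the same observation can
    # appear more than once in a bootstrap sample; each bootstrap sample is
    # just as large as the original sample.
    boot = random.choices(original_sample, k=len(original_sample))
    boot_means.append(sum(boot) / len(boot))

# The collection of bootstrap means approximates the sampling distribution
# of the sample mean; its middle 95% gives a percentile confidence interval.
boot_means.sort()
ci_low = boot_means[int(0.025 * REPS)]
ci_high = boot_means[int(0.975 * REPS)]
print(f"95% bootstrap CI for the mean: [{ci_low:.2f}, {ci_high:.2f}]")
```

Note that this is exactly the "sampling with replacement from the original sample" described above, not sampling from the population.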
All key terms of this session:
● Bootstrapping: sampling with replacement from the original sample to create a
sampling distribution.
● Exact approach: calculating the true sampling distribution as the probabilities of
combinations of values on categorical variables.
● Theoretical approximation: using a theoretical probability distribution as an
approximation of the sampling distribution.
● Independent samples: samples that can in principle be drawn separately.
● Dependent/paired samples: samples whose composition depends partly or entirely
on the composition of another sample.
From the chapter itself:
● How do we create a sampling distribution, if we only collect data for a single sample?
Three ways: bootstrapping, exact approaches, and theoretical approximations.
● Bootstrap samples are the large number of samples drawn from our initial sample.
● Each bootstrap sample is drawn from the original sample and must be just as large
as the original sample.
● If we draw a bootstrap sample without replacement, it is identical to the initial
sample, which is neither useful nor interesting.
● When we draw a sample with replacement, an observation can be drawn more than
once; for example, the same candy number may appear more than once in the new
sample. Each new sample can therefore be different, which allows us to create a
meaningful sampling distribution.
● In actual research, we sample without replacement, as we never want the same
respondent to participate twice in our research because this would not yield new
information.
● Our statistical software samples with replacement.
● For larger samples, we can trust bootstrapping more.
● The big advantage of bootstrapping: we can get a sampling distribution for any
sample statistic that we are interested in.
● The exact approach calculates the exact probabilities. It is the true sampling
distribution itself. Only possible with discrete or categorical variables. They are also
available for the association between two categorical variables in a contingency table
(chi-square). Exact approaches are said to be computer-intensive.
● Fisher’s exact test is an example of an exact approach to the sampling distribution
of the association between two categorical variables.
● Most statistical tests use a theoretical probability distribution as an approximation of
the sampling distribution.
● Theoretical probability distributions are plausible models for sampling distributions
under particular circumstances or conditions; in order to use one, we must assume
that the conditions for its use are met.
● A larger sample produces a sampling distribution that is more peaked, meaning that
sample outcomes cluster more closely around the true population value.
● The parameter value (population proportion) is equal to the average of the sampling
distribution because it is an unbiased estimator.
● For most theoretical probability distributions, the sample size is important, namely,
the larger the better.
● We use bootstrapping or an exact test if the conditions for a theoretical probability
distribution have not been met.
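The "larger is better" condition can be checked directly by comparing an exact sampling distribution with its theoretical approximation. A Python sketch (the proportion of 0.2 and the sample sizes are assumptions for illustration): the exact binomial left-hand probability versus the normal-theory approximation, for growing n.

```python
from math import comb, erf, sqrt

p = 0.2  # assumed population proportion

def exact_left(n, k):
    """Exact approach: left-hand probability P(successes <= k), binomial."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def normal_left(n, k):
    """Theoretical approximation: normal CDF with continuity correction."""
    mean, sd = n * p, sqrt(n * p * (1 - p))
    z = (k + 0.5 - mean) / sd
    return 0.5 * (1 + erf(z / sqrt(2)))

# The gap between the true sampling distribution and the theoretical
# approximation shrinks as the sample gets larger.
gaps = []
for n in (10, 100, 1000):
    k = int(n * p)  # left-hand probability up to the expected count
    gap = abs(exact_left(n, k) - normal_left(n, k))
    gaps.append(gap)
    print(f"n={n:4d}: |exact - approximation| = {gap:.4f}")
```

This mirrors the decision rule above: with a small sample the theoretical approximation can be poor, so an exact approach (or bootstrapping) is preferred.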