CHAPTER 1.5, 1.6: Functions and Differential calculus
- The ultimate objective of econometrics is usually to build a model, which may be thought of as a
simplified version of the true relationship between two or more variables that can be described by a
function. A function is simply a mapping or relationship between an input or set of inputs and an
output.
- Polynomial functions: if n = 2, we have a quadratic equation; if n = 3, a cubic; if n = 4, a quartic; and
so on. We use polynomials when y depends on only one variable, x, but in a non-linear way (and so it
cannot be expressed as a straight line). Broadly, the higher the order of the polynomial, the more
complex will be the relationship between y and x and the more twists and turns there will be.
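For reference, a polynomial of order n can be written in the general form below (the coefficient symbols a_0, ..., a_n are generic notation rather than taken from the text):

\[ y = a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n \]

Setting n = 2 gives the quadratic case, and each additional power allows one more potential turning point in the curve.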
- Exponential functions: It is sometimes the case that the relationship between two variables is best
described by an exponential function – for example, when a variable y grows (or declines) at a rate
proportional to its current value.
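In symbols (with A a scale constant and r a growth rate, both generic notation), such a relationship can be written as

\[ y = A e^{r x}, \qquad \frac{dy}{dx} = r A e^{r x} = r y, \]

so the rate of change of y is proportional to the current value of y itself.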
- Logarithms: There are at least three reasons why log transforms may be useful in data analysis:
o First, taking a logarithm can often help to rescale the data so that their variance is more
constant, which overcomes a common statistical problem known as heteroscedasticity,
discussed in detail in Chapter 5.
o Second, logarithmic transforms can help to make a positively skewed distribution closer to
a normal distribution.
o Third, taking logarithms can also be a way to make a non-linear, multiplicative relationship
between variables into a linear, additive one. These issues will also be discussed in some
detail in Chapter 5.
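A simple illustration of the third point, using generic constants A and \beta: a multiplicative power relationship becomes linear and additive in the logarithms,

\[ y = A x^{\beta} \quad\Longrightarrow\quad \ln y = \ln A + \beta \ln x, \]

so a model that is non-linear in the original variables can be estimated as a straight-line relationship between ln y and ln x.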
- What do we actually use differentiation for in finance? A key use relates to the concept of what
happens at the margin – in other words, what is the effect of an infinitesimally small change in x on
y – this is exactly the interpretation of the slope of a function at a specific value of x. In reality, we
usually weaken this slightly to say that the derivative of y with respect to x can be used to measure
the effect of a unit change in x on y. This is a very useful concept that is widely used in measuring
marginal utility, marginal propensity to save as income changes, etc. – for instance, what is the effect
of a one-unit change in wealth upon the utility of an investor? Differentiation relates unit changes in
x to unit changes in y but it will often be of interest to consider what happens to y if x changes by
one percent rather than one unit. This would be measured by an elasticity.
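In symbols, the marginal effect and the elasticity described above are

\[ \frac{dy}{dx} \approx \frac{\Delta y}{\Delta x} \;\text{ for a small change } \Delta x, \qquad \text{elasticity} = \frac{dy}{dx}\cdot\frac{x}{y}, \]

where the elasticity gives the approximate percentage change in y arising from a one percent change in x.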
CHAPTER 2.1: Probability and probability distributions
- A random variable is one that can take on any value from a given set and where this value is
determined at least in part by chance. By their very nature, random variables are not perfectly
predictable. Most data series in economics and finance are best considered as random variables,
although there might be some measurable structure underlying them as well so they are not purely
random. It is often helpful to think of such series as being made up of a fixed part (which we can
model and forecast) and a purely random part, which we cannot forecast.
- The data that we use in building econometric models either come from experiments or, more
commonly, are observed in the ‘real world’. The outcomes from an experiment can often only take
on certain specific values – i.e., they are discrete random variables. For example, the sum of the
scores from throwing two dice could only be a number between two (if we throw two ones) and
twelve (if we throw two sixes). We could calculate the probability of each possible sum occurring
and plot it on a diagram, such as Figure 2.1. This would be known as a probability distribution
function, which shows the various outcomes that are possible and how likely each one is to occur.
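A minimal sketch of how the probabilities in a figure such as Figure 2.1 could be computed; the code below simply enumerates the 36 equally likely outcomes of two fair dice (the variable names are illustrative only).

```python
from collections import Counter

# Enumerate all 36 equally likely outcomes of throwing two fair dice
sums = [d1 + d2 for d1 in range(1, 7) for d2 in range(1, 7)]
counts = Counter(sums)

# Probability of each possible total (2 to 12)
for total in sorted(counts):
    print(f"P(sum = {total:2d}) = {counts[total]}/36 = {counts[total] / 36:.4f}")
```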
- Most of the time in finance we work with continuous rather than discrete variables, in which case
the plot above would be a probability density function (pdf) rather than a distribution function. A
continuous random variable can take any value (possibly only within a given range). For example,
the amount of time a swimmer takes to complete one length of a pool or the return on a stock index.
The time that the swimmer takes could be any positive value, depending on how fast they are! The
return on a stock index could take any value greater than –100% – in other words, the most that an
investor in the stock can lose is their entire investment (−100%), but there is no maximum to the
amount that they can gain. Note that for a continuous random variable, the probability that it is
exactly equal to a particular number is always zero by definition because the variable could take on
any value.
- The distribution most commonly used to characterise a random variable is a normal or Gaussian
(these terms are equivalent) distribution. The normal distribution is easy to work with since it is
symmetric, it is unimodal (i.e., only has one peak) and the only pieces of information required to
completely specify the distribution are its mean and variance, as discussed in Chapter 5. The normal
distribution also has several useful mathematical properties. For example, any linear transformation
of a normally distributed random variable will still be normally distributed. Furthermore, any linear
combination of independent normally distributed random variables is itself normally distributed.
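Stated explicitly (with generic means and variances), these two properties are

\[ X \sim N(\mu, \sigma^2) \;\Rightarrow\; aX + b \sim N(a\mu + b,\; a^2\sigma^2) \]
\[ X_1 \sim N(\mu_1, \sigma_1^2) \text{ and } X_2 \sim N(\mu_2, \sigma_2^2) \text{ independent} \;\Rightarrow\; X_1 + X_2 \sim N(\mu_1 + \mu_2,\; \sigma_1^2 + \sigma_2^2). \]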
- Note that for a continuous random variable, we can only talk of the probability that it will take on
values within a range (e.g., the probability that y will be between 1 and 2) and not the probability
that y will be equal to some number (e.g., 2).
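Formally, for a continuous random variable y with probability density function f, probabilities are areas under the density:

\[ P(a \le y \le b) = \int_a^b f(y)\,dy, \qquad P(y = c) = 0 \text{ for any single value } c. \]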
- Distributions are important in statistics because of their link with probabilities. If we know (or we
can assume) the particular distribution that a series follows, then we can calculate the likelihood
(probability) that the values this series takes will fall within a certain range.
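As a small illustration (assuming, purely for the example, that the series follows a normal distribution with mean 0 and standard deviation 1), the probability that it falls between 1 and 2 can be computed from the cumulative distribution function.

```python
from scipy.stats import norm

# Assume (for illustration only) a normal distribution with mean 0 and std 1
mu, sigma = 0.0, 1.0

# P(1 <= y <= 2) = F(2) - F(1), where F is the normal cumulative distribution function
prob = norm.cdf(2, loc=mu, scale=sigma) - norm.cdf(1, loc=mu, scale=sigma)
print(f"P(1 <= y <= 2) = {prob:.4f}")  # approximately 0.1359
```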
- In fact, an important rule in statistics known as the central limit theorem states that the sampling
distribution of the mean of any random sample of observations will tend towards the normal
distribution with mean equal to the population mean, μ, as the sample size tends to infinity. This
means that we can use the normal distribution as a kind of benchmark when testing hypotheses, as
discussed more fully in Chapter 3.
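A minimal simulation sketch of the central limit theorem: sample means drawn from a strongly skewed (exponential) population become approximately normal as the sample size grows. The population and sample sizes are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Population: Exponential(1), which is strongly skewed (mean 1, standard deviation 1)
for n in (2, 30, 500):
    # Draw 10,000 independent samples of size n and take each sample's mean
    sample_means = rng.exponential(scale=1.0, size=(10_000, n)).mean(axis=1)
    print(f"n = {n:3d}: mean of sample means = {sample_means.mean():.3f}, "
          f"std = {sample_means.std(ddof=1):.3f} (theory: {1 / np.sqrt(n):.3f})")
```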
- There are many statistical distributions, including the binomial, Poisson, log normal, normal,
exponential, t, chi-squared and F, and each has its own characteristic pdf. Different kinds of random
variables will be best modelled with different distributions. Many of the statistical distributions are
also related to one another, and most (except the normal) have one or more degrees of freedom
parameters that determine the location and shape of the distribution.
- What do we use statistical distributions for? The normal, F, t and chi-squared distributions are all
used predominantly to make inferences from the sample to the population. This implies making
statements about the likely values of the corresponding unobservable population values from the
sample values that we have. These ideas will be discussed in considerable detail in Chapter 3
onwards.
CHAPTER 2.3: Descriptive statistics
- When analysing a series containing many observations, it is useful to be able to describe its most
important characteristics using a small number of summary measures.
- The average value of a series is sometimes known as its measure of location or measure of central
tendency.
- So which method for calculating mean returns (arithmetic or geometric) should we use? The answer is,
as usual, that ‘it depends’. Geometric returns give the fixed return on the asset or portfolio that would
have been required to match the actual performance, which is not the case for the arithmetic mean. Thus,
if you assumed that the arithmetic mean return had been earned on the asset every year, you would not
reach the correct value of the asset or portfolio at the end. The reason is the effect of compounding.
- Note that if the individual annual returns are already continuously compounded, then it would be more
appropriate to use the arithmetic average to calculate overall performance rather than the geometric
average, since with log returns the effect of compounding has already been taken into account.
- But it can also be shown that the geometric return is always less than or equal to the arithmetic return,
and so the geometric return is a downward-biased predictor of future performance. Hence, if the objective
is to summarise historical performance, the geometric mean is more appropriate, but if we want to
forecast future returns, the arithmetic mean is the one to use. Finally, it is worth noting that the geometric
mean is evidently less intuitive and less commonly used than the arithmetic mean, but it is less affected
by extreme outliers than the latter.
- We can see that the arithmetic mean is higher than the geometric mean unless there is zero volatility and
thus it is hardly surprising that it is more common for fund managers to report their arithmetic mean
returns! We can also see that the higher the volatility, the greater will be the difference between the two
measures of average returns, and thus the more the arithmetic average will overstate the investor
experience and how much his or her money would have grown over time.
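A short sketch contrasting the two averages for a hypothetical series of simple annual returns (the numbers are invented for illustration):

```python
import numpy as np

# Hypothetical simple annual returns: +50% in year 1, -30% in year 2
returns = np.array([0.50, -0.30])

arithmetic_mean = returns.mean()
# Geometric mean: the constant annual return giving the same terminal value
geometric_mean = np.prod(1 + returns) ** (1 / len(returns)) - 1

print(f"Terminal value of 1 invested: {np.prod(1 + returns):.4f}")  # 1.05
print(f"Arithmetic mean return:       {arithmetic_mean:.4f}")       # 0.1000
print(f"Geometric mean return:        {geometric_mean:.4f}")        # about 0.0247
```

Compounding the arithmetic mean of 10% for two years would suggest a terminal value of 1.21, well above the 1.05 actually achieved, whereas compounding the geometric mean reproduces the actual outcome.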
Measures of spread
- Usually, the average value of a series will be insufficient to adequately characterise a data series, since
two series may have the same mean but very different profiles because the observations on one of the
series may be much more widely spread about the mean than the other. Hence, another important feature
of a series is how dispersed its values are. In finance theory, for example, the more widely spread are the
returns around their mean value, the more risky the asset is usually considered to be.
o Percentiles of a distribution: It should already be obvious that, by definition, the median is
the 50th percentile. The difference between two percentiles can be used as a measure of the
spread of a distribution. A more reliable measure of spread, although it is not widely
employed by quantitative analysts, is the semi-interquartile range, sometimes known as the
quartile deviation.
o Variance and standard deviation: Another, more familiar, measure of the spread or
dispersion of a set of data, the variance, is very widely used. It is interpreted as the average
squared deviation of each data point about the mean value. The squares of the deviations
from the mean are taken rather than the deviations themselves to ensure that positive and
negative deviations (for points above and below the average, respectively) do not cancel
each other out.
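For reference, with observations y_1, ..., y_N and sample mean \bar{y}, the quartile deviation, sample variance and sample standard deviation described above are

\[ QD = \frac{Q_3 - Q_1}{2}, \qquad s^2 = \frac{1}{N-1}\sum_{i=1}^{N}(y_i - \bar{y})^2, \qquad s = \sqrt{s^2}, \]

where Q_1 and Q_3 denote the 25th and 75th percentiles.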
- While there is little to choose between the variance and the standard deviation in terms of which is
the best measure, the latter is sometimes preferred since it will have the same units as the variable
whose spread is being measured, whereas the variance will have units of the square of the variable.
- Both variance and standard deviation share the advantage that they encapsulate information from all
the available data points, unlike the range and quartile deviation, although they can also be heavily
influenced by outliers (but to a lesser degree than the range).
- The quartile deviation is an appropriate measure of spread if the median is used to define the average
value of the series, while the variance or standard deviation will be appropriate if the arithmetic
mean constitutes the measure of central tendency adopted.
- Before moving on, it is worth discussing why the denominators in the formulae for the variance and
standard deviation include N − 1 rather than N, the sample size. Subtracting one from the number of
available data points is known as a degrees of freedom correction, and this is necessary since the
spread is being calculated about the mean of the series, and this mean has had to be estimated from
the sample data as well. Thus the spread measures described above are known as the sample variance
and the sample standard deviation. Had we been observing the entire population of data rather than
a mere sample from it, then the formulae would not need a degrees of freedom correction and we
would divide by N rather than N − 1.
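In practice, the degrees of freedom correction corresponds to the ddof argument in NumPy; a small sketch with made-up data:

```python
import numpy as np

y = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])  # made-up sample

print(f"Sample variance   (divide by N-1): {np.var(y, ddof=1):.4f}")
print(f"Population variance (divide by N): {np.var(y, ddof=0):.4f}")
print(f"Sample standard deviation:         {np.std(y, ddof=1):.4f}")
```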
- A further measure of dispersion is the negative semi-variance, which also gives rise to the negative
semi-standard deviation. These measures use identical formulae to those described above for the
variance and standard deviation, but when calculating their values, only those observations for which
yi − yavg < 0 (i.e., those lying below the mean) are used in the sum, and N now denotes the number of such observations. This measure is
sometimes useful if the observations are not symmetric about their mean value (i.e., if the
distribution is skewed).
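A minimal sketch of the negative semi-variance and semi-standard deviation under these definitions (the returns are made up; deviations are still measured from the full-sample mean, only below-mean observations enter the sum, and the N − 1 divisor follows the sample variance formula above):

```python
import numpy as np

y = np.array([0.02, -0.05, 0.01, 0.03, -0.08, 0.04])  # made-up returns
y_avg = y.mean()

below = y[y < y_avg]          # keep only the observations lying below the mean
n = len(below)                # N now counts only these observations

# Same formula as the sample variance, applied to the below-mean observations
neg_semi_var = np.sum((below - y_avg) ** 2) / (n - 1)
neg_semi_std = np.sqrt(neg_semi_var)

print(f"Negative semi-variance:           {neg_semi_var:.6f}")
print(f"Negative semi-standard deviation: {neg_semi_std:.6f}")
```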
- A final statistic that has some uses for measuring dispersion is the coefficient of variation, CV. This
is obtained by dividing the standard deviation by the arithmetic mean of the series (often multiplied
by 100 to express it in percentage terms). CV is useful where we want to make comparisons across
series. By normalising the standard deviation, the coefficient of variation is a unit-free measure of spread.
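A brief sketch of the coefficient of variation for two made-up series measured on very different scales, which is where the unit-free property is useful:

```python
import numpy as np

# Two made-up series measured on very different scales
returns_pct = np.array([1.1, 0.9, 1.3, 0.8, 1.0])               # e.g., monthly returns in %
index_level = np.array([1050.0, 980.0, 1120.0, 930.0, 1010.0])  # e.g., index levels

for name, series in [("returns (%)", returns_pct), ("index level", index_level)]:
    cv = 100 * np.std(series, ddof=1) / np.mean(series)
    print(f"{name}: CV = {cv:.2f}%")
```

Despite the very different standard deviations in absolute terms, the CVs are directly comparable because each is expressed relative to its own mean.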