Basics
Random variables:
• Bernoulli: a binary variable with a 0/1 outcome (e.g. male or female)
• Discrete: a variable that takes a countable number of distinct values (e.g. a die roll)
• Continuous: a variable that can take any value within a range (e.g. a stock price)
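The probability density function (PDF) describes how likely each outcome is, with every
probability lying between zero and one. For a continuous variable, the area beneath the function
over an interval is the probability of falling in that interval. For two variables, the joint
distribution of (X, Y) gives P(X=x, Y=y), which equals P(X=x)·P(Y=y) when X and Y are independent.
The cumulative distribution function (CDF) carries the same information in cumulative form,
F(x) = P(X ≤ x). It therefore plots as a line rising from zero to one instead of, for the normal
distribution, a bell-shaped curve.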
The normal distribution is denoted X ~ N(μ, σ²), where the brackets contain (mean, variance).
Scores can be standardized to make comparisons across scales more reliable (for example,
comparing a grade of B to a grade of 9). The standardized score is denoted 'z':
$z = \frac{X - \mu}{\sigma} \sim N(0, 1)$
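The standardization can be sketched in a few lines of Python. The two grading scales and their means and standard deviations are made-up numbers for illustration only, and `z_score` is a hypothetical helper name.

```python
def z_score(x, mu, sigma):
    """Standardize x: its distance from the mean, measured in standard deviations."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

# Hypothetical scales (assumed, not from the notes):
# test A is scored 0-100 with mean 70 and SD 10,
# test B is scored 1-10 with mean 6.5 and SD 1.5.
z_a = z_score(85, mu=70, sigma=10)
z_b = z_score(8.0, mu=6.5, sigma=1.5)

# Once both scores are on the z scale they can be compared directly.
better_on_a = z_a > z_b
```

Under these assumed scales, the score on test A lies further above its mean than the score on test B, even though the raw numbers are not comparable.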
Central tendency:
• E(x)=… which means the expected value/ mean of the random variable
Dispersion:
• Var(x) = σ²x, where the variance measures the average squared distance from the mean
• SD(x) = σx, the square root of the variance
Association:
• Cov(x,y) = σxy, which will be zero if x and y are independent of each other (the reverse does not have to hold).
• Corr(x,y) = ρxy, which will be zero if x and y do not have a linear relationship to one another.
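A small Python sketch of the association measures, using made-up data. It uses the sample versions with an n − 1 divisor, which is an assumption since the notes only give the population symbols. The second series shows that a zero covariance only rules out a linear relationship.

```python
import math

def mean(values):
    return sum(values) / len(values)

def covariance(x, y):
    """Sample covariance (n - 1 divisor)."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)

def correlation(x, y):
    """Covariance rescaled by both standard deviations, so it lies in [-1, 1]."""
    return covariance(x, y) / math.sqrt(covariance(x, x) * covariance(y, y))

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_linear = [2.0 * a + 1.0 for a in x]    # exact linear function of x
y_square = [(a - 3.0) ** 2 for a in x]   # depends on x, but not linearly

corr_linear = correlation(x, y_linear)   # perfect linear relationship
cov_square = covariance(x, y_square)     # zero despite the dependence
```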
Hypothesis testing can be done using the chi-square test ($\chi^2 = \sum_{i=1}^{n} Z_i^2$), the t-test and the F-test.
Matrix: a scalar is a single number, a vector is a single row or column of numbers, and a matrix is a rectangular block of numbers.
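Example (a 3×2 matrix and its transpose):
$A = \begin{pmatrix} 3 & 7 \\ -2 & 5 \\ 0 & 1 \end{pmatrix}, \quad A' = \begin{pmatrix} 3 & -2 & 0 \\ 7 & 5 & 1 \end{pmatrix}$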
Example inverse:
$\begin{pmatrix} A & B \\ C & D \end{pmatrix}^{-1} = \frac{1}{AD - BC} \begin{pmatrix} D & -B \\ -C & A \end{pmatrix}$
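The 2×2 inverse formula can be checked numerically. The sketch below uses a made-up matrix and two hypothetical helpers, `inverse_2x2` and `matmul_2x2`. Multiplying the matrix by its inverse should give the identity matrix.

```python
def inverse_2x2(m):
    """Invert [[a, b], [c, d]] as 1/(ad - bc) * [[d, -b], [-c, a]]."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular, no inverse exists")
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul_2x2(p, q):
    """Plain 2x2 matrix product."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

m = [[4.0, 7.0], [2.0, 6.0]]      # made-up example, determinant 4*6 - 7*2 = 10
m_inv = inverse_2x2(m)
identity = matmul_2x2(m, m_inv)   # should be [[1, 0], [0, 1]] up to rounding
```

The determinant AD − BC must be non-zero, otherwise the inverse does not exist.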
Bivariate CLM (Cross-sectional data)
$y_i = \alpha + \beta X_i + u_i$, which represents the relationship in the population.
For example, CEO salary (y) is determined by the intercept (α), the effect of ROE (β) and all
other determining factors (u).
y = dependent variable, α = constant (intercept), β = slope coefficient, X = independent variable and u = error term.
This regression can then be estimated using ordinary least squares (OLS). OLS takes the vertical
distance between each point and the fitted line (the residual $\hat{u}_i$), squares these distances,
and picks the estimates of alpha and beta that minimize the sum of squared residuals.
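For the bivariate case the minimization has a closed-form solution. The slope is the sum of cross-deviations divided by the sum of squared deviations of x, and the intercept follows from the means. Below is a minimal sketch with made-up data (not the CEO data). The names `ols_fit` and `sum_squared_residuals` are hypothetical.

```python
def ols_fit(x, y):
    """Closed-form bivariate OLS: returns (alpha_hat, beta_hat)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    beta_hat = s_xy / s_xx
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat

def sum_squared_residuals(x, y, alpha, beta):
    """The quantity OLS minimizes."""
    return sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y))

# Made-up sample for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.0]

alpha_hat, beta_hat = ols_fit(x, y)
ssr_at_ols = sum_squared_residuals(x, y, alpha_hat, beta_hat)
ssr_other_line = sum_squared_residuals(x, y, alpha_hat, beta_hat + 0.1)
```

Any other line, such as the one with a slightly different slope above, gives a larger sum of squared residuals than the OLS line.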
Linear: CEO salary = 963.19 + 18.50 ROE + u_i
This is a level-level interpretation. The constant shows that if the return on equity equals zero, the
predicted CEO salary is $963,190 (salary is in thousands). A one percentage point increase in ROE
predicts an $18,500 increase in salary.
Quadratic: CEO salary = 1003.79 + 2.035 ROE + 0.278 ROE²
Both variables are in levels, but the change in CEO salary now depends on the initial value of ROE.
The effect on y of a one-unit increase in ROE is approximately 2.035 + 0.56 ROE, which is the
derivative 2.035 + 2(0.278) ROE. A positive coefficient on X² means the effect increases with X,
and a negative coefficient on X² means it decreases.
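The dependence on the starting value of ROE can be made concrete with the coefficients of the quadratic model above. This is a sketch. The function names are hypothetical, and the ROE values of 10 and 20 are arbitrary illustration points.

```python
def predicted_salary(roe):
    """Fitted quadratic model from the notes (salary in thousands)."""
    return 1003.79 + 2.035 * roe + 0.278 * roe ** 2

def marginal_effect(roe):
    """Derivative of the fitted salary with respect to ROE."""
    return 2.035 + 2 * 0.278 * roe

effect_at_10 = marginal_effect(10)   # slope of the curve at ROE = 10
effect_at_20 = marginal_effect(20)   # steeper, because the ROE^2 coefficient is positive

# The actual predicted change for a full one-point step from 10 to 11
# is slightly larger than the derivative at 10, since the curve bends upward.
step_10_to_11 = predicted_salary(11) - predicted_salary(10)
```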
Logarithm: log(CEO salary) = 6.71 + 0.014 ROE
This is a log-level interpretation: the change in y is measured in percent for a unit change in x.
A one-unit (one percentage point) increase in ROE is associated with an approximate 100·β% increase
in y. The semi-elasticity of y with respect to x is thus 0.014 × 100 = 1.4%, so a one percentage
point increase in ROE results in roughly a 1.4% increase in CEO salary.
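The 100·β rule is an approximation that works well for small coefficients. The exact percentage change implied by a log-level model is 100·(exp(β∆x) − 1). A short sketch comparing the two for the coefficient above (the function names are hypothetical):

```python
import math

def approx_pct_change(beta, dx=1.0):
    """Rule-of-thumb percentage change in y for a change dx in x."""
    return 100 * beta * dx

def exact_pct_change(beta, dx=1.0):
    """Exact percentage change in y implied by log(y) = a + beta * x."""
    return 100 * (math.exp(beta * dx) - 1)

approx = approx_pct_change(0.014)   # the 1.4% quoted in the notes
exact = exact_pct_change(0.014)     # marginally larger than the approximation
gap = exact - approx
```

For a coefficient this small the two figures differ by about one hundredth of a percentage point. The gap grows for larger coefficients or larger changes in x.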
log(CEO salary) = 6.49 + 0.17 log(ROE)
This is a log-log interpretation. The coefficient tends to be larger than in the log-level form and
is hence 'easier' to read: a one percent increase in ROE is associated with a 0.17% increase in
salary. This is the elasticity of y with respect to x.
Model        Dependent  Independent  Interpretation      Term
Level-level  y          x            ∆y = β∆x
Level-log    y          log(x)       ∆y = (β/100)%∆x     Semi-elasticity
Log-level    log(y)     x            %∆y = (100β)∆x      Semi-elasticity
Log-log      log(y)     log(x)       %∆y = β%∆x          Elasticity
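The four rules in the table can be collected in one small helper. This is only a sketch. `interpret` is a hypothetical name, and the coefficients passed in are the ones from the fitted CEO salary models above.

```python
def interpret(form, beta, change=1.0):
    """Return (effect on y, unit) for a change in x under each functional form.

    `change` is in units of x for level-level and log-level,
    and in percent of x for level-log and log-log.
    """
    if form == "level-level":
        return beta * change, "units of y"
    if form == "level-log":
        return (beta / 100) * change, "units of y"
    if form == "log-level":
        return 100 * beta * change, "percent of y"
    if form == "log-log":
        return beta * change, "percent of y"
    raise ValueError(f"unknown functional form: {form}")

linear = interpret("level-level", 18.50)   # salary in thousands per ROE point
semi = interpret("log-level", 0.014)       # percent of salary per ROE point
elastic = interpret("log-log", 0.17)       # percent of salary per percent of ROE
```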
OLS assumptions
1. The model is linear in parameters
Yi=α+βxi+ui
2. Random sample from the population
OLS cannot be trusted if the sample is a selected subset of the population, for example only
the highest-paid CEOs. The estimated relation is then, on average, not the true one.
3. Sample variation in the explanatory variable (x)
If x (ROE) varies in the population, it should vary in the sample as well. The sample values of x cannot all be identical.
4. Error term must have an expected value of zero given any x
E(u|x) = 0, meaning that the average of the unobserved factors in u is the same (zero) for every value of x, so u is unrelated to x.
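Assumption 4 can be illustrated with a small simulation. The sketch below draws many random samples from a made-up population with true slope 2 and averages the OLS slope across samples. All parameter values and function names are assumptions chosen for illustration. When the error is unrelated to x the average slope sits at the true value. When the error contains a part that moves with x, it does not.

```python
import random

def ols_slope(x, y):
    """Bivariate OLS slope estimate."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return s_xy / s_xx

def average_slope(error_link, reps=2000, n=50, alpha=1.0, beta=2.0, seed=0):
    """Average OLS slope over `reps` random samples of size `n`."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        x = [rng.uniform(0.0, 10.0) for _ in range(n)]
        # With error_link != 0 the error moves with x, so E(u|x) is not zero.
        u = [rng.gauss(0.0, 1.0) + error_link * xi for xi in x]
        y = [alpha + beta * xi + ui for xi, ui in zip(x, u)]
        total += ols_slope(x, y)
    return total / reps

avg_when_assumption_holds = average_slope(error_link=0.0)   # close to 2.0
avg_when_violated = average_slope(error_link=0.5)           # close to 2.5
```

In the second case OLS attributes the part of the error that moves with x to the slope, so the estimate is off by about 0.5 on average no matter how many samples are drawn.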
If assumptions 1 to 4 hold, OLS is unbiased, meaning that the expected value of the estimate
equals the true beta: $E(\hat{\beta}) = \beta$. Whenever there is a bias, a term that does not equal
zero is added to this formula:
$E(\hat{\beta}) = \beta + \frac{\sum_{i=1}^{n} (X_i - \bar{x}) E(u_i)}{\sum_{i=1}^{n} (X_i - \bar{x})^2}$
Yet, a fifth assumption is needed.
5. Homoscedasticity
The variance of the error u is constant and finite for every value of x, given as Var(u|x) = σ² < ∞.
Under this assumption the standard error of the estimate is given as:
$se(\hat{\beta}) = \frac{\hat{\sigma}}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{x})^2}}$
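A sketch of the standard error calculation on made-up data. The notes do not define the estimate of σ. The code assumes the usual bivariate estimator, the square root of RSS/(n − 2), and `ols_with_se` is a hypothetical name.

```python
import math

def ols_with_se(x, y):
    """Return (beta_hat, sigma_hat, se_beta) for a bivariate OLS regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    beta_hat = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    alpha_hat = y_bar - beta_hat * x_bar
    residuals = [yi - alpha_hat - beta_hat * xi for xi, yi in zip(x, y)]
    rss = sum(r ** 2 for r in residuals)
    # Assumed estimator of the error standard deviation: n - 2 degrees of
    # freedom because two parameters (alpha and beta) were estimated.
    sigma_hat = math.sqrt(rss / (n - 2))
    se_beta = sigma_hat / math.sqrt(s_xx)
    return beta_hat, sigma_hat, se_beta

# Made-up sample for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.0]
beta_hat, sigma_hat, se_beta = ols_with_se(x, y)
```

The formula shows why more spread in x helps: a larger sum of squared deviations of x in the denominator gives a smaller standard error.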
How much of the variation in y the explanatory variables account for is called the goodness of fit
(R-squared). It is calculated as R² = ESS/TSS = 1 − RSS/TSS, where ESS = explained sum of squares,
TSS = total sum of squares and RSS = residual sum of squares.
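Both versions of the R-squared formula can be checked on made-up data. This is a sketch. `sums_of_squares` is a hypothetical helper that fits the bivariate OLS line first.

```python
def sums_of_squares(x, y):
    """Fit bivariate OLS and return (ESS, RSS, TSS)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    beta_hat = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    alpha_hat = y_bar - beta_hat * x_bar
    fitted = [alpha_hat + beta_hat * xi for xi in x]
    ess = sum((fi - y_bar) ** 2 for fi in fitted)            # explained
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # residual
    tss = sum((yi - y_bar) ** 2 for yi in y)                 # total
    return ess, rss, tss

# Made-up sample for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.0]

ess, rss, tss = sums_of_squares(x, y)
r2_from_ess = ess / tss
r2_from_rss = 1 - rss / tss
```

The two expressions agree because TSS = ESS + RSS when the regression includes a constant.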
Hypothesis testing, i.e. testing for statistical significance, needs the additional assumption that
the estimate of beta is normally distributed. This is assumption 6:
6. Normal distribution of the estimate of beta
The population error (u) is independent of x and normally distributed: u ~ N(0, σ²).
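A significance test on the slope could be sketched as follows. The notes do not spell out the test statistic. The code assumes the standard t-statistic (estimate minus hypothesized value, divided by the standard error) with n − 2 degrees of freedom. The data are made up, and the critical value is hardcoded from a standard t-table.

```python
import math

def slope_t_statistic(x, y, beta_null=0.0):
    """Return (t statistic, degrees of freedom) for H0: beta = beta_null."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    beta_hat = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    alpha_hat = y_bar - beta_hat * x_bar
    rss = sum((yi - alpha_hat - beta_hat * xi) ** 2 for xi, yi in zip(x, y))
    se_beta = math.sqrt(rss / (n - 2)) / math.sqrt(s_xx)
    return (beta_hat - beta_null) / se_beta, n - 2

# Made-up sample for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.0]

t_stat, df = slope_t_statistic(x, y)

# Two-sided 5% critical value of the t distribution with 3 degrees of
# freedom, taken from a standard table (hardcoded, valid for df = 3 only).
CRITICAL_5PCT_DF3 = 3.182
reject_null = abs(t_stat) > CRITICAL_5PCT_DF3
```

If the absolute t-statistic exceeds the critical value, the hypothesis that beta equals zero is rejected at the 5% level.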