
Class notes

Summary of Business Analytics


This document is a summary of the lectures by Dr. Bettina Siflinger. The course is a major course in the joint Data Science bachelor of Tilburg University and Technische Universiteit Eindhoven. It contains a summary of the lecture notes and remarks from the teacher.

Last document update: 3 year ago


  • January 12, 2021
  • February 1, 2021
  • 34
  • 2020/2021
  • Class notes
  • Dr. Bettina Siflinger
  • All classes
JBM040: Business Analytics
Quartile 2: 2020 – 2021
Teacher: dr. B.M. Siflinger b.m.siflinger@uvt.nl



Business analytics is the ability of firms/organizations to collect, analyse and act
on data.





Introduction & Important concepts in probability and statistics
Part 1. Introduction
There are two estimation problems:
1. Prediction: Develop a formula for making predictions about the dependent
variable, based on the observed values of the independent variables.
General question you ask yourself: What happens?

2. Causal analysis: Independent variables are regarded as causes of the
dependent variable. The goal is to determine whether a particular independent
variable really affects the dependent variable, and to estimate the magnitude of
that effect, if any.
General question you ask yourself: Why does it happen?

Now consider the data-generating process for a linear model:
$y = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k + u$
with outcome $y$, regressors $x_1, \dots, x_k$ and “true” parameters $\beta_0, \dots, \beta_k$. Its error
term is $u \sim N(0, \sigma^2 I)$. You need an assumption on the relationship between $x = (x_1, \dots, x_k)$
and $u$: $E(u \mid x) = 0$.
- $E(u \mid x)$ indicates whether $x$ and $u$ are related (in the mean) or not.
- $I$ is the identity matrix, with 1's on the diagonal and 0's elsewhere.

The main goal of OLS is to obtain the estimates $\hat\beta_0, \hat\beta_1, \dots, \hat\beta_k$ that minimize the sum of
squared residuals.
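
As a minimal sketch of this idea, the snippet below fits OLS on simulated data with NumPy. The sample size, the "true" parameters (1, 2, -3) and the error standard deviation are illustrative assumptions, not values from the lecture.

```python
import numpy as np

# Simulated data; all parameter values below are illustrative assumptions.
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
u = rng.normal(scale=0.5, size=n)      # error term, drawn independently of x
y = 1.0 + 2.0 * x1 - 3.0 * x2 + u      # "true" beta = (1, 2, -3)

# Design matrix with a column of ones for the intercept beta_0.
X = np.column_stack([np.ones(n), x1, x2])

# Least-squares solution: the b that minimizes sum((y - X @ b) ** 2).
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

residuals = y - X @ beta_hat
ssr = float(residuals @ residuals)     # sum of squared residuals at the optimum
print(beta_hat, ssr)
```

At the minimum the residuals are orthogonal to every column of `X`, which is one way to check that the sum of squared residuals cannot be reduced further.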

OLS has two goals with respect to the two estimation problems. They have
different quantities of interest, but the same calculations are involved:

Predictive modelling: Estimate the conditional mean $E(y \mid x)$.
$\hat{E}(y \mid x) = \hat\beta_0 + \hat\beta_1 x_1 + \dots + \hat\beta_k x_k$

Causal estimation: Estimate the partial derivative (slope parameter) with respect
to some $x_j$.
$\frac{\partial \hat{E}(y \mid x)}{\partial x_j} = \hat\beta_j$

Both goals can be achieved simultaneously by OLS under the zero conditional
mean assumption $E(u \mid x) = 0$:
$E(y \mid x) = E(x\beta + u \mid x) = x\beta + E(u \mid x)$

The prediction procedure is interested in the regression line that fits the data as
closely as possible. $E(u \mid x)$ does not play a role, because the prediction is based on
what you observe, and $u$ is not observed. This makes it possible to obtain the
best fit to the data according to the least squares criterion.

Causal estimation is interested in a particular $\beta_j$. The causal interpretation of $\beta_j$
fails if $E(u \mid x_j) \neq 0$: the partial derivative of $E(u \mid x)$ with respect to $x_j$ must be
zero, because otherwise the slope of $E(y \mid x)$ is not $\beta_j$ and we get a biased estimate of $\beta_j$ instead.

All these methods can be used in econometrics.
Econometrics: “based upon the development of statistical methods for estimating
economic relationships, testing economic theories, and evaluating and implementing
government and business policy”. Its goal is to infer that one variable has a
causal effect on another variable. For this you can use ceteris paribus analysis:
investigate the effect of $x_j$ on $y$ when all other factors are held fixed.
Problem: Mostly only observational data is available.
Solution: Impose assumptions to simulate ceteris paribus analysis. Make sure
that $x_j$ and $u$ are independent.
In an exercise, it could be that a regression is estimated with only two
parameters, while other factors also influence the outcome.
Because of omitted variable bias, the estimated regression coefficient $\hat b$ is then biased. This $\hat b$ is
only unbiased if $\operatorname{cov}(x, u) = 0$:

$\hat b = \frac{\operatorname{cov}(x, y)}{\operatorname{cov}(x, x)} = \frac{\operatorname{cov}(x, xb + u)}{\operatorname{cov}(x, x)} = b + \frac{\operatorname{cov}(x, u)}{\operatorname{cov}(x, x)}$
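
The decomposition above can be illustrated with a small simulation. This is only a sketch: the omitted factor `z`, the slope `b = 2.0` and all other coefficients are hypothetical values chosen so that the error term is correlated with `x`.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
z = rng.normal(size=n)                 # omitted factor (hypothetical)
x = 0.8 * z + rng.normal(size=n)       # x is correlated with z
b = 2.0                                # "true" slope (assumed)
u = 1.5 * z + rng.normal(size=n)       # z ends up in the error term
y = b * x + u

def cov(a, c):
    """Sample covariance of two 1-D arrays."""
    return float(np.cov(a, c)[0, 1])

b_hat = cov(x, y) / cov(x, x)          # slope from regressing y on x only
bias = cov(x, u) / cov(x, x)           # the extra term in the decomposition
print(b_hat, b + bias)
```

In the sample, `b_hat` equals `b + bias` up to floating-point error, and because `x` and `u` share the factor `z` the estimate lies clearly above the assumed slope of 2.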

Part 2. Probability theory: Random variables
The probability distribution is a function that describes the probabilities of
the possible values that a random variable $X$ can take on. A
discrete random variable takes values from a list of outcomes $x_1, \dots, x_k$ with probabilities
$p_1, \dots, p_k$. A continuous random variable takes values in a
continuum.

Random variables have an expected value $E(X)$ or $\mu$, which is the probability-weighted average
of all possible values of $X$. The calculation of the expected value differs by
type of random variable:
- Discrete RV: $E(X) = \sum_{j=1}^{k} x_j p_j$
- Continuous RV: $E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$, with $f$ the density of $X$
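
Both formulas can be sketched numerically. The fair die and the normal density with mean 2 are illustrative choices, and the integral is approximated by a Riemann sum on a wide grid that stands in for the whole real line.

```python
import numpy as np

# Discrete RV: a fair six-sided die (illustrative choice).
values = np.arange(1, 7)
probs = np.full(6, 1 / 6)
e_discrete = float(np.sum(values * probs))            # sum_j x_j p_j

# Continuous RV: normal density with mean 2 and sd 1 (illustrative choice).
def pdf(x, mu=2.0, sigma=1.0):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

grid = np.linspace(-10.0, 14.0, 200_001)              # stands in for (-inf, inf)
dx = grid[1] - grid[0]
e_continuous = float(np.sum(grid * pdf(grid)) * dx)   # Riemann sum for the integral

print(e_discrete, e_continuous)
```

The two results should be very close to 3.5 and 2.0 respectively.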


The expected value has the following properties:
- A constant $c$: $E(c) = c$
- Constants $a$ and $b$: $E(aX + b) = aE(X) + b$
- $(a_1, \dots, a_n)$ are constants, $(X_1, \dots, X_n)$ are random variables:
$E\left(\sum_{i=1}^{n} a_i X_i\right) = \sum_{i=1}^{n} a_i E(X_i)$


The variance says something about the distance from $X$ to its mean $\mu$.
$\operatorname{Var}(X) = \sigma^2 = E[(X - \mu)^2] = E(X^2) - \mu^2$
It has properties:
- Constant $X$: $\operatorname{Var}(X) = 0$
- Constants $a$ and $b$: $\operatorname{Var}(a + bX) = b^2 \operatorname{Var}(X)$
- Standard deviation: $sd(X) = \sqrt{\operatorname{Var}(X)}$
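
The rules for expectations and variances of linear transformations also hold for sample means and sample variances, so they can be checked on any data set. A small sketch, with an arbitrary simulated sample and arbitrary constants `a` and `b`:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.exponential(scale=3.0, size=1_000)   # arbitrary sample; any distribution works
a, b = 4.0, -2.5                             # arbitrary constants

# E(aX + b) = a E(X) + b
lhs_mean = np.mean(a * x + b)
rhs_mean = a * np.mean(x) + b

# Var(a + bX) = b^2 Var(X)
lhs_var = np.var(a + b * x)
rhs_var = b**2 * np.var(x)

# Var(X) = E(X^2) - mu^2
var_shortcut = np.mean(x**2) - np.mean(x) ** 2

print(lhs_mean, rhs_mean, lhs_var, rhs_var, var_shortcut)
```

Note that `np.var` divides by `n` by default, which is what makes the shortcut formula match exactly.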

The covariance measures the linear dependence between the random variables
$X$ and $Y$.
$\operatorname{Cov}(X, Y) = \sigma_{xy} = E[(X - \mu_x)(Y - \mu_y)] = E(XY) - E(X)E(Y)$




It has properties:
- If $X$ and $Y$ are independent: $\operatorname{Cov}(X, Y) = 0$
- Constants $a_1, b_1, a_2, b_2$: $\operatorname{Cov}(a_1 X + b_1, a_2 Y + b_2) = a_1 a_2 \operatorname{Cov}(X, Y)$

The correlation coefficient indicates how strongly two random variables
are linearly related. Its value always lies within the range $[-1, 1]$.
$\operatorname{Corr}(X, Y) = \frac{\operatorname{Cov}(X, Y)}{sd(X)\, sd(Y)} = \frac{\sigma_{xy}}{\sigma_x \sigma_y}$
It has properties:
- $\operatorname{Cov}(X, Y)$ and $\operatorname{Corr}(X, Y)$ have the same sign
- $\operatorname{Cov}(X, Y) = 0 \Rightarrow \operatorname{Corr}(X, Y) = 0$
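
A short sketch of the covariance and correlation formulas on simulated data. The data-generating choices (a slope of 0.6 and the constants `a1, b1, a2, b2`) are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=2_000)
y = 0.6 * x + rng.normal(size=2_000)   # y depends linearly on x plus noise

def cov(a, c):
    """Sample analogue of E[(A - mu_A)(C - mu_C)]."""
    return float(np.mean((a - a.mean()) * (c - c.mean())))

cov_xy = cov(x, y)
cov_shortcut = float(np.mean(x * y) - x.mean() * y.mean())   # E(XY) - E(X)E(Y)

# Correlation: covariance divided by the product of standard deviations.
corr_xy = cov_xy / (x.std() * y.std())

# Shifting by b1, b2 has no effect; scaling multiplies the covariance by a1 * a2.
a1, b1, a2, b2 = 2.0, 5.0, -3.0, 1.0
cov_scaled = cov(a1 * x + b1, a2 * y + b2)

print(cov_xy, corr_xy, cov_scaled)
```

The hand-computed `corr_xy` should agree with `np.corrcoef(x, y)[0, 1]`.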

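The properties of the variance of sums of random variables:
- Constants $a$ and $b$: $\operatorname{Var}(aX + bY) = a^2 \operatorname{Var}(X) + b^2 \operatorname{Var}(Y) + 2ab \operatorname{Cov}(X, Y)$
- $X$ and $Y$ uncorrelated: $\operatorname{Var}(X + Y) = \operatorname{Var}(X) + \operatorname{Var}(Y) = \operatorname{Var}(X - Y)$
- $X_1, \dots, X_n$ are pairwise uncorrelated random variables and $a_i$, $i = 1, \dots, n$, are
constants: $\operatorname{Var}(a_1 X_1 + \dots + a_n X_n) = \sum_{i=1}^{n} a_i^2 \operatorname{Var}(X_i)$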
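
The first rule can be checked on a sample in which $X$ and $Y$ are deliberately correlated, so that the covariance term matters. This is a sketch with arbitrary constants; `ddof=0` is passed to `np.cov` so that it uses the same divisor as `np.var`.

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(size=5_000)
y = 0.5 * x + rng.normal(size=5_000)   # correlated with x on purpose
a, b = 2.0, -1.0                       # arbitrary constants

cov_xy = np.cov(x, y, ddof=0)[0, 1]    # same divisor (n) as np.var
lhs = np.var(a * x + b * y)
rhs = a**2 * np.var(x) + b**2 * np.var(y) + 2 * a * b * cov_xy

print(lhs, rhs)
```

Dropping the `2 * a * b * cov_xy` term would only be valid if the two variables were uncorrelated.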
The conditional expectation of $Y$ given $X$ is denoted by
$E(Y \mid X)$. It is the expected value of $Y$ for a given value of $X$.

It has properties:
- Function $c(X)$: $E(c(X) \mid X) = c(X)$
- Functions $a(X)$ and $b(X)$: $E[a(X)Y + b(X) \mid X] = a(X)E(Y \mid X) + b(X)$
- $X$ and $Y$ are independent: $E(Y \mid X) = E(Y)$

The law of iterated expectations (LIE): $E(E(Y \mid X)) = E(Y)$. $E(Y)$ is a
weighted average of the $E(Y \mid X = x_k)$ with weights $p_k = P(X = x_k)$:
$E(Y) = \sum_{k=1}^{n} p_k E(Y \mid X = x_k)$
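
The LIE can be verified on a small discrete example. The joint probability table below is hypothetical; rows correspond to values of $X$ and columns to values of $Y$.

```python
import numpy as np

# Hypothetical joint distribution of X (rows) and Y (columns).
x_vals = np.array([0, 1, 2])
y_vals = np.array([10.0, 20.0])
joint = np.array([[0.10, 0.20],
                  [0.25, 0.15],
                  [0.05, 0.25]])       # entries sum to 1

p_x = joint.sum(axis=1)                                  # p_k = P(X = x_k)
e_y_given_x = (joint * y_vals).sum(axis=1) / p_x         # E(Y | X = x_k)

e_y_lie = float(np.sum(p_x * e_y_given_x))               # sum_k p_k E(Y | X = x_k)
e_y_direct = float(np.sum(joint.sum(axis=0) * y_vals))   # E(Y) from the marginal of Y

print(e_y_given_x, e_y_lie, e_y_direct)
```

Both routes give the same value of $E(Y)$, even though the conditional means differ across the values of $X$.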


Part 3. Finite sample properties
From here on, random variables are also denoted by lower-case letters. Finite
sample properties are the properties of an estimator that hold for any sample
size. Take a random sample $(y_1, y_2, \dots, y_n)$ from a population distribution
depending on an unknown parameter $\theta$. An estimator of $\theta$ is a rule that assigns
a value of $\theta$ to each possible outcome of the sample:
- Natural estimator for $\mu$ (mean): $\bar y = \frac{1}{n} \sum_{i=1}^{n} y_i$
- Estimator $\hat\theta$ for $\theta$: $\hat\theta = h(y_1, y_2, \dots, y_n)$, where $h$ is some function of the RVs
The estimator $\hat\theta$ is a RV because it depends on a random sample. It is an
unbiased estimator if $E(\hat\theta) = \theta$ for all possible $\theta$. This indicates that unbiasedness
does not depend on the sample size. The bias of an estimator $\hat\theta$: $\operatorname{Bias}(\hat\theta) = E(\hat\theta) - \theta$.

The sampling variance of the estimator $\bar y$ is $\operatorname{Var}(\bar y) = \frac{\sigma^2}{n}$. Among
unbiased estimators, the one with the smallest variance is preferred.
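
Unbiasedness and the sampling variance of $\bar y$ can be illustrated by drawing many random samples and computing the sample mean of each. This is a simulation sketch; the population values `mu = 3`, `sigma = 2`, the sample size `n = 25` and the number of repetitions are assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)
mu, sigma, n, reps = 3.0, 2.0, 25, 20_000   # assumed population values

# Each row is one random sample (y_1, ..., y_n); take the mean of every row.
samples = rng.normal(loc=mu, scale=sigma, size=(reps, n))
y_bar = samples.mean(axis=1)

bias_estimate = float(y_bar.mean() - mu)     # should be close to 0
var_estimate = float(y_bar.var())            # should be close to sigma^2 / n
var_theory = sigma**2 / n

print(bias_estimate, var_estimate, var_theory)
```

The average of the sample means should sit close to `mu`, and their variance close to `sigma**2 / n`, with small deviations due to simulation noise.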


