
Lecture notes

Summary of Business Analytics


This document is a summary of the lectures by Dr. Bettina Siflinger. The course is a major course in the joint Data Science bachelor of Tilburg University and Technische Universiteit Eindhoven. It contains a summary of the lecture notes and remarks from the teacher.


  • 12 January 2021
  • 1 February 2021
  • 34 pages
  • 2020/2021
  • Lecture notes
  • Dr. Bettina Siflinger
  • All classes
JBM040: Business Analytics
Quartile 2: 2020 – 2021
Teacher: dr. B.M. Siflinger b.m.siflinger@uvt.nl



Business analytics is the ability of firms and organizations to collect, analyse, and act
on data.





Introduction & Important concepts in probability and statistics
Part 1. Introduction
There are two estimation problems:
1. Prediction: Develop a formula for making predictions about the dependent
variable, based on the observed values of the independent variables.
General question you ask yourself: What happens?

2. Causal analysis: Independent variables are regarded as causes of the
dependent variable. The goal is to determine whether a particular independent
variable really affects the dependent variable, and to estimate the magnitude of
that effect, if any.
General question you ask yourself: Why does it happen?

Now consider the data-generating process for a linear model:
$$y = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k + u$$
with outcome $y$, regressors $x_1, \dots, x_k$, "true" parameters $\beta_0, \dots, \beta_k$, and error
term $u \sim N(0, \sigma^2 I)$. You must make an assumption about the relationship between
$x = (x_1, \dots, x_k)$ and $u$: $E(u \mid x) = 0$.
- $E(u \mid x)$ indicates whether the mean of $u$ depends on $x$.
- $I$ is the identity matrix, with 1's on the diagonal and 0's elsewhere.

The main goal of OLS is to obtain the estimates $\hat\beta_0, \hat\beta_1, \dots, \hat\beta_k$ that minimize the sum of
squared residuals.
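
As an illustration of the least squares criterion, the following is a minimal sketch on simulated data. The sample size, the coefficient values in `beta_true` and the helper `ssr_at` are assumptions made up for this example; they do not come from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: all numbers here are hypothetical, chosen for illustration.
n = 500
x = rng.normal(size=(n, 2))                # two regressors x_1, x_2
beta_true = np.array([1.0, 2.0, -0.5])     # beta_0, beta_1, beta_2
u = rng.normal(size=n)                     # error term, independent of x

X = np.column_stack([np.ones(n), x])       # add a column of ones for beta_0
y = X @ beta_true + u

# OLS: the coefficient vector that minimizes the sum of squared residuals.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta_hat
ssr = float(residuals @ residuals)


def ssr_at(beta):
    """Sum of squared residuals for an arbitrary coefficient vector."""
    r = y - X @ beta
    return float(r @ r)


print("estimates:", beta_hat)
print("SSR at the OLS estimates:", ssr)
print("SSR at the true parameters:", ssr_at(beta_true))
```

In this sketch the sum of squared residuals at `beta_hat` is never larger than at any other coefficient vector, including the true parameters used to generate the data.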

OLS has two goals with respect to the two estimation problems. They have
different quantities of interest, but the same calculations are involved:

Predictive modelling: Estimate the conditional mean $E(y \mid x)$.
$$\hat{E}(y \mid x) = \hat\beta_0 + \hat\beta_1 x_1 + \dots + \hat\beta_k x_k$$

Causal estimation: Estimate the partial derivative (slope parameter) with respect
to some $x_j$.
$$\frac{\partial \hat{E}(y \mid x)}{\partial x_j} = \hat\beta_j$$

Both goals can be achieved simultaneously by OLS under the zero conditional
mean assumption $E(u \mid x) = 0$:
$$E(y \mid x) = E(x\beta + u \mid x) = x\beta + E(u \mid x) = x\beta$$

Prediction is interested in the regression line that fits the data as
closely as possible. $E(u \mid x)$ does not play a role, because the prediction is based on
what you observe, and $u$ is not observed. The best fit to the data is then obtained
according to the least squares criterion.

Causal estimation is interested in a particular $\beta_j$. The causal interpretation of $\beta_j$
fails if $E(u \mid x_j) \neq 0$: the partial derivative of $E(u \mid x)$ with respect to $x_j$ must be
zero, because otherwise $\partial E(y \mid x) / \partial x_j \neq \beta_j$. Instead, we get a biased estimate of $\beta_j$.

All these methods can be used in econometrics.
Econometrics: "based upon the development of statistical methods for estimating
economic relationships, testing economic theories, and evaluating and implementing
government and business policy". Its goal is to infer that one variable has a
causal effect on another variable. For this you can use ceteris paribus analysis:
investigate the effect of $x_j$ on $y$ when all other factors are held fixed.
Problem: Mostly only observational data is available.
Solution: Impose assumptions to simulate ceteris paribus analysis. Make sure
that $x_j$ and $u$ are independent.
In an exercise, it could be that a regression is found based on two
parameters, while other factors also influence the outcome.
Because of omitted variable bias, the estimated regression coefficient $\hat b$ is then biased. This $\hat b$ is
unbiased only if $\mathrm{cov}(x, u) = 0$:

$$\hat b = \frac{\mathrm{cov}(x, y)}{\mathrm{cov}(x, x)} = \frac{\mathrm{cov}(x, xb + u)}{\mathrm{cov}(x, x)} = b + \frac{\mathrm{cov}(x, u)}{\mathrm{cov}(x, x)}$$
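
The bias formula can be checked numerically. The sketch below is only an illustration on simulated data: the omitted factor `z`, the loadings 0.8 and 1.5, the true slope `b` and the helper `cov` are all assumptions invented for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup: z is an omitted factor that drives both x and the error u.
n = 100_000
b = 2.0                                # true slope (assumed for illustration)
z = rng.normal(size=n)
x = 0.8 * z + rng.normal(size=n)       # x is correlated with z
u = 1.5 * z + rng.normal(size=n)       # u contains z, so cov(x, u) != 0
y = b * x + u


def cov(a, c):
    """Sample covariance (divides by n)."""
    return float(np.mean((a - a.mean()) * (c - c.mean())))


b_hat = cov(x, y) / cov(x, x)          # slope estimate from regressing y on x
bias_term = cov(x, u) / cov(x, x)      # the term that should vanish for unbiasedness

print("b_hat:", b_hat)
print("b + cov(x,u)/cov(x,x):", b + bias_term)
```

With these assumed loadings the population value of cov(x, u) / cov(x, x) is 1.2 / 1.64, so the estimate settles well above the true slope no matter how large the sample is.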

Part 2. Probability theory: Random variables
The probability distribution is a function that describes the probabilities of
the possible values that a random variable $X$ can take on. A
discrete random variable takes values from a list of outcomes $x_1, \dots, x_k$ with probabilities
$p_1, \dots, p_k$. A continuous random variable takes values in a
continuum.

These random variables have an expected value $E(X)$ or $\mu$, which is the probability-weighted average
of all possible values of $X$. The calculation of the expected value differs by
type of random variable:
- Discrete RV: $E(X) = \sum_{j=1}^{k} x_j p_j$
- Continuous RV: $E(X) = \int_{-\infty}^{\infty} x f(x)\, dx$, where $f$ is the probability density function of $X$


The expected value has the following properties:
- A constant $c$: $E(c) = c$
- Constants $a$ and $b$: $E(aX + b) = aE(X) + b$
- $(a_1, \dots, a_n)$ are constants, $(X_1, \dots, X_n)$ are random variables: $E\left(\sum_{i=1}^{n} a_i X_i\right) = \sum_{i=1}^{n} a_i E(X_i)$
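
A small worked example may help here. The sketch below assumes a fair six-sided die for the discrete case and a uniform distribution on [0, 2] for the continuous case; both choices, and the constants `a` and `b`, are made up for illustration.

```python
import numpy as np

# Discrete RV: a fair six-sided die (assumed example).
values = np.arange(1, 7)
probs = np.full(6, 1 / 6)
e_discrete = float(np.sum(values * probs))            # sum of x_j * p_j

# Linearity: E(aX + b) = a E(X) + b, with hypothetical constants a and b.
a, b = 2.0, 3.0
e_linear = float(np.sum((a * values + b) * probs))

# Continuous RV: uniform on [0, 2] with density f(x) = 0.5 (assumed example).
# The integral of x f(x) dx is approximated with a midpoint Riemann sum.
m = 10_000
edges = np.linspace(0.0, 2.0, m + 1)
mid = (edges[:-1] + edges[1:]) / 2
width = edges[1] - edges[0]
e_continuous = float(np.sum(mid * 0.5 * width))

print(e_discrete, e_linear, e_continuous)
```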


The variance measures the expected squared distance from $X$ to its mean $\mu$.
$$\mathrm{Var}(X) = \sigma^2 = E[(X - \mu)^2] = E(X^2) - \mu^2$$
It has properties:
- If $X$ is a constant: $\mathrm{Var}(X) = 0$
- Constants $a$ and $b$: $\mathrm{Var}(a + bX) = b^2 \mathrm{Var}(X)$
- Standard deviation: $\mathrm{sd}(X) = \sqrt{\mathrm{Var}(X)}$

The covariance measures the linear dependence between the random variables
$X$ and $Y$.
$$\mathrm{Cov}(X, Y) = \sigma_{xy} = E[(X - \mu_X)(Y - \mu_Y)] = E(XY) - E(X)E(Y)$$



It has properties:
- If $X$ and $Y$ are independent: $\mathrm{Cov}(X, Y) = 0$
- Constants $a_1, b_1, a_2, b_2$: $\mathrm{Cov}(a_1 X + b_1, a_2 Y + b_2) = a_1 a_2 \mathrm{Cov}(X, Y)$

The correlation coefficient indicates how strongly two random variables
are linearly related. Its value always lies within the range $[-1, 1]$.
$$\mathrm{Corr}(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\mathrm{sd}(X)\,\mathrm{sd}(Y)} = \frac{\sigma_{xy}}{\sigma_x \sigma_y}$$
It has properties:
- $\mathrm{Cov}(X, Y)$ and $\mathrm{Corr}(X, Y)$ have the same sign
- $\mathrm{Cov}(X, Y) = 0 \Rightarrow \mathrm{Corr}(X, Y) = 0$
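
To see the variance, covariance and correlation formulas at work, here is a minimal sketch on a small joint distribution. The outcome values and the probability table `joint` are hypothetical numbers chosen so that the arithmetic is easy to follow.

```python
import numpy as np

# Hypothetical joint distribution of X (rows) and Y (columns).
x_vals = np.array([0.0, 1.0, 2.0])
y_vals = np.array([0.0, 1.0])
joint = np.array([[0.10, 0.20],
                  [0.30, 0.10],
                  [0.05, 0.25]])       # entries sum to 1

p_x = joint.sum(axis=1)                # marginal distribution of X
p_y = joint.sum(axis=0)                # marginal distribution of Y
mu_x = float(x_vals @ p_x)
mu_y = float(y_vals @ p_y)

# Variance of X: definition versus shortcut E(X^2) - mu^2.
var_x_def = float(((x_vals - mu_x) ** 2) @ p_x)
var_x_short = float((x_vals ** 2) @ p_x) - mu_x ** 2
var_y = float(((y_vals - mu_y) ** 2) @ p_y)

# Covariance: definition versus shortcut E(XY) - E(X)E(Y).
dev = np.outer(x_vals - mu_x, y_vals - mu_y)
cov_def = float(np.sum(dev * joint))
e_xy = float(np.sum(np.outer(x_vals, y_vals) * joint))
cov_short = e_xy - mu_x * mu_y

corr = cov_def / np.sqrt(var_x_def * var_y)

print(var_x_def, var_x_short, cov_def, cov_short, corr)
```

For this table both variance formulas give 0.6 and both covariance formulas give 0.05, so the correlation is positive and, as expected, inside [-1, 1].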

The properties of the variance of sums of random variables:
- Constants $a$ and $b$: $\mathrm{Var}(aX + bY) = a^2 \mathrm{Var}(X) + b^2 \mathrm{Var}(Y) + 2ab\, \mathrm{Cov}(X, Y)$
- $X$ and $Y$ uncorrelated: $\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) = \mathrm{Var}(X - Y)$
- $X_1, \dots, X_n$ pairwise uncorrelated random variables and $a_i$, $i = 1, \dots, n$, constants: $\mathrm{Var}(a_1 X_1 + \dots + a_n X_n) = \sum_{i=1}^{n} a_i^2 \mathrm{Var}(X_i)$
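
The first of these properties can be verified on simulated data. This is a sketch under assumed values: the way `x` and `y` are generated and the constants `a` and `b` are invented for the example. The identity also holds exactly for sample moments, provided the variance and covariance use the same divisor (here `ddof=0`).

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical correlated data.
n = 10_000
x = rng.normal(size=n)
y = 0.5 * x + rng.normal(size=n)
a, b = 2.0, -3.0

# Use ddof=0 everywhere so variance and covariance share the same divisor n.
cov_xy = float(np.cov(x, y, ddof=0)[0, 1])
lhs = float(np.var(a * x + b * y))
rhs = a**2 * float(np.var(x)) + b**2 * float(np.var(y)) + 2 * a * b * cov_xy

print("Var(aX + bY):", lhs)
print("a^2 Var(X) + b^2 Var(Y) + 2ab Cov(X, Y):", rhs)
```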
The conditional expectation $E(Y \mid X)$ is the expected value of $Y$ given the
value of $X$. It describes how the mean of $Y$ is related to $X$.

It has properties:
- Function $c(X)$: $E(c(X) \mid X) = c(X)$
- Functions $a(X)$ and $b(X)$: $E[a(X) Y + b(X) \mid X] = a(X) E(Y \mid X) + b(X)$
- $X$ and $Y$ are independent: $E(Y \mid X) = E(Y)$

The Law of iterated expectations (LIE): $E(E(Y \mid X)) = E(Y)$. That is, $E(Y)$ is a
weighted average of the $E(Y \mid X = x_k)$ with weights $p_k = P(X = x_k)$:
$$E(Y) = \sum_{k=1}^{n} p_k E(Y \mid X = x_k)$$
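
A short numerical check of the law of iterated expectations, assuming a small discrete joint distribution. The table `joint` and the outcome values are hypothetical and serve only to make the weighted average concrete.

```python
import numpy as np

# Hypothetical joint distribution of X (rows) and Y (columns).
y_vals = np.array([0.0, 1.0])
joint = np.array([[0.10, 0.20],
                  [0.30, 0.10],
                  [0.05, 0.25]])

p_x = joint.sum(axis=1)                       # weights p_k = P(X = x_k)
cond_y_given_x = joint / p_x[:, None]         # P(Y = y | X = x_k), row by row
cond_means = cond_y_given_x @ y_vals          # E(Y | X = x_k) for each k

e_y_direct = float(joint.sum(axis=0) @ y_vals)    # E(Y) from the marginal of Y
e_y_iterated = float(p_x @ cond_means)            # sum of p_k * E(Y | X = x_k)

print("E(Y | X = x_k):", cond_means)
print("E(Y) directly:", e_y_direct)
print("E(Y) via LIE:", e_y_iterated)
```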


Part 3. Finite sample properties
From here on, random variables are also denoted by lower-case letters. Finite
sample properties are properties of an estimator that hold for any sample
size. Take a random sample $(y_1, y_2, \dots, y_n)$ from a population distribution
depending on an unknown parameter $\theta$. An estimator of $\theta$ is a rule that assigns a
value of $\theta$ to each possible outcome of the sample:
- Natural estimator for $\mu$ (mean): $\bar y = \frac{1}{n} \sum_{i=1}^{n} y_i$
- Estimator $\hat\theta$ for $\theta$: $\hat\theta = h(y_1, y_2, \dots, y_n)$, where $h$ is some function of the RVs
The estimator $\hat\theta$ is a RV because it depends on a random sample. It is an
unbiased estimator if $E(\hat\theta) = \theta$ for all possible $\theta$. This indicates that unbiasedness
does not depend on the sample size. The bias of an estimator $\hat\theta$: $\mathrm{Bias}(\hat\theta) = E(\hat\theta) - \theta$.

The sampling variance of the estimator $\bar y$ is $\mathrm{Var}(\bar y) = \frac{\sigma^2}{n}$. Among
unbiased estimators, the one with the smallest variance is preferred.
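
Both the unbiasedness of the sample mean and its variance can be illustrated by simulation. The sketch below assumes a normal population; the values of `mu`, `sigma`, `n` and `reps` are arbitrary choices for the example.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical population and simulation settings.
mu, sigma = 5.0, 2.0
n = 25                     # size of each random sample
reps = 20_000              # number of repeated samples

# Draw many samples and compute the sample mean of each one.
samples = rng.normal(mu, sigma, size=(reps, n))
y_bar = samples.mean(axis=1)

bias_estimate = float(y_bar.mean() - mu)     # should be close to 0
var_estimate = float(y_bar.var())            # should be close to sigma^2 / n
theoretical = sigma**2 / n

print("estimated bias of the sample mean:", bias_estimate)
print("simulated Var(y_bar):", var_estimate)
print("sigma^2 / n:", theoretical)
```

Increasing `n` in this sketch shrinks the simulated variance in proportion to 1/n, while the estimated bias stays near zero for any sample size.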


