STATISTICS I
Course 19406 – Bachelor’s in Management and Technology
Fernando Alfayate Fernández
2022/2023
Table of contents
1. INTRODUCTION AND BASIC CONCEPTS
1.1 WHAT IS STATISTICS?
1.2 TYPES OF STATISTICAL VARIABLES
2. ANALYSIS OF UNIVARIATE DATA
2.1 REPRESENTATIONS & GRAPHS
2.2 NUMERICAL MEASURES TO SUMMARIZE AND DESCRIBE DATA
3. ANALYSIS OF BIVARIATE DATA
3.1 BIVARIATE DATA
3.2 TABULAR METHODS
3.3 CHARTS AND NUMERICAL SUMMARY
4. PROBABILITY
4.1 RANDOM EXPERIMENTS, SAMPLE SPACE, ELEMENTARY AND COMPOSITE EVENTS
4.2 DEFINITION OF PROBABILITY. PROPERTIES
4.3 CONDITIONAL PROBABILITY AND MULTIPLICATION RULE.
4.4 FUNDAMENTAL THEOREMS OF PROB. CALCULUS: LAW OF TOTAL PROBABILITY AND BAYES' THEOREM
5. PROBABILITY MODELS
5.1 DISCRETE RANDOM VARIABLES
5.2 CONTINUOUS RANDOM VARIABLES
5.3 DISCRETE PROBABILITY MODELS: BERNOULLI, BINOMIAL AND POISSON
5.4 CONTINUOUS PROBABILITY MODELS: UNIFORM, EXPONENTIAL AND NORMAL
5.5 CENTRAL LIMIT THEOREM AND APPLICATIONS
6. STATISTICAL INFERENCE
6.1 INTRODUCTION & OBJECTIVES
6.2 POINT ESTIMATION OF PARAMETERS
6.3 BERNOULLI DISTRIBUTION
6.4 BINOMIAL DISTRIBUTION
6.5 NORMAL DISTRIBUTION
6.6 GOODNESS OF FIT TO A DISTRIBUTION. GRAPHICAL METHODS.
6.7 DISTRIBUTION OF THE SAMPLE MEAN
6.8 CONFIDENCE INTERVALS
Fernando Alfayate Fernández – fernandoalfayate.apuntes@gmail.com
1. INTRODUCTION AND BASIC CONCEPTS
1.1 WHAT IS STATISTICS?
In everyday language, the term statistics is used to refer to numbers that describe some aspect of the world.
• Economic statistics: number of unemployed, inflation rate, ...
• Demographic statistics: birth rate, life expectancy, ...
• Sports statistics: goals scored, number of red cards in a football match, ...
• Meteorological statistics: temperature, rainfall, ...
Statistics is much more than mere numbers: it is the discipline that addresses how to collect, summarize, analyze, and
interpret data, to draw conclusions and make better decisions.
Applications of statistics range from accounting, finance, and marketing to economics, politics, and sustainability.
Data are the collected features of a phenomenon under study.
1.2 TYPES OF STATISTICAL VARIABLES
Notation: typically the letters X, Y, Z are used. Example:
X = Number of employees in Madrid firms (upper case in definition)
x1 = 55; x2 = 3000 (lower case for specific values, we add subscripts to indicate individuals)
NOTE: Numerical codes for categorical variables DO NOT make them numerical (e.g., Male = 1, Female = 2)
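The note on numerical codes can be illustrated with a short Python sketch. The sample of codes below is hypothetical (it is not data from the course), and the coding 1 = Male, 2 = Female is taken from the example in the note. It suggests why arithmetic on codes is not meaningful, whereas counting individuals per class is.

```python
from collections import Counter
from statistics import mean

# Hypothetical sample: codes of a categorical variable (1 = Male, 2 = Female).
gender_codes = [1, 2, 2, 1, 2, 2, 1, 2]
labels = {1: "Male", 2: "Female"}

# Arithmetic on the codes runs without error, but the result has no
# interpretation: the value would change if the codes were swapped or
# replaced by any other pair of numbers.
meaningless_mean = mean(gender_codes)

# A meaningful summary of a categorical variable is the count per class.
counts = Counter(labels[code] for code in gender_codes)

print("Mean of the codes (not interpretable):", meaningless_mean)
print("Counts per class:", dict(counts))
```

Recoding the same individuals as Male = 0, Female = 1 would change the mean but leave the counts unchanged, which is one way to see that only the counts describe the data.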
1.2.1 POPULATION AND SAMPLE
Population: complete collection of individuals. In practice it is unusual to study all the individuals of a population:
1. It may be economically unfeasible to study the entire population
2. The study might take so much time that it would be infeasible and, moreover, the population might change
over the time span of the study
3. The study may imply the destruction of individuals
Sample: subset of individuals drawn from the population. To draw valid conclusions, the sample must be representative of
the population, so the sample selection method is very important. Data sources comprise observations, experiments, and
historical data.
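As an illustration of a sample selection method, the sketch below draws a simple random sample without replacement, in which every individual has the same chance of being chosen. The notes do not say which selection method the course uses, so this choice is an assumption. The function name simple_random_sample and the population of 1,000 numbered individuals are hypothetical.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct individuals from the population, each equally likely.

    The seed parameter only makes the draw reproducible; it is not part of
    the sampling method itself.
    """
    rng = random.Random(seed)
    return rng.sample(population, n)

# Hypothetical population: 1,000 individuals identified by the numbers 1..1000.
population = list(range(1, 1001))

# Draw a sample of 50 individuals without replacement.
sample = simple_random_sample(population, 50, seed=1)

print("Sample size:", len(sample))
print("First five sampled individuals:", sample[:5])
```

Random selection does not guarantee that a given sample is representative, but it avoids the systematic bias that arises when the analyst chooses the individuals by hand.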
2. ANALYSIS OF UNIVARIATE DATA
2.1 REPRESENTATIONS & GRAPHS
To describe categorical variables, we may use a frequency table. In a frequency table of employees, for example, the
number of employees in each class is the absolute frequency, and the proportion of employees in each class is the
relative frequency.
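A minimal Python sketch of how absolute and relative frequencies can be computed from raw categorical data follows. The list of departments is hypothetical data invented for this illustration, and frequency_table is a helper name introduced here, not something defined in the course.

```python
from collections import Counter

def frequency_table(values):
    """Return one row per class: (class, absolute frequency, relative frequency)."""
    counts = Counter(values)   # absolute frequency of each class
    n = len(values)            # total number of individuals
    return [(cls, count, count / n) for cls, count in counts.items()]

# Hypothetical data: department of each of 10 employees.
departments = ["Sales", "IT", "Sales", "HR", "IT",
               "Sales", "Sales", "HR", "IT", "Sales"]

rows = frequency_table(departments)

print(f"{'Class':<6} {'Abs':>3} {'Rel':>6}")
for cls, absolute, relative in rows:
    print(f"{cls:<6} {absolute:>3} {relative:>6.2f}")
```

The absolute frequencies add up to the number of individuals, and the relative frequencies add up to 1. Both are useful checks when building a table by hand.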
2.1.1 STRUCTURE OF A FREQUENCY TABLE
2.1.2 BAR AND PIE CHARTS, PARETO CHARTS.
BAR CHARTS
Bars are of the same width and equally spaced, and their heights represent frequencies. There are gaps between bars, and
bars are labeled with class names (or codes).
PIE CHARTS
1. Each pie sector is a fraction of the circle
2. Sectors are labeled with their corresponding class names
3. Computer software typically orders classes in alphabetical order
4. Pie charts are visually engaging, but relative sector sizes are harder to assess correctly than in bar charts
5. Avoid 3D pie charts: 3D perspective distorts our perception of relative sector sizes
PARETO CHARTS
• Bar chart in which the variable classes are ranked in decreasing order of frequency
• It only applies to nominal categorical variables
• Useful to identify the more relevant classes
Pareto charts are used for qualitative data, with the bars ordered by frequency. The vertical scales show absolute
and relative frequencies. The tallest bar is on the left and the shortest on the right, so the chart draws attention
to the most important categories.
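The ordering behind a Pareto chart can be sketched in Python as follows. The complaint data are hypothetical, and pareto_table is a helper name introduced for this illustration. The cumulative relative frequency column is an addition that is common in Pareto analysis but is not described in the notes above. The bars are drawn as text so that the sketch needs no plotting library.

```python
from collections import Counter

def pareto_table(values):
    """Rank classes by decreasing frequency.

    Each row is (class, absolute frequency, relative frequency,
    cumulative relative frequency).
    """
    ranked = Counter(values).most_common()   # sorted by decreasing count
    n = len(values)
    rows = []
    cumulative = 0
    for cls, count in ranked:
        cumulative += count
        rows.append((cls, count, count / n, cumulative / n))
    return rows

# Hypothetical data: reason given in each of 20 customer complaints.
complaints = (["Other"] * 1 + ["Wrong item"] * 2
              + ["Damaged"] * 5 + ["Delay"] * 12)

rows = pareto_table(complaints)

# Text version of the chart: the longest bar comes first.
for cls, count, rel, cum in rows:
    print(f"{cls:<10} {'#' * count:<12} {count:>3} {rel:>5.0%} {cum:>5.0%}")
```

Reading the cumulative column from the top shows how few classes are needed to account for most of the observations, which is the kind of question the Pareto principle raises.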
Pareto Principle (80/20 rule): Pareto stated that, typically, about 80% of the effects come from 20% of the possible
causes. Example: "20% of the population owns about 80% of the wealth". Let us see an example of a Pareto chart
application:
• Sample: Among the 1,100 visitors of the art exhibition Turner and the Masters (Prado Museum), those who
bought their tickets online accounted for 20.3%.