Lecture 1:
Why use statistics:
- Cost-effective way to extract the maximum information from experiments in order to make decisions
- Objective way to guarantee the quality of an analytical laboratory
- Providing solutions dealing with the enormous amount of data generated by instrumentation
- Heavy impact in the ‘-omics’ application areas
- Univariate method: measure one variable per object
- Multivariate method: measure more than one variable per object
- Sample: part of the population we study (e.g. 5 bottles)
- Object: individual of this sample (1 bottle)
- Population: e.g. all possible bottles
Variables:
Classification | Description | Examples
Nominal | Variables that can be described only with words and cannot be ranked | Colors
Ordinal | Variables that can be described only with words but can be ranked | Good, average, poor
Interval | Variables that can be described by a number | pH, concentration
Order of instruments:
- 0th: number (pH meter)
- 1st: vector (UV-VIS)
- 2nd: matrix (LC-MS)
- 3rd: cube (GCxGC-MS) very complex!
Errors:
- Total error: ei = xi - µ0
- xi : single measured value
- µ0 : true value
- ei = (xi - µ) + (µ - µ0), where (xi - µ) is the random error and (µ - µ0) is the systematic error
- µ: mean of the measurements if they were repeated an infinite number of times
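The decomposition of the total error can be checked numerically. The following is a minimal Python sketch, not part of the lecture: the replicate values and the true value are hypothetical, and the sample mean is used as a stand-in for µ, which strictly requires an infinite number of measurements.

```python
from statistics import mean

# Hypothetical replicate measurements and true value (illustrative only).
measurements = [10.2, 10.4, 10.1, 10.3, 10.5]
true_value = 10.0  # µ0

# Assumption: the sample mean is used as an estimate of µ.
mu_estimate = mean(measurements)

# Systematic part (µ - µ0): identical for every measurement.
systematic = mu_estimate - true_value

for x in measurements:
    total = x - true_value          # e_i = x_i - µ0
    random_part = x - mu_estimate   # (x_i - µ)
    # The total error is the sum of the random and systematic parts.
    print(f"x={x:.1f} total={total:+.2f} random={random_part:+.2f} "
          f"systematic={systematic:+.2f}")
```

With the mean estimated from the same replicates, the random parts sum to zero, which is why replicates alone cannot reveal the systematic part.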
Random errors:
- Affect precision – repeatability or reproducibility
- Cause replicate results to fall on either side of a mean value
- Can be estimated using replicate measurements
- Can be minimized by good technique, but not eliminated
- Caused by both operator and instrument
Systematic errors:
- Produce bias – an overall deviation of a result from the true value even when random errors
are very small
- Cause all results to be affected in one sense only, all too high or too low
- Cannot be detected simply by using replicate measurements
- Can be corrected, e.g. by using standard methods and materials
- Caused by both operator and instrument
Definitions:
- Precision: the closeness of agreement between the results obtained by applying the
experimental procedure several times under prescribed conditions; reflects random error and is
normally measured as the standard deviation
- Bias: difference between the value obtained after infinite measurements and the true value
(µ - µ0); relates to trueness and systematic error
- Accuracy: difference between the result of a determination and the true value; corresponds to the total error
- Repeatability: magnitude of differences when measurements are performed under the same
conditions (same lab, operator, instrument, day, etc.)
- Reproducibility: magnitude of differences when measurements are performed under
different conditions (different lab, operator, instrument, day, etc.)
- Method bias: systematic error attributed to an effect intrinsic to the method; treated as a systematic component
- Laboratory bias: systematic error attributed to an effect intrinsic to the laboratory; treated as a random component (it varies from laboratory to laboratory)
Lecture 2:
- Parametric: assumes that the data follow a certain distribution (e.g. the normal distribution)
- Non-parametric: does not assume that the data follow a particular distribution
Data analysis:
- Descriptive statistics: summarize sample (describe it)
- Inferential statistics: draw conclusions about the population from which the sample comes
- Modelling: establish a relationship between two (or more) variables
Standard deviation:
- Sample: s (Latin alphabet)
- Population: σ (Greek alphabet)
Type of data:
- Non-robust methods: suited to normally distributed data
- Robust methods: suited to non-normal data (e.g. skewed distributions) and data with outliers
Central tendency and dispersion measures:
- Mean x̄: sum of all measurements divided by the number of measurements
- Median x̃: value separating the higher half from the lower half of the measurements (the
middle value)
- Standard deviation s: quantifies the amount of variation or dispersion of the measurements
- Variance s²: square of the standard deviation
- Relative standard deviation (RSD) and coefficient of variation (CV): relative measures of spread, s / x̄ (often expressed as a percentage)
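The measures above can be computed directly. A minimal Python sketch using only the standard library follows; the data set is hypothetical and deliberately contains one outlier, to show that the median (robust) moves far less than the mean (non-robust).

```python
from statistics import mean, median, stdev, variance

# Hypothetical measurements; the last value is a deliberate outlier.
data = [4.1, 4.3, 4.2, 4.0, 4.4, 9.8]

x_bar = mean(data)      # mean: pulled upwards by the outlier
x_med = median(data)    # median: middle value, barely affected
s = stdev(data)         # sample standard deviation (n - 1 in the denominator)
s2 = variance(data)     # sample variance, equal to s squared
rsd_percent = 100 * s / x_bar  # relative standard deviation in percent

print(f"mean={x_bar:.3f} median={x_med:.3f} s={s:.3f} "
      f"s2={s2:.3f} RSD={rsd_percent:.1f}%")
```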
Normal distribution:
- Integrating the distribution function between two boundaries gives the probability of a value falling between them
- 68% of population values lie within 1σ of the mean
- 95% of population values lie within 2σ of the mean
- 0th moment: area
- 1st moment: mean (µ)
- 2nd moment: variance (σ²)
- Standardized normal variable: z = (x - µ) / σ
Matlab:
- cdf('Normal', x, µ, σ): cumulative distribution function, P(X ≤ x)
- icdf('Normal', p, µ, σ): inverse cumulative distribution function, the x for which P(X ≤ x) = p
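The same quantities can be explored outside Matlab. The sketch below is a Python analogue using `statistics.NormalDist` from the standard library (an assumption of this example, not the tool used in the lecture); the values of µ, σ and x are hypothetical.

```python
from statistics import NormalDist

# Hypothetical population parameters (illustrative only).
mu, sigma = 50.0, 2.0
dist = NormalDist(mu, sigma)

# Integrating between two boundaries = difference of two cdf values.
within_1_sigma = dist.cdf(mu + sigma) - dist.cdf(mu - sigma)          # about 0.6827
within_2_sigma = dist.cdf(mu + 2 * sigma) - dist.cdf(mu - 2 * sigma)  # about 0.9545

# Standardized normal variable for a single value x.
x = 53.0
z = (x - mu) / sigma

# cdf gives P(X <= x); inv_cdf is the inverse (like icdf in Matlab).
p = dist.cdf(x)
x_back = dist.inv_cdf(p)

print(f"within 1 sigma: {within_1_sigma:.4f}, within 2 sigma: {within_2_sigma:.4f}")
print(f"z={z:.2f}, P(X<=x)={p:.4f}, inverse gives x={x_back:.2f}")
```

Note that the standardized variable gives the same probability on the standard normal distribution: `NormalDist().cdf(z)` equals `dist.cdf(x)`.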
Confidence limits:
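- Sample means are approximately normally distributed, even when the individual measurements are not (central limit theorem)
- Assumption: no bias (or the bias is known)
- Random errors will always occur, so the sample mean is never exactly equal to the true value
- Standard error of the mean (SEM) = σ / √n
- Critical value for 95% confidence is always z = ±1.96: 95% of all values fall between -1.96 and 1.96
- 95% confidence limits of the mean: x̄ ± 1.96 σ / √n
- Problem: we know s, not σ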
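As an illustration of confidence limits when σ is assumed known, here is a minimal Python sketch. The function name `z_confidence_interval`, the data and the value of σ are hypothetical choices made for this example.

```python
from math import sqrt
from statistics import NormalDist, mean


def z_confidence_interval(values, sigma, confidence=0.95):
    """Confidence limits of the mean, assuming the population sigma is known."""
    n = len(values)
    sem = sigma / sqrt(n)  # standard error of the mean
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    x_bar = mean(values)
    return x_bar - z_crit * sem, x_bar + z_crit * sem


# Hypothetical replicates and a hypothetical known sigma.
lower, upper = z_confidence_interval([10.2, 10.4, 10.1, 10.3, 10.5], sigma=0.2)
print(f"95% confidence limits: {lower:.3f} to {upper:.3f}")
```

In practice σ is rarely known, which is the problem the t-distribution addresses.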
Student's t-test:
- σ is not known, but s is, so t can be calculated
- t = (x̄ - µ) / (s / √n)
- The critical t-value depends on the confidence level (e.g. 99%)
- Degrees of freedom: ν = n - 1. With an infinite number of samples (and degrees of freedom) the
t-distribution looks like the normal distribution: s is estimated with more certainty and the 95% critical value moves to 1.96
If variances are homogeneous, the variances from different experiments can be pooled
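A minimal Python sketch of the t statistic and of pooling variances is given below. The function names and data are hypothetical, and the pooling formula used (each variance weighted by its degrees of freedom) is the standard one rather than something stated in the notes. The standard library has no t-distribution, so the critical value still has to come from a t-table or a statistics package.

```python
from math import sqrt
from statistics import mean, stdev, variance


def t_statistic(values, mu0):
    """Return (t, degrees of freedom) for t = (x_bar - mu0) / (s / sqrt(n))."""
    n = len(values)
    t = (mean(values) - mu0) / (stdev(values) / sqrt(n))
    return t, n - 1


def pooled_variance(*groups):
    """Pool sample variances, weighting each by its degrees of freedom.

    Only meaningful if the group variances are homogeneous.
    """
    weighted_sum = sum((len(g) - 1) * variance(g) for g in groups)
    dof = sum(len(g) - 1 for g in groups)
    return weighted_sum / dof, dof


# Hypothetical replicates tested against a hypothetical reference value.
t, dof = t_statistic([10.2, 10.4, 10.1, 10.3, 10.5], mu0=10.0)
print(f"t = {t:.3f} with {dof} degrees of freedom")

# Hypothetical results from two experiments.
s2_pooled, dof_pooled = pooled_variance([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
print(f"pooled variance = {s2_pooled:.2f} with {dof_pooled} degrees of freedom")
```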
Lecture 3:
isnan(x): checks element-wise whether a value is NaN (not a number); returns 1 for not a number, 0 for a number
mean(x,2,'omitnan'): calculates the average of matrix x along the second dimension (from left to right,
giving one mean per row) and ignores the NaN values
Hypothesis testing:
- Example:
- Null hypothesis (H0): the method works (µ = µ0)
- Alternative hypothesis (H1): the method does not work (µ ≠ µ0)
- If the p-value is high, we assume that the method works:
p > α: accept H0 (H0 is not rejected)
- Example: Is the alcohol level of the driver above the maximum level?
- H0: µ ≤ µ0 H1: µ > µ0
- Example: Are these 5 instruments giving the same value?
- H0: µ1 = µ2 = µ3 = µ4 = µ5 H1: µ1 ≠ µ2, etc. (at least one mean differs)
When H0 is true:
Probability of wrongly rejecting H0 equals α: type I error (false positive)
Probability of correctly accepting H0 equals 1 - α: specificity of the test (true negative)
When H1 is true:
Probability of wrongly accepting H0 equals β: type II error (false negative)
Probability of correctly accepting H1 equals 1 - β: sensitivity or power of the test (true positive)
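The relation between α, β and power can be made concrete for a simple case. The sketch below assumes a one-sided z-test (H0: µ ≤ µ0, H1: µ > µ0) with known σ, which is a simplification chosen for this example; the function name and the numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def one_sided_z_power(delta, sigma, n, alpha=0.05):
    """Return (power, beta) of a one-sided z-test when the true mean is mu0 + delta."""
    std_normal = NormalDist()
    # H0 is rejected when z exceeds this critical value.
    z_crit = std_normal.inv_cdf(1 - alpha)
    # Under H1 the test statistic is shifted by delta * sqrt(n) / sigma.
    shift = delta * sqrt(n) / sigma
    beta = std_normal.cdf(z_crit - shift)  # probability of wrongly accepting H0
    return 1 - beta, beta


# Hypothetical true difference of 0.1 with sigma = 0.2, for two sample sizes.
for n in (5, 20):
    power, beta = one_sided_z_power(delta=0.1, sigma=0.2, n=n)
    print(f"n={n:2d}: beta={beta:.3f}, power={power:.3f}")
```

When delta is zero (H0 is true) the rejection probability equals α, and the power grows as n increases.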
P-value:
- The probability of obtaining the observed data (or data at least as extreme), supposing that the null hypothesis is true
- Not the probability that the null hypothesis is true given the data
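A p-value can be computed by hand in the simplest setting. The following Python sketch assumes a two-sided z-test with known σ (a simplification; with unknown σ the t-distribution would be needed); the function name and the numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean


def two_sided_z_p_value(values, mu0, sigma):
    """Return (z, p) for H0: mu = mu0, assuming the population sigma is known."""
    z = (mean(values) - mu0) / (sigma / sqrt(len(values)))
    # Probability, under H0, of a |z| at least as large as the one observed.
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Hypothetical replicates, reference value and sigma.
alpha = 0.05
z, p = two_sided_z_p_value([10.2, 10.4, 10.1, 10.3, 10.5], mu0=10.0, sigma=0.2)
decision = "reject H0" if p < alpha else "accept H0"
print(f"z={z:.2f}, p={p:.4f}: {decision}")
```

The result is a statement about the data given H0, not about the probability that H0 itself is true.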