Practical 2: Advanced Data Analysis: full summary + explanations
Course
Advanced Data Analysis (2052FBDBMW)
Institution
Universiteit Antwerpen (UA)
A detailed description of each step of practical 2, including screenshots and explanations of the steps taken in R. This document covers everything seen in class and all the exercises required for practical 2 of Advanced Data Analysis in the first master of Biomedical S...
Practical 2: Statistical analysis in R
Before you start:
- Create a folder / directory on your PC where you put all files related to this practical
- Set this folder as your working directory:
setwd("C:/wherever your files are")
1. Demo: the independent-samples t-test
Data in long format: ozoneLong.txt
Read in the dataset ozoneLong.txt. This small dataset contains measurements of ozone (O3) in
two gardens, labelled A and B.
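As a minimal sketch of the read-in step: the layout of ozoneLong.txt is not shown here, so the example below assumes a whitespace-separated text file with a header row and the columns garden and ozone. It writes a tiny stand-in file with made-up values (demoFile is a hypothetical name) so that the code runs on its own.

```r
# Hypothetical stand-in for ozoneLong.txt: made-up values and an ASSUMED layout
# (whitespace-separated, header row with the columns "garden" and "ozone").
demoFile <- tempfile(fileext = ".txt")
writeLines(c("garden ozone",
             "A 3.1",
             "A 2.8",
             "B 4.9",
             "B 5.4"), demoFile)

# header = TRUE: the first line holds the column names.
# stringsAsFactors = TRUE: store garden as a factor (a grouping variable).
myData <- read.table(demoFile, header = TRUE, stringsAsFactors = TRUE)

head(myData)  # show the first rows to check that the import worked
```

If the real file has this layout, the same call with "ozoneLong.txt" in place of demoFile should read it from the working directory.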
Wide format: the concentrations of the two gardens are in two separate columns. The data are not
paired: the observations are independent.
Long format: the same data arranged differently, with one column that tells you which garden an
observation comes from and one column that gives the ozone value.
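The difference between the two layouts can be illustrated with a small sketch. The values below are made up for illustration and are not the measurements in the practical's files; ozoneWide and ozoneLong are hypothetical names. One possible way to go from wide to long in base R is stack().

```r
# Made-up wide-format data: one column per garden (NOT the real measurements).
ozoneWide <- data.frame(A = c(3.1, 2.8, 3.6, 4.0, 2.5, 3.3),
                        B = c(4.9, 5.4, 4.2, 6.1, 5.0, 5.6))

# stack() puts all values in one column ("values") and the former
# column names in a factor column ("ind").
ozoneLong <- stack(ozoneWide)

# Rename the columns to match the long-format description in the text.
names(ozoneLong) <- c("ozone", "garden")

ozoneLong  # 12 rows: one row per measurement
```

The long layout is the one the formula interface expects: one column for the measured value and one for the group.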
Here we want to know if there is a difference in ozone concentrations between the 2 gardens. The
null hypothesis is there is no difference in ozone between the gardens.
- Parametric tests: e.g. the t-test
  o Only allowed if you have enough observations and the data are approximately normally
  distributed
- If the conditions for parametric testing are not fulfilled, use the Mann-Whitney U test
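One way to check the normality condition and to run the non-parametric alternative is sketched below. The notes do not say which normality check the course uses, so the Shapiro-Wilk test (shapiro.test) is an assumption here. In R, the Mann-Whitney U test is available as wilcox.test (Wilcoxon rank-sum test). The data are made up for illustration.

```r
# Made-up long-format data (NOT the values in ozoneLong.txt).
myData <- data.frame(
  garden = factor(rep(c("A", "B"), each = 6)),
  ozone  = c(3.1, 2.8, 3.6, 4.0, 2.5, 3.3,
             4.9, 5.4, 4.2, 6.1, 5.0, 5.6)
)

# Shapiro-Wilk test per garden (an assumed choice of normality check).
# A small p-value suggests the data deviate from a normal distribution.
pNormal <- sapply(split(myData$ozone, myData$garden),
                  function(v) shapiro.test(v)$p.value)
pNormal

# Non-parametric alternative: Mann-Whitney U test (Wilcoxon rank-sum test).
mwu <- wilcox.test(ozone ~ garden, data = myData)
mwu$p.value
```

With samples this small a normality test has little power, so a plot of the data is a useful complement.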
Use the formula interface: t.test(continuousVar ~ groupingVar, data = myData)
This pattern comes back often: a formula is usually the first argument of a function for a statistical test.
In general it has the form y ~ x, where y is the dependent variable (for t.test a continuous variable,
here ozone) and x defines the groups (which garden an observation comes from). The data argument
names the data frame that contains both variables.
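A minimal sketch of the formula interface in use, on made-up data (not the values in ozoneLong.txt). The column names ozone and garden follow the description of the long format; result is a hypothetical name.

```r
# Made-up long-format data (NOT the values in ozoneLong.txt).
myData <- data.frame(
  garden = factor(rep(c("A", "B"), each = 6)),
  ozone  = c(3.1, 2.8, 3.6, 4.0, 2.5, 3.3,
             4.9, 5.4, 4.2, 6.1, 5.0, 5.6)
)

# y ~ x: ozone is the continuous variable, garden defines the two groups.
result <- t.test(ozone ~ garden, data = myData)

result$estimate  # mean ozone per garden
result$p.value   # p-value for H0: no difference between the gardens
result           # full printed summary of the test
```

By default t.test does not assume equal variances (Welch's t-test); adding var.equal = TRUE gives the classical Student's t-test.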
First explore the data:
- How are the data organized? Check the environment in RStudio, or use the str() function.
What do the two variables mean?
- Generate a plot to visualize the ozone concentration in the two gardens
plot(myData$garden, myData$ozone)
This automatically gives a boxplot when garden is stored as a factor.
It looks like the ozone concentrations are higher in garden B than in garden A.
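The exploration steps can be sketched as follows, again on made-up data rather than the real file. The axis labels and the name medians are illustrative additions.

```r
# Made-up long-format data (NOT the values in ozoneLong.txt).
myData <- data.frame(
  garden = factor(rep(c("A", "B"), each = 6)),
  ozone  = c(3.1, 2.8, 3.6, 4.0, 2.5, 3.3,
             4.9, 5.4, 4.2, 6.1, 5.0, 5.6)
)

# Structure of the data frame: variable names, types and first values.
str(myData)

# A factor on the x-axis makes plot() draw one box per garden.
plot(myData$garden, myData$ozone,
     xlab = "Garden", ylab = "Ozone concentration")

# The same plot written with the formula interface.
boxplot(ozone ~ garden, data = myData)

# Median per garden, to put a number on what the boxplot shows.
medians <- tapply(myData$ozone, myData$garden, median)
medians
```

If garden is read in as a character column instead of a factor, converting it with factor() first, or using the boxplot() call, still gives one box per garden.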