<DONGBEEN YUN>
Load the data first:
my_dataframe <- read.csv("filename.csv", stringsAsFactors = TRUE)

Q1. [Sample median, max, min] What is the sample median of the variable price for that part of the sample for which the variable shops takes the value Far?
-> my_sub_dataframe <- subset(my_dataframe, shops == "Far")
   median(my_sub_dataframe$price, na.rm = TRUE)
(median can be replaced by min, max, etc. Don't forget to put na.rm = TRUE.)

Q2. What is the number of observations for which views is at least 10 and at most 16 (so bounds included)? Do not count missing values.
-> my_variable <- my_dataframe$views
   length(which((my_variable >= 10) & (my_variable <= 16)))
(which() skips missing values, so NAs are not counted.)
(Variant: the question may be about the first observations only, e.g. observations 1 to 10.)

Q3. Make a bar chart of the variable type. Make sure that you use the color grey58 for the bars, that your x-label is type, and that the plot has no title.
-> barplot(table(my_dataframe$type), col = "grey58", xlab = "type", main = "")
(If a y-label is asked -> ylab = "object"; if the bars must be horizontal -> horiz = TRUE.)

Also with two variables, for example:
Q3.1 Make a stacked barplot of the categorical variable type (vertical) against highway (horizontal). Ensure that your x-label is highway, your y-label is type, and that the plot has no title.
-> plot(my_dataframe$highway, my_dataframe$type, main = "", xlab = "highway", ylab = "type")
A histogram takes the same main, xlab and ylab arguments, but hist() accepts only one numeric variable:
-> hist(my_dataframe$price, main = "", xlab = "price")

Q4. Binomially distributed with probability π = ? and number of trials n = ?
- if the random variable is exactly equal, P[ X = number ]
-> dbinom(number, size = n, prob = π)
- if the random variable is greater or equal, P[ X ≥ number ]
-> pbinom(number - 1, size = n, prob = π, lower.tail = FALSE)
- if the random variable is strictly greater, P[ X > number ]
-> pbinom(number, size = n, prob = π, lower.tail = FALSE)
- if the random variable is smaller or equal, P[ X ≤ number ]
-> pbinom(number, size = n, prob = π)
- if the random variable is strictly smaller, P[ X < number ]
-> pbinom(number - 1, size = n, prob = π)

<Remind> The "number - 1" shifts above apply only to discrete distributions:
Binomial distribution: (qbinom, dbinom, pbinom)
Poisson distribution: (ppois, qpois, dpois)
Hypergeometric distribution: (phyper, qhyper, dhyper)

## qnorm threshold
Ex) You observe a normal random variable with mean -20 and standard deviation 20. You are interested in a threshold value such that the probability is 0.19 that your random variable lies above this threshold. What is the value of this threshold?
-> qnorm(0.19, mean = -20, sd = 20, lower.tail = FALSE)

Q5. (Hypergeometric distribution) You take a sample (without replacement) of size 34 from a batch of 99 faulty and 89 correct products. What is the probability of observing strictly more than 21 faulty products?
( m = number of faulty products, n = number of correct products, k = sample size )
-> phyper(21, m = 99, n = 89, k = 34, lower.tail = FALSE)
Strictly more -> P[ X > x ] -> phyper(x, ..., lower.tail = FALSE)
Strictly less -> P[ X < x ] -> phyper(x - 1, ...)
qhyper if a critical value is asked, dhyper if P[ X = x ] is asked.
phyper, qhyper and dhyper belong to a discrete distribution, so be careful! See Remind.

Q6. Poisson distribution with mean λ = ?, with random variable = q
-> ppois(q, lambda = ?)
If a critical value is asked -> qpois(p, lambda = ?)
If the random variable is exactly equal to q -> dpois(q, lambda = ?)
Be careful: in some cases use lower.tail = FALSE.
This is a discrete distribution, so look at the Remind!

Q7. Normal distribution with mean = M and sd = SD (sd = sqrt(var) if the variance is given, otherwise just the number), with random variable = q
-> pnorm(q, mean = M, sd = SD)
If a critical value is asked -> qnorm(p, mean = M, sd = SD)
Be careful: in some cases use lower.tail = FALSE, when [ X > q or X ≥ q ].

Q8. Student t distribution with 13 degrees of freedom. What is this value if the probability of outcomes below this value equals 0.13?
-> qt(0.13, df = 13)
(A value is asked for a given probability, so this is the quantile function.)
If instead the probability below a value q is asked -> pt(q, df = ?)
Be careful: in some cases use lower.tail = FALSE, when [ X > q or X ≥ q ].

Q9. F distribution with 7 and 8 degrees of freedom, respectively. What is the probability that you observe an outcome strictly less than 1.96?
-> pf(1.96, df1 = 7, df2 = 8)
If a critical value is asked -> qf(p, df1 = ?, df2 = ?)
Be careful: in some cases use lower.tail = FALSE, when [ X > q or X ≥ q ].

Q10. Exponential distribution with arrival rate λ = 17. What is this value if the probability of outcomes below this value equals 0.49?
-> qexp(0.49, rate = 17)
(A value is asked for a given probability, so this is the quantile function.)
If instead the probability below a value q is asked -> pexp(q, rate = ?)
Be careful: in some cases use lower.tail = FALSE, when [ X > q or X ≥ q ].

Q11. Chi-squared distribution with 12 degrees of freedom. What is the probability that you observe an outcome smaller than 9.44?
-> pchisq(9.44, df = 12)
If a critical value is asked -> qchisq(p, df = ?)
Be careful: in some cases use lower.tail = FALSE, when [ X > q or X ≥ q ].

Q12. (Constructing a confidence interval) For the variable price, you want to construct a confidence interval for the mean of price at a confidence level of 0.8. What is the upper boundary of this confidence interval?
-> t.test(my_dataframe$price, conf.level = 0.8)
(To obtain a confidence interval for a mean, use t.test; the output prints the lower bound first, then the upper bound.)

Q13. For the variable shops, you want to construct a confidence interval for the proportion π in the population for which shops = Far. You use a confidence level of 0.95. What is the UPPER bound of this confidence interval if you use the z-statistic approach to compute it?
-> zcrit <- qnorm(0.975)
   my_subdataframe <- subset(my_dataframe, !is.na(shops))
   n <- nrow(my_subdataframe)
   k <- length(which(my_subdataframe$shops == "Far"))
   p <- k / n
   p + zcrit * sqrt(p * (1 - p) / n)
(If you can't get a result:
1. -> nrow(my_dataframe)
2. -> summary(my_dataframe), to find k in the sample
3. -> then compute p = k / n with the numbers put in directly, like p = 560 / 1702.)

Q14. For the variable nrbids, you want to construct a confidence interval for the variance σ² in the population. You use a confidence level of 0.8. What is the UPPER bound of this confidence interval?
-> my_subdataframe <- subset(my_dataframe, !is.na(nrbids))
   sigma2 <- var(my_subdataframe$nrbids)
   n <- nrow(my_subdataframe)
   chicrit_low <- qchisq(0.2 / 2, df = n - 1)
   (n - 1) * sigma2 / chicrit_low

Q15. For the variable views, you want to perform a right-sided t-test to see whether the mean is greater than 14.2. What is the p-value of your test statistic? (conf.level is not really needed for the p-value.)
-> t.test(my_dataframe$views, alternative = "greater", mu = 14.2, conf.level = 0.8)

Q15.1 For the variable prevprice, you want to test H0: μ ≥ 256853.4 using a standard t-test. What is the p-value of your statistic?
-> t.test(my_dataframe$prevprice, alternative = "less", mu = 256853.4, conf.level = 0.95)

Q16. For the variable prevprice, you want to test a hypothesis H0 about μ1 − μ2 using a 95% confidence interval. Here, μ1 is the prevprice mean of the population part for which the variable type has the value Apartment, whereas μ2 corresponds to the prevprice mean in the rest of the population. You cannot assume that the variances for the two subpopulations above are equal. First throw away any observations for which either prevprice or type is missing. What is the UPPER boundary of the confidence interval for μ1 − μ2?
-> "RFC":
   my_selection1 <- subset(my_dataframe, !is.na(type))
   my_selection2 <- subset(my_selection1, !is.na(prevprice))
   my_index <- which(my_selection2$type == "Apartment")
   my_sample1 <- my_selection2$prevprice[my_index]
   my_sample2 <- my_selection2$prevprice[-my_index]

   t.test(my_sample1, my_sample2, conf.level = 0.95, var.equal = FALSE)
(If you can assume equal variances, put var.equal = TRUE.)

Q16.1 For the variable nrbids, you want to test H0: μ1 − μ2 = −0.5 using a t-test. Here μ1 is the nrbids mean of the population part for which the variable type has the value House, whereas μ2 corresponds to the nrbids mean in the rest of the population. You cannot assume that the variances for the two subpopulations above are equal. First throw away any observations for which either nrbids or type is missing. What is the p-value of your test statistic?
-> Run "RFC" first (with nrbids and type == "House"), and then
   t.test(my_sample1, my_sample2, alternative = "two.sided", mu = -0.5, var.equal = FALSE)
(If you can assume that the variances of the two subpopulations are equal, use var.equal = TRUE. If H0: μ1 − μ2 ≥ (number), use alternative = "less"; if H0: μ1 − μ2 ≤ (number), use alternative = "greater".)
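The tail rules for discrete distributions (Q4 and the Remind) are easy to get wrong by one. A minimal sketch that checks each pbinom shortcut against a direct sum of dbinom values; the numbers n_trials, prob and number are hypothetical, chosen only for illustration:

```r
# Hypothetical values, not taken from any exam question
n_trials <- 10
prob     <- 0.3
number   <- 4

x   <- 0:n_trials
pmf <- dbinom(x, size = n_trials, prob = prob)   # P[X = x] for every possible x

# lower.tail = FALSE returns P[X > q], hence the shift by one for ">="
p_ge <- pbinom(number - 1, size = n_trials, prob = prob, lower.tail = FALSE)  # P[X >= number]
p_gt <- pbinom(number,     size = n_trials, prob = prob, lower.tail = FALSE)  # P[X >  number]
p_le <- pbinom(number,     size = n_trials, prob = prob)                      # P[X <= number]
p_lt <- pbinom(number - 1, size = n_trials, prob = prob)                      # P[X <  number]

# Each shortcut should agree with the summed point probabilities
all.equal(p_ge, sum(pmf[x >= number]))
all.equal(p_gt, sum(pmf[x >  number]))
all.equal(p_le, sum(pmf[x <= number]))
all.equal(p_lt, sum(pmf[x <  number]))
```

The same check works for ppois and phyper, since they follow the same lower.tail convention.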
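Several questions (Q8, Q10 and the qnorm threshold example) ask for a value given a probability, which is the job of the q-functions; the p-functions go the other way. A short sketch of that round trip, using only the numbers already given in those questions:

```r
# Q10: value below which the probability is 0.49 (exponential, rate 17)
value_exp <- qexp(0.49, rate = 17)
pexp(value_exp, rate = 17)              # feeding the value back returns the probability

# Q8: value below which the probability is 0.13 (t distribution, 13 df)
value_t <- qt(0.13, df = 13)
pt(value_t, df = 13)

# qnorm threshold example: probability 0.19 of lying ABOVE the threshold
threshold <- qnorm(0.19, mean = -20, sd = 20, lower.tail = FALSE)
pnorm(threshold, mean = -20, sd = 20, lower.tail = FALSE)
```

If the probability you get back is not the one in the question, the wrong function family or the wrong tail was used.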
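The hand-computed intervals in Q13 and Q14 can be tried out without the exam file. The sketch below uses a small invented data frame called toy (an assumption for illustration, not the real data) and returns both bounds of each interval, so the upper bound is the second element:

```r
# Toy data, invented for illustration only
toy <- data.frame(
  shops  = factor(c("Far", "Near", "Far", NA, "Near", "Far", "Near", "Near", "Far", "Near")),
  nrbids = c(3, 5, NA, 2, 8, 4, 6, 7, 5, 4)
)

# Q13 pattern: z-based 95% interval for the proportion with shops == "Far"
shops_ok <- subset(toy, !is.na(shops))       # drop missing values first
n        <- nrow(shops_ok)
k        <- sum(shops_ok$shops == "Far")     # number of "Far" observations
p        <- k / n                            # sample proportion (k over n, not n over k)
zcrit    <- qnorm(0.975)
prop_ci  <- p + c(-1, 1) * zcrit * sqrt(p * (1 - p) / n)

# Q14 pattern: chi-squared 80% interval for the variance of nrbids
bids_ok <- subset(toy, !is.na(nrbids))
m       <- nrow(bids_ok)
s2      <- var(bids_ok$nrbids)
alpha   <- 1 - 0.8
var_ci  <- c((m - 1) * s2 / qchisq(1 - alpha / 2, df = m - 1),   # lower bound
             (m - 1) * s2 / qchisq(alpha / 2,     df = m - 1))   # upper bound

prop_ci
var_ci
```

Note that the upper bound of the variance interval divides by the lower chi-squared quantile, which is why Q14 uses qchisq(0.2 / 2, ...).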
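For Q16 and Q16.1 the answer is a single number inside the t.test output. A minimal sketch, again on an invented toy data frame (an assumption, not the real data), showing the "RFC" steps and how the upper boundary and the p-value can be read from the returned object through its conf.int and p.value components:

```r
# Toy data, invented for illustration only
toy <- data.frame(
  type      = factor(c("Apartment", "House", "Apartment", "House", NA,
                       "Apartment", "House", "House", "Apartment", "House")),
  prevprice = c(210, 340, 195, NA, 280, 225, 310, 365, 205, 330)
)

# "RFC" steps: drop rows with a missing type or prevprice, then split in two samples
sel     <- subset(toy, !is.na(type) & !is.na(prevprice))
idx     <- which(sel$type == "Apartment")
sample1 <- sel$prevprice[idx]     # Apartment
sample2 <- sel$prevprice[-idx]    # rest of the population

# Welch two-sample t-test (variances not assumed equal)
res <- t.test(sample1, sample2, conf.level = 0.95, var.equal = FALSE)

res$conf.int[2]   # upper boundary of the interval for mu1 - mu2
res$p.value       # p-value of the test statistic
```

Storing the result in a variable avoids copying numbers from the printed output by hand.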