Assignment-5 with Correct Answers Michigan Technological University MATH MA 5790
Assignment 5
Raghavendran Shankar
1. The hepatic injury data set was described in the introductory chapter and
contains 281 unique compounds, each of which has been classified as causing no
liver damage, mild damage, or severe damage (Fig. 1.2). These compounds were
analyzed with 184 biological screens (i.e., experiments) to assess each
compound’s effect on a particular biologically relevant target in the body. The
larger the value of each of these predictors, the higher the activity of the
compound. In addition to biological screens, 192 chemical fingerprint predictors
were determined for these compounds. Each of these predictors represents a
substructure (i.e., an atom or combination of atoms within the compound) and
is either a count of the number of substructures or an indicator of presence or
absence of the particular substructure. The objective of this data set is to build a
predictive model for hepatic injury so that other compounds can be screened for
the likelihood of causing hepatic injury. Start R and use these commands to load
the data:
(a) Given the classification imbalance in hepatic injury status, describe how you
would create a training and testing set.
A: We use stratified random sampling to split the data, which copes with the imbalance in
hepatic injury status. The split is stratified on the injury label (None, Mild, Severe), so
the training and test sets keep approximately the same class proportions as the full data
set. In R, this can be done with the createDataPartition() function from the caret package.
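The idea behind a stratified split can be sketched in a few lines of Python. This is only an illustration of sampling within each class, not a reproduction of what caret does internally. The function name stratified_split, the 80/20 split fraction, and the class counts used below are assumptions made for the example.

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, train_fraction=0.8, seed=0):
    """Return (train_idx, test_idx), sampling each class separately."""
    rng = random.Random(seed)

    # Group row indices by class label.
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)

    train_idx, test_idx = [], []
    for idx in by_class.values():
        rng.shuffle(idx)
        # Take the same fraction from every class so proportions are kept.
        n_train = round(train_fraction * len(idx))
        train_idx.extend(idx[:n_train])
        test_idx.extend(idx[n_train:])
    return sorted(train_idx), sorted(test_idx)


# Hypothetical class counts that sum to 281; the text does not state them.
labels = ["None"] * 106 + ["Mild"] * 145 + ["Severe"] * 30
train_idx, test_idx = stratified_split(labels)

train_counts = Counter(labels[i] for i in train_idx)
test_counts = Counter(labels[i] for i in test_idx)
print(train_counts)
print(test_counts)
```

Because each class is sampled on its own, the rarest class is guaranteed to appear in both sets in roughly the same proportion, which a plain random split does not guarantee.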
(b) Which classification statistic would you choose to optimize for this exercise and
why?
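A: We choose accuracy as the statistic to optimize. Accuracy is the proportion of
compounds whose injury class is predicted correctly, so it gives a single overall measure of
how well a classification model performs and can be used to select the optimal model.
With imbalanced classes, it is best read relative to the proportion of the largest class.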
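As a small worked example of how accuracy behaves on imbalanced classes, the sketch below computes accuracy together with Cohen's Kappa for a model that always predicts the largest class. Kappa is not the statistic chosen in the answer above; it is included only for comparison. The function names and the ten toy labels are assumptions made for the example, not data from the assignment.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Proportion of samples whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Agreement beyond chance: (observed - expected) / (1 - expected)."""
    n = len(y_true)
    observed = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement from the marginal class frequencies.
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    # Note: undefined (division by zero) when expected agreement is exactly 1.
    return (observed - expected) / (1 - expected)


# Toy labels (hypothetical): a model that always predicts the majority class.
y_true = ["Mild"] * 6 + ["None"] * 3 + ["Severe"] * 1
y_pred = ["Mild"] * 10

acc = accuracy(y_true, y_pred)
kappa = cohen_kappa(y_true, y_pred)
print(acc, kappa)
```

In this toy case the accuracy equals the share of the largest class while Kappa is zero, which is why accuracy on imbalanced data is usually compared against that baseline.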
(c) Split the data into a training and a testing set, pre-process the data, and build models
described in this chapter for the biological predictors and separately for the chemical
fingerprint predictors. Which model has the best predictive ability for the biological
predictors and what is the optimal performance? Which model has the best predictive
ability for the chemical predictors and what is the optimal performance? Based on
these results, which set of predictors contains the most information about hepatic
toxicity?
A:
Biological Data:
[Model output for GLM, PLSDA, LDA, and GLMNET]