Closed Book Exam
This is a closed book exam: no course materials (slides, handouts, books, and papers) can be
used during the exam. One mark per question, except for the last question, which carries 2 bonus
marks. The grading
1. Which is not a reason why data mining technologies are attracting significant attention nowadays?
A. There is too much data for manual analysis
B. Data are difficult to transfer from databases
C. Data can be a resource for competitive advantage
D. Machine learning algorithms are easily available
E. None of the above
2. Regression is distinguished from classification by:
A. class probability estimation
B. numerical attributes
C. numerical target variable
D. hypothesis testing
E. None of the above
3. Entropy
A. is a measure of information gain
B. is used to calculate information gain
C. is a measure of correlation between numeric variables
D. denotes the amount of chaos in the data
E. describes the amount of outliers in the data
4. Which of the following is not true about logistic regression?
A. Logistic regression can be used to predict the probability of membership in a certain class.
B. Logistic regression takes a categorical target variable in training data.
C. A logistic regression represents the odds of class membership as a linear function of the
attributes.
D. Logistic regression requires numeric attributes and categorical attributes should be converted
to numeric attributes.
E. A logistic regression represents the odds of class membership as a nonlinear function of the
attributes.
5. An example of a supervised learning algorithm is
A. Statistical analysis
B. Neural network
C. Clustering techniques
D. Naïve Bayesian algorithm
E. None of the above
6. A fitting curve plots:
A. True positive rate vs. false positive rate
B. True positive rate vs. false negative rate
C. Generalization performance vs. size of training set
D. Generalization performance vs. model complexity
E. None of the above
7. When the causal relation between the input and the output variables is too complex, one would use:
A. Statistical modeling
B. Supervised learning
C. Unsupervised learning
D. All the above
E. None of the above
8. The variable marital status can be categorized using the codes (1) single, (2) married, and (3)
divorced. This is an example of a:
A. Ordinal variable
B. Nominal variable
C. Interval variable
D. Ratio variable
E. None of the above
9. Consider the following decision tree: