Assignment 5: Machine Learning (Weka)
➢ Train a tree on the data from the golf playing example mentioned in the
slides to make sure you can reconstruct the decision tree from the slides
(don't forget to select that you're loading a CSV file). Pick the J48
algorithm, Weka's implementation of C4.5, which is an extension of ID3.
Select the “Use training set” test option. Visualise the tree to verify it's
the same as on the slides.
○ Make sure you understand the confusion matrix given by Weka.
■ I will add a screenshot of the confusion matrix; maybe you can add a
short paragraph discussing and explaining it, to show that you
‘understand’ it.
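As background on how ID3-style learners such as J48 pick their splits: they start from the entropy of the class distribution and prefer attributes that reduce it most. The sketch below is only an illustration, not Weka's code. It computes the entropy at the root of the golf tree, assuming the class counts of 9 "Yes" and 5 "No" that appear in the confusion-matrix discussion in this document; the function name `entropy` is a hypothetical helper.

```python
from math import log2


def entropy(counts):
    """Shannon entropy (in bits) of a class distribution given as counts."""
    total = sum(counts)
    # Classes with a zero count contribute nothing and are skipped.
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)


# Assumed class distribution of the golf data: 9 "Yes" and 5 "No" instances.
root_entropy = entropy([9, 5])
print(f"{root_entropy:.3f}")  # 0.940
```

A leaf that contains only one class has entropy 0, which is why a tree that separates the training data perfectly ends in pure leaves.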
=== Summary ===
Correctly Classified Instances          14              100 %
Incorrectly Classified Instances         0                0 %
Kappa statistic                          1
Mean absolute error                      0
Root mean squared error                  0
Relative absolute error                  0 %
Root relative squared error              0 %
Total Number of Instances               14
=== Detailed Accuracy By Class ===
TP Rate  FP Rate  Precision  Recall  F-Measure  MCC    ROC Area  PRC Area  Class
1,000    0,000    1,000      1,000   1,000      1,000  1,000     1,000     No
A confusion matrix is used in classification problems to evaluate the effectiveness of a machine
learning model. It compares the predicted classifications to the actual classifications, allowing us to
see how well the model is performing.
➢ Accuracy: The proportion of correctly classified instances out of the total.
(5+9) / (5+0+0+9) = 14/14 = 100%.
➢ Precision: The proportion of true positives out of all predicted positives, TP / (TP+FP).
For class "Yes", it is 9 / (9+0) = 100%.
➢ Recall (Sensitivity or True Positive Rate): The proportion of true positives out of the actual
positives, TP / (TP+FN).
For class "Yes", it is 9 / (9+0) = 100%.
➢ Specificity (True Negative Rate): The proportion of true negatives out of the actual negatives,
TN / (TN+FP).
With "Yes" as the positive class, the negatives are the "No" instances, so it is 5 / (5+0) = 100%.
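The confusion matrix shows that the classifier classified all 14 instances correctly. Because the
tree was evaluated on the same data it was trained on (the “Use training set” option), this perfect
score shows that the tree fits the training data, not how well it would generalise to new data.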
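The four metrics above can be reproduced with a few lines of Python. This is a minimal sketch, not Weka's own computation: the confusion matrix is reconstructed from the counts used in the calculations above (9 "Yes" and 5 "No" instances, no errors), and the orientation (rows are actual classes, columns are predicted classes, "Yes" is the positive class) as well as the function name `confusion_metrics` are assumptions made for illustration.

```python
# Assumed confusion matrix for the golf example:
#                 predicted Yes   predicted No
# actual Yes            9              0
# actual No             0              5
tp, fn = 9, 0  # actual "Yes": predicted Yes / predicted No
fp, tn = 0, 5  # actual "No":  predicted Yes / predicted No


def confusion_metrics(tp, fn, fp, tn):
    """Return accuracy, precision, recall and specificity for a 2x2 matrix."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


metrics = confusion_metrics(tp, fn, fp, tn)
for name, value in metrics.items():
    print(f"{name}: {value:.0%}")  # each metric prints as 100%
```

With a less perfect classifier, the same function shows how the metrics diverge: for example, `confusion_metrics(8, 1, 2, 3)` gives an accuracy of 11/14 but a specificity of only 3/5.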
➢ Use the medical heart example data to train a decision tree. Again,
visualise the tree.
Material to hand in:
○ A screenshot of your decision tree.
=== Run information ===
Scheme: weka.classifiers.trees.J48 -C 0.25 -M 2
Relation: heart
Instances: 918
Attributes: 12
Age
Sex
ChestPainType
RestingBP
Cholesterol
FastingBS
RestingECG
MaxHR
ExerciseAngina
Oldpeak
ST_Slope
HeartDisease
Test mode: evaluate on training data