Summary Machine Learning - PYTHON PART (complete walkthrough)
3 June 2020 · 46 pages · academic year 2019/2020 · by jeroenverboom
Python for Machine Learning
Huge potential helper: https://ml-cheatsheet.readthedocs.io/en/latest/loss_functions.html


Notebook 1: Evaluation
MAE & MSE
We want to calculate the MAE and the MSE for the evaluation of the model.

MAE: to calculate the mean absolute error, we need three steps:
1. Transform the predicted values and the actual values to arrays
2. Calculate the difference between the two and take the absolute value
3. Take the mean of the absolute errors
In code that looks something like this:
import numpy as np

def MAE(pred, actual):
    abs_error = abs(np.array(actual) - np.array(pred))
    mae = sum(abs_error) / len(actual)
    return mae

MSE: we do the exact same, except now we square the error instead of taking its absolute value.
def MSE(pred, actual):
    sq_error = (np.array(actual) - np.array(pred)) ** 2
    mse = sum(sq_error) / len(actual)
    return mse
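As a sanity check, both metrics can be verified against hand-computed values (the example data below is mine, not from the notebook):

```python
import numpy as np

def MAE(pred, actual):
    # mean of the absolute differences
    return np.mean(np.abs(np.array(actual) - np.array(pred)))

def MSE(pred, actual):
    # mean of the squared differences
    return np.mean((np.array(actual) - np.array(pred)) ** 2)

pred = [2, 4, 6]
actual = [3, 4, 8]
print(MAE(pred, actual))  # (1 + 0 + 2) / 3 = 1.0
print(MSE(pred, actual))  # (1 + 0 + 4) / 3 ≈ 1.667
```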



Binary classification
In this exercise we need to calculate the accuracy of a spam filter, which classifies messages as spam or non-spam. To calculate the accuracy, we need to see how often the filter was (in)correct. The trick is to check when the prediction is equal to the actual value. We can take two routes:
A) For loop:
The steps we need to undertake are:
1. Make a range of the length of the dataset
2. Iterate over each element in the dataset and check if y_pred == y_true; increment a counter if True
3. Divide the count by the total number of predictions.
def accuracy(y_true, y_pred):
    count = 0
    for i in range(len(y_true)):
        if y_true[i] == y_pred[i]:
            count += 1
    return count / len(y_true)
B) NumPy
Steps:

1. We transform both the predicted and the actual values into arrays
2. We compare the two arrays element-wise. The output is an array of booleans: True, True, True, False, True, False, False, etc.
3. Because booleans count as 0 and 1, we can use np.mean() to get the rate at which y_pred == y_true → this rate is equal to the accuracy.

def accuracy_np(y_true, y_pred):
    acc2 = np.mean(np.array(y_true) == np.array(y_pred))
    return acc2
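Both routes should give the same answer; a small self-contained check (example labels are mine):

```python
import numpy as np

y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1]

# Route A: explicit loop, counting matches
count = sum(1 for t, p in zip(y_true, y_pred) if t == p)
acc_loop = count / len(y_true)

# Route B: mean of the boolean comparison array
acc_np = np.mean(np.array(y_true) == np.array(y_pred))

print(acc_loop, acc_np)  # both 5/6 ≈ 0.833
```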

Building a confusion matrix
If we want to calculate the recall and precision, we need a confusion matrix. We start off by making an empty matrix, which we then fill with values. This works for both binary and multi-class classification.
1. First, we check how many unique classes the list has. The function np.unique collects all unique values in an array; the len() function turns this into an int.
2. We make an empty matrix of shape N × N using the np.zeros() function
3. We use a for loop to iterate over the two zipped lists: y_true and y_pred.
4. We use the values in each iteration step to index a position in the matrix, and we add 1 to that position.

import numpy as np

def confusion_matrix(y_true, y_pred):
    N = len(np.unique(y_true))
    M = np.zeros((N, N))
    for i, j in zip(y_true, y_pred):
        M[i, j] += 1    # row = actual class, column = predicted class
    return M

def precision(M):
    TP = M[1, 1]
    FP = M[0, 1]
    return TP / (TP + FP)
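The precision helper reads TP and FP straight out of the matrix; recall works the same way. A sketch using the same M[actual, predicted] layout, where the false negatives sit at M[1, 0] (the example matrix is mine):

```python
import numpy as np

def recall(M):
    TP = M[1, 1]   # actual 1, predicted 1
    FN = M[1, 0]   # actual 1, predicted 0
    return TP / (TP + FN)

# rows = actual class, columns = predicted class
M = np.array([[5., 1.],
              [2., 4.]])
print(recall(M))  # 4 / (4 + 2) ≈ 0.667
```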



You can also use set() operations. Check notebook 1, exercise 7 for this.
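The set() route from exercise 7 is not reproduced here, but one way it could work is my own sketch below: assuming 0/1 labels, compare the index sets of positive predictions and positive actuals.

```python
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 1]

# indices where each list is positive
pos_true = {i for i, v in enumerate(y_true) if v == 1}
pos_pred = {i for i, v in enumerate(y_pred) if v == 1}

TP = len(pos_pred & pos_true)   # predicted 1 and actually 1
FP = len(pos_pred - pos_true)   # predicted 1 but actually 0
precision = TP / (TP + FP)
print(precision)  # 2 / 3
```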

Notebook 2: Decision Trees
Decision trees have a recursive structure: if condition A holds, then move on to the following check. The example below shows how recursive functions work in Python. Essentially, you call the function within the function, but with a different input than the first call. This is an example of a recursive function calculating the factorial:

def factorial(n):
    if n == 0:
        print("This I know! (the base case)")
        return 1
    else:
        print("I don't know the factorial for", n, "let's try", n - 1)
        return n * factorial(n - 1)

factorial(5)

In the if-statement you define the base case. This is essential, because the function keeps calling itself until it reaches the base case. Under the hood, Python keeps every pending call on the call stack; when the base case is reached, the calls unwind and each level knows which value of n to multiply back in.

Example 2:

def rec_sum(a):
    if len(a) == 1:
        return a[0]
    else:
        return a[0] + rec_sum(a[1:])

rec_sum([1, 2, 3, 4, 5, 6])
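Note that rec_sum as written raises an IndexError on an empty list; a variant with an empty-list base case (my addition, not from the notebook):

```python
def rec_sum(a):
    if len(a) == 0:   # empty list: nothing left to add
        return 0
    return a[0] + rec_sum(a[1:])

print(rec_sum([1, 2, 3, 4, 5, 6]))  # 21
print(rec_sum([]))                  # 0
```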

Example 3: we need to count the number of brackets in this nested list:
nested = [[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[13]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]
To do so we use a function that checks whether its content is a list or not. It keeps doing this until the content is an integer:

def search(a, depth=0):
    if isinstance(a, list):
        return search(a[0], depth + 1)
    else:
        return depth
a[0] returns the 0th element of the list → it therefore removes one pair of brackets. Done recursively, this evaluates a[0], a[0][0], a[0][0][0] and so on, until the content is no longer a list, because 13 is an int. Meanwhile, each recursive call increases the depth by 1.
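A self-contained check of this idea, where the nesting is built programmatically so the depth is known by construction (rather than retyping the bracket literal above):

```python
def search(a, depth=0):
    if isinstance(a, list):
        return search(a[0], depth + 1)
    return depth

nested = 13
for _ in range(37):   # wrap 13 in 37 layers of lists
    nested = [nested]

print(search(nested))  # 37
```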

Recursion in decision trees
Recursive functions are very useful when dealing with tree structures, which are recursive structures themselves. We do not know how deep the tree is. All we can see is whether the node we are currently looking at has any children; if it does, we visit those and repeat. Decision trees are usually full binary trees, which means that every node has either 0 or 2 children. If it has 0, it is a leaf node.

We start off by creating a function with which we can create a node:

def Node(left=None, right=None, feature=None, value=None, predict=None):
    """Return a node in a binary decision tree."""
    return dict(left=left, right=right, feature=feature, value=value, predict=predict)

def isLeaf(node):
    """Helper function to check if the current node is a leaf."""
    return node['left'] is None and node['right'] is None

Now that we can construct empty nodes, we can start giving the tree content:

# We want to first ask about value Round in column at index 2.
root = Node(feature=2, value="Round",
            # If false, in the left branch, which is a leaf node, we'll predict Banana
            left=Node(predict="Banana"),
            # If true, in the right branch we'll ask about the color Red
            right=Node(feature=1, value="Red",
                       # Based on the answer to the question about color Red,
                       # we'll predict either Lime
                       left=Node(predict="Lime"),
                       # or Apple
                       right=Node(predict="Apple")))
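The notebooks presumably also traverse this tree to classify a row; a minimal sketch of such a traversal (my own, assuming a row is a sequence indexed by feature, and that the node's value is checked with equality):

```python
def Node(left=None, right=None, feature=None, value=None, predict=None):
    return dict(left=left, right=right, feature=feature,
                value=value, predict=predict)

def classify(node, row):
    # Leaf node: return its stored prediction
    if node['left'] is None and node['right'] is None:
        return node['predict']
    # Internal node: condition true -> right branch, false -> left branch
    if row[node['feature']] == node['value']:
        return classify(node['right'], row)
    return classify(node['left'], row)

root = Node(feature=2, value="Round",
            left=Node(predict="Banana"),
            right=Node(feature=1, value="Red",
                       left=Node(predict="Lime"),
                       right=Node(predict="Apple")))

# Hypothetical rows: color at index 1, shape at index 2
print(classify(root, ["?", "Red", "Round"]))    # Apple
print(classify(root, ["?", "Green", "Round"]))  # Lime
print(classify(root, ["?", "Yellow", "Long"]))  # Banana
```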
