Laura van den End
Introduction to Python
General
• The symbol for exponentiation is **.
• Access an element of a list or array by its index using [ ].
• When using [… : …] to make a subset, the element at the last index is not included.
List methods
Lists have methods, which are built-in functions that belong to the list.
• Index: gives you the index of a certain element
# Print out the index of the element 20.0
print(areas.index(20.0))
(areas is the list name in this case)
• Count: gives you the number of occurrences of an element
# Print out how often 9.50 appears in areas
print(areas.count(9.50))
• Append: adds an element to the end of the list it is called on
areas.append(24.5)
• Remove: removes the first element of a list that matches the input
• Reverse: reverses the order of the elements in the list it is called on
areas.reverse()
Packages
Once a package is installed, import it using
import numpy
or
import numpy as np
When using a function from that package, always prefix it with the name of the package (or its alias):
numpy.array()
np.array()
Math package
• Pi: the constant pi (π)
math.pi
• Radians function: converts degrees into radians
math.radians(degrees)
NumPy
• Array function: creates an array, which is similar to a list, but you can perform element-wise calculations on arrays, not on lists.
np.array([…])
bmi > 23 creates an array of booleans that is False where the bmi is 23 or lower and True where it is above 23.
bmi[bmi > 23] creates an array containing only the values that are above 23.
Example: calculate BMI
# Import numpy
import numpy as np
# Create array from height_in with metric units: np_height_m
np_height_m = np.array(height_in) * 0.0254
# Create array from weight_lb with metric units: np_weight_kg
np_weight_lb = np.array(weight_lb)
np_weight_kg = np_weight_lb * 0.453592
# Calculate the BMI: bmi
bmi = np_weight_kg / np_height_m**2
# Print out bmi
print(bmi)
SIDE EFFECTS
• NumPy arrays cannot contain elements with different types. If you try to build such an array, some of the elements' types are changed to end up with a homogeneous array. This is called type coercion.
• The typical arithmetic operators, such as +, -, * and /, have a different meaning for regular Python lists and NumPy arrays.
2D ARRAYS
You can have multi-dimensional arrays:
array_2d = np.array([[… , … , …], [… , … , …]])
• The number of rows and columns is retrieved with array_2d.shape. The output here is (2, 3): two rows and three columns.
• You can select one row by using [ ]. You can also select a specific element from that row by using a second set of [ ]: array_2d[0][2] or array_2d[0, 2]
my_array[rows, columns]
OTHER FUNCTIONS
• Mean function: to get the average
• Median function: to get the middle value when sorted from small to big
• Corrcoef function: to check the correlation between two variables, for example height and weight
np.corrcoef(np_city[:, 0], np_city[:, 1])
• Std function: to calculate the standard deviation
• Column_stack function: to combine one-dimensional arrays as the columns of one 2D array
np.column_stack((height, weight))
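The NumPy points in these notes (operators that behave differently on lists and arrays, type coercion, boolean filtering, 2D indexing and the summary functions) can be tried out together in one short sketch. The sample values in height_in and weight_lb are made up for illustration and do not come from the notes.

```python
import numpy as np

# Hypothetical sample data (not from the original notes).
height_in = [74, 72, 69]
weight_lb = [160, 215, 210]

# The same operator means different things for lists and arrays.
list_sum = [1, 2, 3] + [4, 5, 6]                       # lists: concatenation
array_sum = np.array([1, 2, 3]) + np.array([4, 5, 6])  # arrays: element-wise addition

# Type coercion: mixing a float, a string and a boolean gives an array of strings.
mixed = np.array([1.0, "is", True])

# BMI with element-wise arithmetic, followed by a boolean filter.
np_height_m = np.array(height_in) * 0.0254
np_weight_kg = np.array(weight_lb) * 0.453592
bmi = np_weight_kg / np_height_m ** 2
high_bmi = bmi[bmi > 23]          # keeps only the values above 23

# 2D array: column_stack places the two 1D sequences side by side as columns.
np_city = np.column_stack((height_in, weight_lb))
first_height = np_city[0, 0]      # row 0, column 0
all_weights = np_city[:, 1]       # every row, column 1

print(list_sum)
print(array_sum)
print(mixed.dtype.kind)           # 'U' stands for a unicode string type
print(np_city.shape)
print(np.mean(all_weights), np.median(all_weights), np.std(all_weights))
print(np.corrcoef(np_city[:, 0], np_city[:, 1]))
```

With three made-up data points the correlation matrix is not meaningful; it only shows the shape of the call.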
Evolutiebiologie
 
Example soccer
• Convert heights and positions, which are regular
lists, to numpy arrays. Call them np_heights and
np_positions.
• Extract all the heights of the goalkeepers. You can
use a little trick here: use np_positions == 'GK' as
an index for np_heights. Assign the result to
gk_heights.
• Extract all the heights of all the other players. This
time use np_positions != 'GK' as an index for
np_heights. Assign the result to other_heights.
# Convert positions and heights to numpy arrays: np_positions, np_heights
np_positions = np.array(positions)
np_heights = np.array(heights)
# Heights of the goalkeepers: gk_heights
gk_heights = np_heights[np_positions == 'GK']
# Heights of the other players: other_heights
other_heights = np_heights[np_positions != 'GK']
# Print out the median height of goalkeepers.
print("Median height of goalkeepers: " +
str(np.median(gk_heights)))
# Print out the median height of other players.
print("Median height of other players: " +
str(np.median(other_heights)))
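The soccer example relies on positions and heights already existing. A self-contained version might look like the sketch below, where both lists hold made-up values (the notes do not give the real data) and the boolean mask is stored in a helper variable, is_gk, that is not part of the original exercise.

```python
import numpy as np

# Hypothetical sample data; the original notes do not list the values.
positions = ['GK', 'M', 'A', 'D', 'GK', 'D']
heights = [191, 184, 185, 180, 188, 179]

np_positions = np.array(positions)
np_heights = np.array(heights)

# Boolean mask: True where the player is a goalkeeper.
is_gk = np_positions == 'GK'

gk_heights = np_heights[is_gk]        # same as np_heights[np_positions == 'GK']
other_heights = np_heights[~is_gk]    # same as np_heights[np_positions != 'GK']

print("Median height of goalkeepers: " + str(np.median(gk_heights)))
print("Median height of other players: " + str(np.median(other_heights)))
```

Storing the mask once and inverting it with ~ avoids writing the comparison twice; the two separate comparisons used in the exercise give the same result.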
Intermediate Python
Data visualisation
Use the subpackage pyplot from matplotlib, imported as plt.
import matplotlib.pyplot as plt
• Line plot
plt.plot(x, y)
• Scatter plot
plt.scatter(x, y)
• Histogram
plt.hist(data, bins = nr)
• To show the plot:
plt.show()
• Change the x-axis to a logarithmic scale
plt.xscale('log')
• Clean the plot
plt.clf()
Customization
• Add labels
plt.xlabel('label')
plt.ylabel('label')
• Add a title
plt.title('title')
• Change y-axis (first list: tick positions, second list: names of the ticks, one per position)
plt.yticks([0, 2, 4, 6, 8, 10], […])
• Add more data
year = [1800, 1850, 1900] + year
pop = [1.0, 1.262, 1.650] + pop
• Add text
plt.text(1550, 71, 'India')
• Add grid
plt.grid(True)
• Add color using dictionaries
Instead of 2 separate lists with countries and populations:
world = {'Afghanistan':30.55, 'Albania':2.77, 'Algeria':29.21}
Key:value → the key opens the door to the value:
world['Albania'] gives 2.77 (keys are case-sensitive)
Add elements to a dictionary (can also be used to update values)
world['sealand'] = 0.000027
Delete elements from a dictionary
del(world['sealand'])
Dictionary of dictionaries
europe = { 'spain': { 'capital':'madrid', 'population':46.77 },
'france': { 'capital':'paris', 'population':66.03 },
'germany': { 'capital':'berlin', 'population':80.62 },
'norway': { 'capital':'oslo', 'population':5.084 } }
Create a dictionary, named data, with the keys 'capital' and 'population'. Set them to 'rome' and 59.83, respectively.
data = {'capital':'rome', 'population':59.83}
Add data to europe under key 'italy'
europe['italy'] = data
Pandas
Make tables → dataframe
Rows and columns have labels, and the columns can hold different data types.
Manually
dict = {
"country":["Brazil", "Russia", "India"],
"capital":["Brasilia", "Moscow", "New Delhi"],
"area":[8.516, 17.10, 3.286],
"population":[200.4, 143.5, 1252] }
import pandas as pd
brics = pd.DataFrame(dict)
Change the labels of the rows
brics.index = ["BR", "RU", "IN"]
Import from external file
brics = pd.read_csv("path/to/brics.csv", index_col = 0)
Select one column as a dataframe:
brics[["country"]]
Select rows as a dataframe:
brics[1:4]
brics.loc[["RU"]]
brics.loc[["RU", "IN", "CH"]]
(Every label passed to loc must exist in the index; "CH" is not in the three-row dataframe built manually above.)
Combined rows and columns (iloc does the same with index numbers instead of labels):
brics.loc[["RU", "IN", "CH"], ["country", "capital"]]
brics.loc[: , ["country", "capital"]]
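To see the nested dictionary and the pandas selections working together, here is a small runnable sketch. It reuses values from the notes but restricts itself to the three rows of the manual example. The name columns_dict is our own choice (the notes use dict, which hides Python's built-in of the same name), and the iloc positions are an assumption based on the row and column order shown.

```python
import pandas as pd

# Nested dictionary: chained keys reach the inner value.
europe = {'spain': {'capital': 'madrid', 'population': 46.77},
          'france': {'capital': 'paris', 'population': 66.03}}
europe['italy'] = {'capital': 'rome', 'population': 59.83}
italy_capital = europe['italy']['capital']

# DataFrame built from a dictionary of columns.
columns_dict = {
    "country": ["Brazil", "Russia", "India"],
    "capital": ["Brasilia", "Moscow", "New Delhi"],
    "area": [8.516, 17.10, 3.286],
    "population": [200.4, 143.5, 1252],
}
brics = pd.DataFrame(columns_dict)
brics.index = ["BR", "RU", "IN"]

country_series = brics["country"]     # single brackets: a Series
country_frame = brics[["country"]]    # double brackets: a DataFrame

rows_slice = brics[1:3]               # rows by position, end not included

# Same selection twice: by label with loc, by integer position with iloc.
by_label = brics.loc[["RU", "IN"], ["country", "capital"]]
by_position = brics.iloc[[1, 2], [0, 1]]

print(italy_capital)
print(by_label)
print(by_label.equals(by_position))
```

The last line compares the two selections, which is a quick way to check that a label-based and a position-based lookup point at the same cells.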