Results Prediction of English Premier League (EPL) using Machine Learning Algorithms
Course
FC6P01 PROJECT (FC6P01)
Institution
London Metropolitan University (LMU)
Module Code & Module Title
FC7P01NI Dissertation
MSc Project
Topic: Result Prediction of season 2020/21 for English
Premier League using Machine Learning Algorithm
FC7P01NI MSc Project
Abstract
This dissertation is written as part of the MSc program in Data Science at London Metropolitan
University. The purpose of this paper is to predict the results of the 2020/21 season of the English
Premier League. Several studies have predicted the results of previous English Premier League
seasons. However, no significant work has been done on predicting the results of the 2020/21
season. To predict the results, I used logistic regression, support vector machines and K-nearest
neighbors. An exploratory data analysis was also conducted to gain more insight into the
datasets. I also evaluated the percentage chance that a match will be won by the home team, won
by the away team, or drawn. My results support the use of logistic regression for predicting
English Premier League matches. They also indicate that Manchester City have the highest chance
of winning the league title in the 2020/21 season.
Keywords: machine learning, logistic regression, support vector machines, K-nearest
neighbors, multiclass classification, sports results prediction.
Chapter 1: Introduction
1.1 Background
Football, sometimes referred to as soccer, is one of the most popular sports in the world. It is played,
watched, and enjoyed by billions of people worldwide. According to the statistics published by
TopTrend, “there are 3.5 billion football fans across Europe, Asia, Africa and America”
(TopTrendSports, 2014). It has been ranked as the world’s most popular sport.
As of 2018, 80 percent of people in the UAE declared themselves either interested or very
interested in football (Oregonreigns, 2021).
Many competitions are played in football. One of the most popular is the English Premier
League (Pilger, 2014). The English Premier League is the most watched league on the planet, with
one billion fans spread across 188 countries (Leauge, 2018). The league held its first season in
1992-93, when it was composed of twenty-two clubs (Leauge, 2018). The number of clubs was
reduced to twenty in 1995 (France, 2020). At present, twenty clubs play in the EPL.
With the advancement of technology, analytics has become popular in recent years. As stated in
an article by Fingent, “Analytics have completely disrupted the way organizations go about with their
business by using one commodity that is data.” (Fingent, 2020). Analytics has been used in sports
too. While the theory of sports analytics might have been around since the 1980s, it was hugely
popularized by Billy Beane, the general manager of an American baseball team (Phillips, 2020).
Written by Michael Lewis, ‘Moneyball: The Art of Winning an Unfair Game’ was the first book on
sports analytics (Soccerment Research, 2019). Match analysis in football has also increased
over the years. “Match Analysis in football has really come to prominence over the last 5 years. The
primary reason being the accessibility of football data and the growing analytics community behind
it.” (Eccles, 2017).
In football too, clubs like to gain a competitive edge on and off the pitch, and big data allows them
to extract the necessary insights (Business, 2018). These insights have helped to improve players’
average stats, such as the number of goals they score, the number of fouls they commit, the number
of red and yellow cards they receive, and many more.
According to an article by Intel, “Philippe Coutinho scored a free kick against Barcelona in December
2017 by firing his shot underneath the jumping defensive wall, Jürgen Klopp credited his analytics for
pointing out the opportunity” (BeSoccer, 2018).
In 2014, Bing correctly predicted the outcomes of all fifteen games in the knockout round of the
2014 World Cup (Nisen, 2018), an accuracy of 100 percent for that round. As reported
by the Guardian, “Manchester United, a team of English Premier League, have won against Swansea,
through short and long passes” (Jackson, 2018).
Kaggle, an online community of data scientists and machine learning practitioners, hosts a yearly
competition called “March Madness”, in which many data scientists predict the winners and losers
of games (Kaggle, 2021).
1.2 Goal of the project
With so much money and emotion invested in the outcomes of professional matches, the English
Premier League attracts its share of urban myths and armchair managers. Having been a fan of
Liverpool for years, I am excited to bring the power and insight of
machine learning to bear on English Premier League matches.
One of my goals in this thesis is to predict the winning team. I will use machine learning
techniques to predict the full-time result (FTR).
I will predict FTR by evaluating the three possible outcomes: home team win, draw and
away team win. Because the outcome takes one of three values, predicting FTR can be categorized
as a multiclass classification problem.
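To make the three-way outcome concrete, here is a minimal sketch of how an FTR label could be derived from the full-time score. The function name and the label codes "H", "D" and "A" are assumptions chosen for illustration; the labels used in the actual dataset may differ.

```python
def full_time_result(home_goals: int, away_goals: int) -> str:
    """Return an assumed FTR label: "H" (home win), "D" (draw) or "A" (away win)."""
    if home_goals > away_goals:
        return "H"
    if home_goals < away_goals:
        return "A"
    return "D"


# Hypothetical scorelines, one for each of the three classes.
labels = [full_time_result(2, 1), full_time_result(0, 0), full_time_result(1, 3)]
```

Because every match falls into exactly one of the three labels, a classifier for FTR has to choose among three classes rather than two.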
Since it is a multiclass classification problem, I analyzed different machine learning techniques
such as logistic regression, support vector machines and k-nearest neighbors.
I then divided the data into two parts: the feature set and the target variable. All the columns
except FTR are features considered for training the model.
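The split into a feature set and a target variable can be sketched as follows, together with a very small k-nearest neighbors classifier written from scratch. This is an illustrative sketch only, not the implementation used in this project: the column names `home_shots` and `away_shots`, the toy values and both function names are hypothetical.

```python
from collections import Counter
from math import dist


def split_features_target(rows, target="FTR"):
    """Return (X, y): every column except `target` is treated as a feature."""
    X = [[value for name, value in row.items() if name != target] for row in rows]
    y = [row[target] for row in rows]
    return X, y


def knn_predict(X_train, y_train, x, k=3):
    """Predict the majority label among the k training rows closest to x."""
    nearest = sorted(zip(X_train, y_train), key=lambda pair: dist(pair[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data; column names and values are made up for illustration.
matches = [
    {"home_shots": 15, "away_shots": 5, "FTR": "H"},
    {"home_shots": 14, "away_shots": 6, "FTR": "H"},
    {"home_shots": 6, "away_shots": 14, "FTR": "A"},
    {"home_shots": 5, "away_shots": 16, "FTR": "A"},
    {"home_shots": 10, "away_shots": 10, "FTR": "D"},
    {"home_shots": 9, "away_shots": 9, "FTR": "D"},
]

X, y = split_features_target(matches)
prediction = knn_predict(X, y, [13, 7], k=3)
```

In this toy example the three nearest neighbors of the new match are two home wins and one draw, so the majority vote is a home win. Real match data would need many more features and rows, and the features would normally be scaled before distances are computed.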
1.3 Motivation
Data science is often used in football to evaluate a team’s performance and to use that
information to predict results (Bouley, 2020). The outcome of a football match is often difficult to
predict, as many factors influence the game (Punter, 2017). The
possible outcomes of a football match are win, lose or draw, so it can seem quite
straightforward to predict the outcome of a game. However, from 1992 to 2017, the average number
of goals scored per game in the EPL was fewer than 3 (Wright, 2021), so a single goal can change the result.
A potential solution to this problem is to explore the in-game statistics to dive deeper than the simple
match results. In the last few years, in-depth match statistics have been made available (Herbinet,
2018). Due to this, expected goals metrics have been developed in which the estimate of number of