Predicting English Premier League winners

Manchester City, winners of last season’s English Premier League (EPL), have a 36.5 per cent chance of finishing top this season, according to the University of Adelaide’s Professor Steve Begg. Southampton and Sheffield United are most likely to be relegated, with six teams fighting to avoid the third relegation place.

The 2019/20 EPL kicks off on Saturday 10 August 2019. An estimated 4.7 billion people watch the league each year. It is the most-watched sports league in the world, broadcast in 212 territories to 643 million homes.

Professor Steve Begg from the University’s Australian School of Petroleum has developed a way of predicting the probability of each team finishing in any given league position using uncertainty modelling. He uses this technique in his ‘day job’: researching how to make better decisions in the oil and gas industry.

“When the outcomes of decisions are influenced by things beyond our control, such as developing oil and gas resources, the uncertainty often makes the best choice very counter-intuitive. Unfortunately, we humans have not evolved a good capability to reason correctly when we are in uncertain or complex situations.

“In the Premier League uncertainty influences the many ways the season might turn out. What makes the outcome of the league so hard to predict is not just uncertainty in a team’s average ‘season form’, but their specific ‘match form’, which also includes random factors that can influence the outcome of individual games.”

Professor Begg has used ‘Monte Carlo simulation’ to predict the probabilities of how the 2019/20 season will turn out. The technique was first developed in the Second World War by scientists working on the atomic bomb. It goes beyond many modern-day artificial intelligence and machine learning algorithms by generating the probabilities of many possible outcomes, not just a single ‘best’ prediction.

Based on performance over the last three years and player transfers during the current off-season, he assesses the uncertainty in both ‘season form’ and ‘match form’ for each team. He uses this information to generate possible ‘match form’ numbers for the teams involved in each game, and bases the score on their relative values.
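The article does not give the details of Professor Begg’s model, so the following is only a rough sketch of the general idea, under stated assumptions. Each team’s ‘match form’ is taken to be its ‘season form’ plus random game-day noise, and the score is drawn from Poisson distributions whose rates depend on the difference between the two match forms. The Poisson scoring rule, the function names and every parameter value (`match_sd`, `base_goals`, `home_advantage`) are illustrative assumptions, not figures from the research.

```python
import math
import random


def poisson(rng, lam):
    """Draw a Poisson-distributed integer (Knuth's multiplication method)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def simulate_match(rng, home_season_form, away_season_form,
                   match_sd=0.3, base_goals=1.35, home_advantage=0.25):
    """Return (home_goals, away_goals) for one simulated game.

    All parameter values are made up for illustration.
    """
    # 'Match form' = 'season form' plus random factors specific to this game.
    home_match_form = rng.gauss(home_season_form, match_sd)
    away_match_form = rng.gauss(away_season_form, match_sd)

    # The score is based on the relative values of the two match forms.
    diff = home_match_form - away_match_form
    home_rate = base_goals * math.exp(home_advantage + diff / 2)
    away_rate = base_goals * math.exp(-diff / 2)
    return poisson(rng, home_rate), poisson(rng, away_rate)


rng = random.Random(2019)
print(simulate_match(rng, home_season_form=1.0, away_season_form=0.0))
```

Because the match form is redrawn for every game, a weaker side can still win on the day, even though the stronger side comes out ahead over many repetitions.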

Professor Begg has generated 100,000 simulations of how the whole season of 380 matches could play out. Although there are 2,432 quadrillion ways the final positions could be arranged, 100,000 simulations are ‘more than enough’ to assess the probability of each team finishing in any given place. Each simulated season is constrained to be consistent with typical numbers of home wins, draws, losses, goals and points from the last 10 years of the EPL. He can run 10,000 different simulations per minute on his laptop computer.
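The figures quoted here can be checked with a few lines of arithmetic. The sketch below assumes the standard EPL format of 20 teams, each playing every other team once at home and once away, and ignores tied final positions when counting possible tables.

```python
import math

teams = 20

# Each team hosts every other team once.
matches = teams * (teams - 1)

# Number of distinct orderings of 20 teams in the final table.
orderings = math.factorial(teams)

# Time needed for 100,000 simulations at 10,000 simulations per minute.
minutes = 100_000 // 10_000

print(matches)              # 380
print(orderings)            # 2432902008176640000
print(orderings // 10**15)  # 2432, i.e. about 2,432 quadrillion
print(minutes)              # 10
```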

Professor Begg intends to update the season and match form uncertainties periodically over the course of the year, to take account of how the teams have actually performed.

Liverpool’s most likely finishing place is also first, but with a lower probability (28.1%) than Manchester City’s. Tottenham and Arsenal are each most likely to finish third (15.9% and 15.4% respectively), and Manchester United fourth (14.6%). Chelsea’s most likely place is fifth (13.7%), with Everton perhaps catching the ‘big six’.

“As a dyed-in-the-wool Manchester City fan since 1967 I would love to see them win their fifth title. However, I have learned that emotions and intuition often lead to poor predictions and bad decisions. I can do better by minimising personal biases and relying on uncertainty models, grounded in actual data,” he says.

EPL 2020 Table 1: Probability of final position (prior to start of season). The table shows, for each team, the probability (as a percentage) that they will finish in any given place in the league from 1 to 20. The darkest green colour shows the highest probability place and the darkest red the lowest probability. To read the figure, pick a team and then read the probabilities from left to right. The table on the left-hand side indicates each team’s placing with the highest probability and also the chance of being relegated.

EPL 2020 Table 2: Game Day. The table shows one of the 100,000 simulations of how the season might evolve. The line for each team shows their league placing from the first week to the last, so the last column is the final placing. Early on, the random ‘match form’ dominates, but over time the consistent ‘season form’ starts to emerge, still with an overlay of the uncertainty associated with the result of each match. For each team, each simulation uses a different ‘season form’ chosen to be consistent with the uncertainty for that team. The final positions for each team from the 100,000 simulations are used to generate the probabilities in Table 1.
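The step from many simulated seasons to a table of probabilities can be sketched as follows. This is a minimal illustration, not Professor Begg’s model. It uses a hypothetical four-team league with made-up form ratings, and a simple win/draw/loss rule based on the difference in match form. It has no home advantage and does not apply the constraint that each season match typical EPL totals. All names and parameter values are assumptions.

```python
import random
from collections import Counter
from itertools import permutations


def simulate_season(rng, form_mean, form_sd=0.15, match_sd=0.5, draw_margin=0.25):
    """Return team names ordered from first to last for one simulated season."""
    # Each simulation draws a different 'season form' for every team.
    season_form = {t: rng.gauss(mu, form_sd) for t, mu in form_mean.items()}
    points = dict.fromkeys(form_mean, 0)
    for home, away in permutations(form_mean, 2):  # every home/away pairing
        diff = (rng.gauss(season_form[home], match_sd)
                - rng.gauss(season_form[away], match_sd))
        if diff > draw_margin:
            points[home] += 3
        elif diff < -draw_margin:
            points[away] += 3
        else:
            points[home] += 1
            points[away] += 1
    # Ties on points are broken at random here (a simplification).
    return sorted(points, key=lambda t: (points[t], rng.random()), reverse=True)


def position_probabilities(form_mean, n_sims=2000, seed=1):
    """Percentage of simulated seasons in which each team finishes in each place."""
    rng = random.Random(seed)
    counts = {t: Counter() for t in form_mean}
    for _ in range(n_sims):
        for place, team in enumerate(simulate_season(rng, form_mean), start=1):
            counts[team][place] += 1
    return {t: {p: 100 * c / n_sims for p, c in sorted(counts[t].items())}
            for t in form_mean}


# Hypothetical league: ratings are invented for illustration only.
toy_league = {"A": 0.6, "B": 0.4, "C": 0.0, "D": -0.4}
table = position_probabilities(toy_league)
for team, probs in table.items():
    print(team, probs)
```

Each row of the output corresponds to one row of Table 1: reading across gives the chance of that team finishing in each place, and the values in a row add up to 100.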
