The Influence of Luck in Scoring Streaks

The human brain craves patterns and apparent order, and sporting contests are a rich source of data for this universal pastime.

Examples include a tendency for one team to repeatedly beat another, the bounce-back ability of some teams following a disappointing setback, as was mentioned following Spurs’ defeat to Juventus last week, or a player’s copious scoring record against a particular opponent.

All of these types of streaks are easily pinpointed over the well-documented history of the Premier League.

The next step, once these repetitive events have been identified, is often to extrapolate the trends into the future in the form of a prediction for the next time the combatants meet.

Unfortunately, this seemingly reasonable approach to foretelling the future is rife with bias and faulty logic.

From the very start, selectively dicing the data may omit information that either fails to support or actively opposes the ultimate conclusion.

That Team A has beaten Team B in each of their last four meetings immediately tells the sceptic that the fifth previous meeting wasn’t as successful.

Four matches, most likely stretching over a couple of seasons, is also a small amount of information on which to base a conclusion, when there are likely to be 140 other games that took place involving one or other of the teams over the relevant timeframe.

Sporting contests are decided by a mixture of high levels of skill and also varying levels of luck.

This can be luck in the traditional form of good fortune, such as that which prevented the linesman from correctly flagging Harry Kane offside in the dying minutes of last week’s game against Juventus.

Or it can be the randomness of probabilistic events, which then saw his subsequent header, an 8% chance, strike the post rather than loop into the net.

A match result is decided by the actual outcome of a wide range of in-game events, each of which has a probabilistic likelihood of successfully occurring in favour of one or the other side.

So is it better to favour a probabilistic view of events, as estimated by such metrics as expected goals, or simply to make predictions based on the actual outcomes that decided the result on the day?

Premier League teams each play 38 league matches, and potentially many more cup matches, over a season. The more matches that are played, the more likely it is that seemingly unusual sequences of results will happen simply by chance.

To take an example from the non-sporting arena, everyone would concede that flipping ten consecutive heads with a coin is a very unusual happening.

But if you were prepared to flip a coin for long enough, probably hours, it is all but certain that you would eventually see ten heads (or tails) appear consecutively, just through randomness.

I’ve just run a simple simulation and the first such run of ten occurred after 1633 attempts.
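For anyone wanting to repeat the experiment, a minimal Python sketch of such a simulation is given here. It is not necessarily the simulation that produced the figure above: the function name `flips_until_run`, the `either_side` switch and the seed are all illustrative assumptions. For a fair coin, the average wait for a run of ten identical results is 2^10 - 1 = 1,023 flips (2,046 if only heads count), so a first appearance after 1633 attempts is entirely unremarkable.

```python
import random


def flips_until_run(run_length, rng, either_side=True):
    """Count fair-coin flips until the first run of `run_length` identical results.

    If either_side is False, only runs of heads count.
    `rng` is anything with a random() method returning a float in [0, 1).
    """
    streak = 0
    previous = None
    flips = 0
    while True:
        flips += 1
        heads = rng.random() < 0.5
        if either_side:
            # extend the run if this flip matches the last one, otherwise restart at 1
            streak = streak + 1 if heads == previous else 1
        else:
            # only heads extend the run; a tail resets it to zero
            streak = streak + 1 if heads else 0
        previous = heads
        if streak >= run_length:
            return flips


# Arbitrary seed (an assumption), used only so the run is repeatable
rng = random.Random(2018)
waits = [flips_until_run(10, rng) for _ in range(2000)]

print("shortest wait:", min(waits))
print("average wait:", sum(waits) / len(waits))
print("longest wait:", max(waits))
```

The spread between the shortest and longest waits is as instructive as the average: some experiments deliver the "remarkable" run almost immediately, others take many thousands of flips.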

If you chose to show someone a recording of these ten particular coin flips, while omitting the hundreds of outcomes prior to and following this wholly expected, but seemingly remarkable, sequence, it would be hard for that person not to draw faulty conclusions about the true probability of a future individual coin toss.

However, over time, the apparent significance of a run of ten heads would be swamped by many more less extreme sequences and the proportion of heads and tails flipped would trend ever closer to 50%.

Humans invariably underestimate the likelihood of a series of coin flips containing runs of consecutive heads or tails.

But if persuaded that runs of 10 or more will inevitably happen and that they’ve simply been shown a sequence from many hours of flipping a fair coin, they would hopefully conclude that the likelihood of the next flip being a head (or a tail) was still a 50/50 proposition.
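The 50/50 claim can be checked directly. The sketch below is illustrative only: the function name, the seed and the choice of a five-flip streak (short enough to occur thousands of times) are my assumptions. It flips a simulated fair coin many times and records how often a head follows a run of at least five straight heads.

```python
import random


def heads_rate_after_streak(n_flips, streak_length, rng):
    """Share of heads among flips that immediately follow at least
    `streak_length` consecutive heads."""
    streak = 0          # current run of consecutive heads
    followers = 0       # flips that came straight after a qualifying streak
    follower_heads = 0  # how many of those flips were heads
    for _ in range(n_flips):
        heads = rng.random() < 0.5
        if streak >= streak_length:
            followers += 1
            follower_heads += heads
        streak = streak + 1 if heads else 0
    return follower_heads / followers if followers else float("nan")


# Arbitrary seed (an assumption), used only for repeatability
rng = random.Random(7)
rate = heads_rate_after_streak(500_000, 5, rng)
print(f"heads after five straight heads: {rate:.3f}")
```

The printed rate should sit close to 0.5, because the coin has no memory of the streak that preceded the flip.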

Back in the sporting arena, by choosing to seek out these potentially atypical actual sequences of events, often reinforcing their impact by deliberately beginning the counting process after a success or a failure, we are compromising our ability to make an informed future judgement call.

In attempting to evaluate football teams, we don’t have the luxury of knowing outcome probabilities with the degree of confidence we have when dealing with fair coins.

Players carry injuries, mature and decline. Line-ups fluctuate, and opponents, ground conditions and preparations differ.

But just as a probability of 0.5 defines the chance of a fair coin being flipped heads, we can, albeit imperfectly, estimate the likely occurrence of many events in a sporting contest, based on modelling of independent variables against dependent ones that interest us from past occasions.

Expected goals models are used to illustrate the attacking process of both teams and players.

Figure: Harry Kane 2017/18 xG timeline

The plot above depicts each chance and the associated probability that has fallen to Harry Kane in the 2017/18 Premier League season.

The red bars represent actual goals Kane has scored and the blue ones are shots or headers that were either saved, blocked, hit the frame of the goal or missed the target entirely.

The red bars are Kane’s goal scoring outcome, whereas the red and blue bars together represent his underlying process of attempting to score a goal.

Unlike a coin flip, the xG probability that a chance is converted fluctuates from lows of 0.03 for speculative long-range shots to nearly 0.8 for penalty kicks.

But in a similar vein to coins being flipped, even over a single season, Kane has experienced barren periods and more productive, goal-laden sequences.

It took him 25 attempts to break his Premier League duck, but if we selectively top and tail a 16-attempt mid-season period, he scores six times in that span.

And while randomness cannot be entirely invoked to explain these streaks, it is likely to have been a significant and largely transient factor in Kane’s scoring during 2017/18, no matter how appealing any narrative-driven alternative explanation may appear.

An unexceptional coin, given enough time, can appear to be capable of landing exclusively on heads, but a much better indicator of what will happen in the long run is the underlying probability of 0.5 for each outcome, rather than a series of selectively chosen outcomes.
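To get a feel for how often chance alone produces a drought of that length, the rough sketch below treats every shot as an independent event with its own xG. The shot list is entirely hypothetical: the values are made up for illustration and are not Kane’s actual chances. The function names are likewise my own.

```python
import random
from math import prod


def prob_no_goals(xg_values):
    """Chance that every shot in the list fails, treating shots as independent."""
    return prod(1.0 - p for p in xg_values)


def longest_drought(xg_values, rng):
    """Simulate each shot once and return the longest run of consecutive misses."""
    longest = current = 0
    for p in xg_values:
        if rng.random() < p:
            current = 0          # goal: the drought ends
        else:
            current += 1         # miss: the drought grows
            longest = max(longest, current)
    return longest


# HYPOTHETICAL shot list: invented xG values, not Kane's real 2017/18 chances
shots = [0.03, 0.07, 0.12, 0.05, 0.30, 0.08, 0.15, 0.04, 0.10, 0.06] * 15

print(f"P(no goal from first 25 shots): {prob_no_goals(shots[:25]):.3f}")

# Arbitrary seed (an assumption), used only for repeatability
rng = random.Random(10)
droughts = [longest_drought(shots, rng) for _ in range(5000)]
share = sum(d >= 25 for d in droughts) / len(droughts)
print(f"simulated seasons with a 25+ shot drought: {share:.1%}")
```

Swapping in a real list of shot probabilities would show how surprising, or not, a 25-attempt goalless spell is for a striker whose underlying process has not changed.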

And the same is largely true of Harry Kane and any other striker.

Randomness could make him appear either lethal or toothless, but his underlying process, as represented by the probabilities of the chances he is involved in, is a much better indicator of his future performance than a drought or goal glut that may have been heavily influenced by luck.

