Using Expected Goals To Describe the Ebb and Flow of a Single Match

One of the inevitable difficulties when bringing so-called advanced football metrics to a wider audience is the terminology used, as well as the presentation of the data.

Expected goals, often abbreviated to xG, has entered the mainstream of “Match of the Day”, but the name has been adopted almost by default and even within the analytics community there are dissenting voices.

Shot quality has been proposed as a more intuitive alternative, although even this may add to the confusion, as it appears to omit headers, when they are usually included.

“How likely is he/she to score from that field position” covers most interpretations, but it lacks brevity. 

For those not fully immersed in the insular world of football analytics, it is very easy to find the initial hurdles too remote from the reality of an actual game of football and interest is quickly lost. 

Equally, numerical descriptions of on-field events, often quoted to a couple of decimal places, are used as a matter of necessity when attempting to gain spreadsheet-based insights from the copious amounts of data produced by a typical match.

But it’s difficult to relate to a “scoreline” of 2.25 xG against 1.67 xG in a single match, particularly if the actual result was a 1-2 victory for the latter side, as was the case in Manchester United’s home Champions League defeat to Sevilla.

Expected goals are at their most useful when aggregated over larger samples of matches and used to identify those sides whose process of chance creation may have been good, but who, through the randomness that exists in every natural process, haven’t received a fair return for their efforts.

“Luck” tends to be less extreme going forward, so a side’s process, as described by their expected goals numbers, is likely to be a more predictive estimate of their future results.

Does this mean that expected goals for single games aren’t useful?

Football commentators and pundits are often tasked with describing matches with very few scoring events. 

Typically, around a third of Premier League games reach half-time stalemated at 0-0. Describing both the quality and quantity of chances that have been created by each side will then be the mainstay of a half-time report, to give a flavour of which team has been the more dominant of the two.

Inevitably, such summaries may overestimate or underestimate the likelihood that a chance turns into a goal, as subjective assessments are rarely consistent and are often blighted by outcome bias.

A shot from thirty yards out is successful around 2% of the time, but it is difficult not to be impressed if such a shot occasionally produces a spectacular save or drifts inches wide of the post, and descriptive biases can creep into factual reports.

Expected goals models largely overcome the problem of a subjective analysis. 

An additional benefit of having a numerically based assessment of goal attempts is that it can also be used to create thousands of alternative realities by running Monte Carlo simulations of each opportunity created.

Of particular importance to a game is scoring the opening goal. Not only does it give one side a lead in a low scoring contest, it also allows them to partly dictate the course of the remainder of the game.

Balancing the amount of resources committed to defending a lead against the potential rewards of actively seeking further goals is an essential part of every manager’s skillset, and leading a game is always preferable to drawing or trailing.

It’s a trivial computational task to run simulated outcomes of 20 or 30 goal attempts, so that we may quantify how likely each side is to have opened the scoring, as well as estimating the likelihood of a match still being scoreless.
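As a rough illustration of such a simulation, the sketch below steps through a list of chances in chronological order and records which side scores first in each simulated match. The function name, the team labels and the xG values are all hypothetical and are not taken from any real match. It also assumes that each chance is an independent event scored with a probability equal to its xG, which is a simplification.

```python
import random


def simulate_first_goal(chances, n_sims=20_000, seed=1):
    """Estimate who scores first from a chronological list of (team, xg) pairs.

    Assumes each chance is an independent event that is scored with
    probability equal to its xG, which is a simplification.
    """
    rng = random.Random(seed)
    counts = {team: 0 for team, _ in chances}
    counts["scoreless"] = 0
    for _ in range(n_sims):
        for team, xg in chances:
            if rng.random() < xg:      # this chance is scored
                counts[team] += 1
                break                  # only the opening goal matters here
        else:
            counts["scoreless"] += 1   # no chance was converted
    return {outcome: n / n_sims for outcome, n in counts.items()}


# Hypothetical chances, NOT the real data from any match.
example_chances = [
    ("home", 0.04), ("away", 0.07), ("home", 0.03),
    ("away", 0.05), ("home", 0.10), ("away", 0.02),
]
result = simulate_first_goal(example_chances)
print(result)
```

The three proportions returned sum to one, so the share of simulations that finish scoreless falls out of the same run as each side's share of opening goals.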

Many have been critical of Jose Mourinho’s cautious approach to Manchester United’s second leg Champions League tie against Sevilla, following the goalless first leg.

Manchester United vs Sevilla shot map - as can be seen on the Infogol App

As the Infogol plot shows, United created 2.25 xG compared to just 1.67 xG from their Spanish visitors, yet they lost the tie 2-1.

A more detailed examination of the xG timeline for the match highlights not only the uneven nature of United’s balance of risk and reward throughout the match, but also the dangers that Mourinho’s initial passive approach invited.

Firstly, 56% of United’s total xG for the night came in the final 16 minutes as a response to Sevilla’s quick-fire brace of late goals. By that stage United needed three goals without reply to qualify and Sevilla unsurprisingly chose to prioritize defence over further concerted attack.

Perhaps more telling, the chances created prior to Sevilla’s two goals in the 74th and 78th minutes comprised 25 low quality chances, 13 for the visitors and 12 for the home team.

On average, each of these 25 chances had only around a 1 in 20 likelihood of resulting in a goal, and we can chronologically simulate the chances in the match from the first whistle up to the two Sevilla goals to see which team was more likely to score first.
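For a question as simple as who scores first, the same quantities can also be calculated exactly rather than simulated. The sketch below is only a sanity check under loudly stated assumptions: every one of the 25 chances is given the same 0.05 likelihood, the chances are treated as independent, and the two sides are assumed to alternate, which is a hypothetical ordering. The function name is illustrative and the real xG value and timing of each chance are not given here.

```python
def first_goal_probabilities(chances):
    """Exact first-goal probabilities from chronological (team, xg) pairs.

    Assumes chances are independent and scored with probability xg.
    """
    still_scoreless = 1.0   # probability that no goal has been scored so far
    first = {}
    for team, xg in chances:
        # This chance is the opening goal only if all earlier ones were missed.
        first[team] = first.get(team, 0.0) + still_scoreless * xg
        still_scoreless *= 1.0 - xg
    return first, still_scoreless


# Hypothetical input: 13 Sevilla and 12 United chances in alternating order,
# each assumed to be worth 0.05 xG. These are NOT the real per-chance values.
teams = ["Sevilla" if i % 2 == 0 else "United" for i in range(25)]
flat_chances = [(team, 0.05) for team in teams]

first, scoreless = first_goal_probabilities(flat_chances)
print(first, round(scoreless, 2))
```

With a flat 0.05 per chance the scoreless probability is 0.95 raised to the power 25, or about 0.28. The split between the two sides under these assumptions should not be expected to match the simulation built on the actual chances, because that depends on the true xG and order of each attempt.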

Manchester United vs Sevilla: Simulation of who had the best chance of scoring first

Overall, United were more likely (42%) than Sevilla to have scored the opening goal from the kick-off to the 78th minute, but in the reality of the evening the visitors were by then 2-0 up.

However, it was slightly more likely than not (58%) that either Sevilla led first or the game was still scoreless.

And for a Manchester United side who were widely regarded as superior to Sevilla and had home field advantage, the xG simulations perhaps suggest that Mourinho had failed to balance his side’s risk versus reward for this match.

Mourinho can consider himself slightly unlucky to have fallen behind given the quality and quantity of chances created prior to the opening goal, but conceding first (28%) was hardly an insignificant potential scenario.

Using xG in conjunction with simulations therefore provides a powerful addition to the opinion-based assessments of professional pundits, with the added bonus that xG is a more predictive metric than goals scored or allowed.

Finally, the weight of xG pressure that United were able to bring to bear in the last 16 minutes adds objective support to those who questioned Mourinho’s safety-first tactical approach to the first 70+ minutes. By conceding lots of low grade chances while his own team created little better, he ran a not insubstantial risk of falling behind in the tie.
