In soccer, substitutions are a manager’s main asset to directly influence a match. By strategically substituting players on the field with fresh bench players, the technical staff can introduce new tactics, make crucial adjustments, and possibly shift the momentum in their favour. In this post, we will study how COVID altered substitution dynamics, assess the general impact of substitutions on goal-scoring and explore the realm of predicting impactful player swaps.
Our analysis draws upon publicly available data sourced from fbref.com. We consider the top men’s European leagues that allowed five substitutions per match throughout the 2020/21 and 2021/22 seasons. These leagues include Spain’s LaLiga, Italy’s Serie A, France’s Ligue 1, Germany’s Bundesliga, Portugal’s Primeira Liga, and The Netherlands’ Eredivisie. For each match, the dataset encompasses a report of the main events including substitutions, bookings and goals, a lineup sheet with player information and positions, and a shot record featuring xG data.
Are substitutions done differently pre- vs. post-COVID?
With the resumption of competitions after the COVID-19 outbreak, a significant shift occurred in the rules governing substitutions. The previous allowance of three substitutions per match was expanded to five. However, these five changes still have to be made within three freely chosen substitution windows. This rule change raises the question of how managers have adapted their approach to substitutions. In this post, we investigate two key aspects: the type and timing of substitutions.
To analyze the type of substitutions, we categorize a substitution as neutral if the outgoing and incoming player play in similar positions and as offensive (defensive) if a more attacking (defending) player replaces a more defensive (offensive) one. The offensive and defensive substitutions typically signify a deliberate tactical choice, while injuries, fatigue and bookings are the typical factors that drive neutral substitutions.
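The categorization above can be sketched in a few lines. This is only an illustration: the position codes and their ordering from most defensive to most attacking are assumptions, and the actual analysis may use a finer-grained position scale.

```python
# Hypothetical ordering of position lines, from most defensive to most attacking.
POSITION_RANK = {"GK": 0, "DF": 1, "MF": 2, "FW": 3}


def classify_substitution(outgoing_pos: str, incoming_pos: str) -> str:
    """Return 'offensive', 'defensive' or 'neutral' for a single player swap."""
    diff = POSITION_RANK[incoming_pos] - POSITION_RANK[outgoing_pos]
    if diff > 0:
        return "offensive"  # a more attacking player comes on
    if diff < 0:
        return "defensive"  # a more defensive player comes on
    return "neutral"        # like-for-like swap


print(classify_substitution("MF", "FW"))  # offensive
print(classify_substitution("FW", "DF"))  # defensive
print(classify_substitution("DF", "DF"))  # neutral
```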
In Table 1, we examine the impact of the increased number of substitution opportunities on the type of substitutions. It is noteworthy that despite the rise in overall substitutions after the rule change, the absolute number of non-neutral (i.e., tactical) subs initially remained relatively stable. However, the percentage of tactical subs decreased, primarily due to an increase in substitutions motivated by factors such as fatigue and injuries. Only in the 2021/22 season, with clubs having had time to adapt to the new rule, did we witness an increase in the absolute number of tactical changes compared to the pre-COVID seasons.
Season | Offensive subs percentage | Defensive subs percentage | Neutral subs percentage | Non-neutral subs per match | Subs per match |
---|---|---|---|---|---|
2018/19 | 13.4% | 11.2% | 75.4% | 0.72 | 2.93 |
2019/20 pre-COVID | 12.8% | 11.4% | 76.2% | 0.70 | 2.87 |
2019/20 post-COVID | 8.6% | 9.4% | 82.0% | 0.77 | 4.29 |
2020/21 | 8.5% | 8.1% | 82.8% | 0.72 | 4.17 |
2021/22 | 12.9% | 11.3% | 75.6% | 1.1 | 4.33 |
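The "Non-neutral subs per match" column appears to be the share of offensive and defensive subs multiplied by the average number of subs per match. The sketch below checks that reading against two rows of Table 1. The function name is hypothetical, and the relationship is inferred from the table rather than stated by the authors.

```python
def non_neutral_per_match(offensive_pct: float, defensive_pct: float,
                          subs_per_match: float) -> float:
    """Tactical (non-neutral) subs per match, derived from the percentage columns."""
    tactical_share = (offensive_pct + defensive_pct) / 100
    return round(tactical_share * subs_per_match, 2)


# Rows taken from Table 1.
print(non_neutral_per_match(13.4, 11.2, 2.93))  # 2018/19 -> 0.72
print(non_neutral_per_match(8.6, 9.4, 4.29))    # 2019/20 post-COVID -> 0.77
```

Read this way, the table shows why the percentage of tactical subs can fall while their absolute number per match stays roughly flat: the denominator grew from about 2.9 to about 4.3 subs per match.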
Table 2 shows the average minute of the match (± standard deviation) at which each substitution is made. For the seasons with five subs per match allowed, we counted only the substitution windows used in the second half, as first-half subs are typically motivated by extraordinary circumstances. With the rule change, substitutions happen a bit later and in more compact time windows, but the effect is diminished for the later substitutions.
Rule | First sub (min) | Second sub (min) | Third sub (min) |
---|---|---|---|
3-subs | 60 ± 10 | 74 ± 8 | 83 ± 7 |
5-subs | 65 ± 7 | 77 ± 7 | 85 ± 5 |
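A table like this could be produced along the following lines. This is a minimal sketch assuming each team-match is reduced to a list of substitution minutes; the input format and the rule that subs in the same minute share one window are assumptions, and the minutes in the usage example are made up.

```python
from statistics import mean, stdev


def window_timing(team_matches, max_windows=3, second_half_only=True):
    """Mean and standard deviation of the minute of each substitution window.

    team_matches: one list of substitution minutes per team and match.
    Returns one (mean, stdev) tuple per window, or None with fewer than 2 samples.
    """
    per_window = [[] for _ in range(max_windows)]
    for minutes in team_matches:
        # Assumption: subs made in the same minute belong to the same window.
        windows = sorted(set(minutes))
        if second_half_only:
            windows = [m for m in windows if m > 45]
        for i, minute in enumerate(windows[:max_windows]):
            per_window[i].append(minute)
    return [(mean(w), stdev(w)) if len(w) > 1 else None for w in per_window]


# Made-up minutes for three team-matches, for illustration only.
example = [[60, 60, 75, 85], [30, 65, 80], [70, 70, 82, 82, 90]]
for i, stats in enumerate(window_timing(example), start=1):
    print(i, stats)
```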
How does the generated xG evolve after substitutions?
The second half of soccer matches often features a surge in goal-scoring opportunities, accounting for approximately 55% of goals and xG. While it is commonly believed that substitutions play a role in this increase, it is hard to prove whether substitutions are the cause or whether play simply becomes more open as teams intensify their pursuit of a favorable result while time runs out.
To shed light on this matter, instead of comparing the first and second halves, we analyzed the scoring intensity after each substitution. Figure 1 shows the evolution of average xG per 90 minutes for teams after each of the three substitution windows in the 2020/21 and 2021/22 seasons, on a league-by-league basis. With an increase in xG after the first and second substitution windows, we observe that substitutions generally have a positive effect on scoring probability. However, the overall scoring chances decrease after the third and last substitution window. These substitutions usually happen in the last minutes of the match, where the opposition has tactically adjusted to the previous subs, and they are frequently used for time-wasting purposes.
![xg per sub](/sports/static/26423a0cb5f06f9c04ccc84016d39ae0/4ceba/xg_per_sub.png)
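To make the quantity in Figure 1 concrete, here is a rough sketch of how xG per 90 minutes after each substitution window might be computed for one team in one match. It assumes that the interval "after" a window runs until the next window or the end of the match, and it ignores stoppage time. Both are simplifying assumptions, not details taken from the analysis.

```python
def xg_per90_after_windows(window_minutes, shots, match_end=90):
    """xG per 90 minutes in the interval following each substitution window.

    window_minutes: sorted minutes of one team's substitution windows.
    shots: list of (minute, xg) tuples for the same team.
    """
    bounds = list(window_minutes) + [match_end]
    rates = []
    for start, end in zip(bounds[:-1], bounds[1:]):
        duration = end - start
        if duration <= 0:
            rates.append(None)  # window at or after the assumed end of the match
            continue
        xg = sum(x for minute, x in shots if start < minute <= end)
        rates.append(90 * xg / duration)  # normalize to a full match
    return rates


# Made-up shots for illustration: (minute, xG).
shots = [(50, 0.1), (65, 0.3), (80, 0.2), (88, 0.05)]
print(xg_per90_after_windows([60, 75, 85], shots))
```

Averaging these rates over all team-matches in a league would give one point per window, which is the kind of value plotted per league in Figure 1.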
What is the performance of starting vs. substitute players?
To assess the performance of players, we leveraged the detailed shot information data on fbref.com, which includes xG and Shot Creating Actions (SCAs) comprising the two previous actions by teammates leading up to the shot. As an estimate of a player’s total goal-scoring contribution, we compute his expected value (xV) as the sum of the xG of his shots and SCAs. It is important to acknowledge that xV provides a more limited evaluation compared to more advanced metrics that require full event data. Nonetheless, Figure 2 shows the average xV generated by midfielders and forwards, depending on the minutes played and differentiated by starting XI and substitutes.
![start vs sub](/sports/static/86aabacafc2d55a8e44f5536b61fa0db/2e699/start_vs_sub.png)
Figure 2 reveals an intriguing finding that might seem counterintuitive at first glance. Substitutes tend to generate more xV than players in the starting XI. Even though we expect the better players to start the match, the context in which substitutes enter the match affects their xV performance. Substitutes often enter in the second half when starters are more fatigued and the match becomes more open, thereby also increasing the number of goal-scoring opportunities. Additionally, the analysis reveals that the xV per minute does not decay with time, indicating that players maintain a consistent ability to generate xV throughout the match, with similar levels of productivity observed in both the first and last minutes. This suggests that any decline in physical performance is compensated by a generally higher goal-scoring intensity.
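The xV metric used in this section can be sketched as follows. The shot record layout (a `shooter`, an `xg` value, and up to two `sca_players`) is a hypothetical simplification of the fbref shot table, and the player names are placeholders.

```python
from collections import defaultdict


def expected_value(shots):
    """xV per player: xG of own shots plus xG of shots the player helped create."""
    xv = defaultdict(float)
    for shot in shots:
        xv[shot["shooter"]] += shot["xg"]
        # Up to two teammates are credited with a Shot Creating Action.
        for player in shot["sca_players"]:
            xv[player] += shot["xg"]
    return dict(xv)


# Placeholder data for illustration only.
shots = [
    {"shooter": "A", "xg": 0.30, "sca_players": ["B", "C"]},
    {"shooter": "B", "xg": 0.10, "sca_players": ["A"]},
]
print(expected_value(shots))
```

Dividing each player's xV by his minutes played gives the per-minute rate compared between starters and substitutes.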
Can we predict whether a substitution will be useful?
To predict the impact of substitutions, we look at the expected goals difference (xGD), a metric that provides insights into the scoring chances generated by both teams and thus indicates match momentum. Based on xGD, we classified substitutions as useful or not depending on the change in xGD after the substitution was made. Specifically, we label a substitution as positive if the xGD increased by 0.5 or more, negative if the xGD decreased by at least 0.5, and neutral otherwise. We describe each substitution using three types of features. The match-specific features are the goal difference and xGD when the substitution is made. The substitution-specific features are the type of substitution (offensive, defensive, neutral) and the minute of the substitution. Finally, the team-specific features are each team’s ELO rating and home or away status.
![predict method](/sports/static/561a72c860970fb86e04c47791bd23bb/3a6ce/predict_method.png)
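The labelling rule reduces to a simple threshold on the change in xGD. In this sketch, how the "before" and "after" xGD values are measured (for example, over which time span) is left open, since the post does not specify it. The function and argument names are illustrative.

```python
def label_substitution(xgd_before: float, xgd_after: float,
                       threshold: float = 0.5) -> str:
    """Label a substitution by the change in the team's xG difference."""
    change = xgd_after - xgd_before
    if change >= threshold:
        return "positive"
    if change <= -threshold:
        return "negative"
    return "neutral"


print(label_substitution(-0.25, 0.50))  # positive (change of +0.75)
print(label_substitution(0.50, 0.25))   # neutral  (change of -0.25)
print(label_substitution(0.75, 0.00))   # negative (change of -0.75)
```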
After experimenting with a set of models, we achieved the best performance using a kNN model. We use data prior to the current season as training data and tune the value of k using 10-fold cross-validation. On the ongoing 2022/23 season, our model achieves an accuracy of 74%, whereas simply predicting the most common class (neutral) would yield an accuracy of 55%.
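For readers unfamiliar with the approach, here is a dependency-free sketch of a kNN classifier whose k is tuned with k-fold cross-validation. It is not the model used in this post: the Euclidean distance, the candidate values of k, and the assumption that features are already numeric and on comparable scales are all illustrative choices.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k training points closest to x (Euclidean distance)."""
    nearest = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def cv_accuracy(X, y, k, n_folds=10, seed=0):
    """Mean accuracy of kNN over n_folds cross-validation folds."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::n_folds] for f in range(n_folds)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        train_X = [X[i] for i in train]
        train_y = [y[i] for i in train]
        hits = sum(knn_predict(train_X, train_y, X[i], k) == y[i] for i in fold)
        scores.append(hits / len(fold))
    return sum(scores) / len(scores)


def tune_k(X, y, candidates=(1, 3, 5, 7, 9), n_folds=10):
    """Pick the candidate k with the highest cross-validated accuracy."""
    return max(candidates, key=lambda k: cv_accuracy(X, y, k, n_folds))
```

In practice, a library such as scikit-learn (`KNeighborsClassifier` combined with `GridSearchCV`) would be the more common choice, and categorical features such as the substitution type would need to be encoded and scaled before computing distances.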
Taking the Barcelona vs Atlético de Madrid game on 23rd April 2023 as an example, we can see the predicted effect of the subs. In the 61st minute, Xavi made a double substitution, with Pedri and Eric Garcia coming in for Marcos Alonso and Ferran Torres. This defensive change was predicted to have a negative impact, but the scoreline did not change after the substitutions. Cholo Simeone tried various substitutions to come back, but only the offensive double change in the 59th minute was predicted to be a favorable substitution. Indeed, most of Atlético’s xG was generated after this sub. They had 0.3 xG (on 6 shots) before the 58th minute and 1.0 afterwards (on 7 shots), with most of this coming from an attempt by Griezmann with an xG of 0.51. The neutral substitutions in the 67th and 79th minutes were predicted to be inconsequential for the result of the match.
![prediction example](/sports/static/77827f143eabbd58d0edc73068ecfdce/75486/prediction_example.png)
Conclusion
In this study we wanted to evaluate substitutions and their impact on the game. Since the COVID outbreak, regulatory changes around substitutions have provided managers more flexibility and strategic options. We have seen how these changes have affected the number of tactical interventions per match and altered the timing of substitutions.
One notable effect of substitutions that emerged from our analysis was their impact on goal scoring. We observed a higher rate of xV by substitutes, indicating their ability to contribute significantly to scoring opportunities. Furthermore, there was a notable increase in xG after the first and second substitution windows, underscoring the positive effect of fresh legs and tactical adjustments on a team’s scoring potential.
Building on these insights, we trained machine learning models to predict the outcome of substitutions. Remarkably, our models accurately predicted the impact of almost three out of every four substitutions.
Overall, our analysis underscores the significance of substitutions and their ability to shape the outcome of a soccer match. By embracing a data-driven approach, teams can extract the maximum potential from their squad.
Further Reading
This blog post is based on Joan Hernanz i Ibáñez’s bachelor’s degree thesis. The full text is available online.
Furthermore, the articles below also delve into the statistical analysis of substitutes and provide further insights and perspectives on the impact they can have in soccer:
- “Smart Use of Substitutes Can Make a Difference” by Colin Trainor (StatsBomb): https://statsbomb.com/articles/soccer/smart-use-of-substitutes-can-make-a-difference/
- “Bundesliga Teams Could Be Using More Substitutions, But They Aren’t” by Ryan O’Hanlon (FiveThirtyEight): https://fivethirtyeight.com/features/bundesliga-teams-could-be-using-more-substitutions-but-they-arent/
- “The right man in the right place? Substitutions and goal-scoring in soccer” by Simon Amez, Brecht Neyt, Frederik Van Nuffel and Stijn Baert: https://www.sciencedirect.com/science/article/pii/S1469029221000169