Player Vectors: Characterizing Soccer Players' Playing Style

Jesse, Tom, Pieter
November 12th, 2019 · 6 min read

Welcome to the DTAI Sport Analytics lab’s blog!

In our first blog post, we give an intuitive explanation of our recent ECML PKDD 2019 paper on characterizing soccer players’ style of play based on event stream data, provide illustrative use cases and release a small interactive demo.

A player’s style of play is a recurring concept when discussing soccer, as each player approaches the game in a different way. Eden Hazard, for example, likes to drop deep into midfield from the wide left wing position to find open space and then use his excellent dribbling skills to run at the defense. This is in contrast to other wingers such as Mo Salah and Sadio Mané, who prefer to receive the ball behind the lines and exploit the space with well-timed runs.

Hazard's goal against West Ham is a nice illustration of his typical style of play. He picks the ball up close to the halfway line, drives infield past a couple of defenders and then scores with precision and style.

Fans often form affinities for players based on their perceived style of play. Similarly, teams are interested in identifying players that fit their tactical vision. Unfortunately, playing style is an inherently subjective and nebulous concept, which makes it difficult to capture in a data-driven manner. We believe that a player’s playing style arises from the interplay between his skills and the tactics employed by the team he plays for. A good manager should seek to accentuate a player’s strengths by placing him in positions that enable him to perform the actions that he is best at. Therefore, one way to characterize style of play is to look at a player’s behavior during a match in terms of which actions he prefers to perform in various locations on the pitch. Consequently, we can gain insights into playing style if we can capture the action-location relationship for each player. Doing so requires addressing the following two questions:

  1. Which actions should we consider in our analysis?
  2. How can we model the relationship between the areas of the pitch a player tends to occupy and the actions he tends to perform in each of them?

Selecting actions to analyze

Our playing style analysis focuses on four types of offensive actions performed during open play: shots, passes, crosses and dribbles. We select these actions for three reasons. First, these actions tend to be performed proactively, whereas actions like tackles are more reactive. Second, defense is primarily about positioning, and analyzing it requires camera-tracking data, which we do not have access to. Third, by only considering open-play actions we exclude set pieces (e.g., free kicks and throw-ins), which tend to be performed by predefined players such as set-piece specialists for free kicks or fullbacks for throw-ins. While set pieces are extremely interesting and relevant, we believe they would require a separate analysis.

Modeling the action-location relationship

To capture where on the pitch a player tends to perform each action, we build a heatmap for each player and action type in three steps:

  1. Counting. We overlay a grid of size m×n on the pitch. We then count the number of actions of this type that the player performed in each cell.
  2. Normalizing. As raw action counts depend on the number of minutes played, we normalize the counts to per-90-minute averages.
  3. Smoothing. The count of each cell is replaced by a weighted average of its count and the counts in its neighboring cells. This promotes spatial coherence in the locations where the actions were performed, which may have been disrupted by the abrupt and arbitrary boundaries created by overlaying a grid on the pitch.
[Figure: three heatmap panels, from left to right: 1. Counting, 2. Normalizing, 3. Smoothing]
The first panel shows the raw-count matrix for Eden Hazard's dribbles, which is subsequently normalized for the number of minutes played in the second panel. The third panel shows the final smoothed version. Notice how the raw counts heatmap is spikier as there are abrupt differences between neighboring cells. In contrast, there are smoother transitions on the right. Darker colors correspond to higher values.
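The three steps can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the code behind the figure: the pitch dimensions (105 by 68 meters), the default grid size and the 3×3 smoothing weights are all hypothetical choices, since the post only specifies an m×n grid and a weighted average over neighboring cells.

```python
import numpy as np

# Assumed pitch size in meters; the post does not state the coordinate system.
PITCH_LENGTH, PITCH_WIDTH = 105.0, 68.0


def count_actions(xs, ys, shape=(50, 50)):
    """Step 1: count the actions that fall in each cell of an m x n grid.

    The default grid shape is a hypothetical choice.
    """
    counts, _, _ = np.histogram2d(
        xs, ys, bins=shape, range=[[0, PITCH_LENGTH], [0, PITCH_WIDTH]]
    )
    return counts


def normalize_per_90(counts, minutes_played):
    """Step 2: rescale raw counts to per-90-minute averages."""
    return counts * 90.0 / minutes_played


def smooth(heatmap):
    """Step 3: replace each cell by a weighted average of itself and its
    eight neighbors. The weights below are an assumed 3x3 kernel."""
    kernel = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]], dtype=float)
    kernel /= kernel.sum()
    # Repeat border values so cells at the edge of the pitch have neighbors.
    padded = np.pad(heatmap, 1, mode="edge")
    rows, cols = heatmap.shape
    smoothed = np.zeros((rows, cols), dtype=float)
    for di in range(3):
        for dj in range(3):
            smoothed += kernel[di, dj] * padded[di:di + rows, dj:dj + cols]
    return smoothed


# Usage with made-up dribble locations for a player with 180 minutes played.
raw = count_actions([11.0, 12.0, 92.0], [30.0, 30.2, 5.0], shape=(21, 17))
heatmap = smooth(normalize_per_90(raw, minutes_played=180))
```

The smoothing step spreads each count over adjacent cells, so two actions that happen to fall on either side of a grid boundary still end up contributing to the same region of the heatmap.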

Working with the full heatmaps for each player is cumbersome: they are quite large, which makes it difficult to glean insights into player behavior. Hence, we need to compress this fine-grained representation into something more coarse-grained without losing too much information. We do this by applying non-negative matrix factorization (NMF) to the heatmaps. Intuitively, the effect of NMF is the following:

  1. It groups or clusters similar locations on the pitch in a data-driven manner. Here, the similarity between locations is determined by how often a specific action tends to be performed in them.
  2. It assigns a positive real-valued number to each player for every cluster constructed in the first step. This value captures how often he performs a considered action in that cluster. Higher numbers equate to more often.
[Figure: NMF × Players, shown for Messi, Ronaldo, Lewandowski, Lukaku, Salah, Modric and De Bruyne]
When applied to shots, the NMF procedure decomposes the heatmaps into four meaningful clusters (close shots, far shots, shots from the left and shots from the right) and assigns a number to each player for every cluster that captures how often he performs a shot in that cluster.
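To make the decomposition concrete, here is a minimal NMF sketch. It assumes each player's smoothed heatmap has been flattened into one row of a non-negative matrix `V`, and it uses the standard multiplicative update rules for the squared error. The post does not say which solver was used, so the function name `nmf`, the iteration count and the initialization are illustrative assumptions.

```python
import numpy as np


def nmf(V, n_components, n_iter=500, seed=0, eps=1e-9):
    """Approximate a non-negative matrix V (players x cells) as W @ H.

    W (players x components) holds each player's weight per cluster.
    H (components x cells) holds the spatial pattern of each cluster.
    Uses multiplicative updates, which keep W and H non-negative.
    """
    rng = np.random.default_rng(seed)
    n_players, n_cells = V.shape
    W = rng.random((n_players, n_components)) + eps
    H = rng.random((n_components, n_cells)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Usage with a toy matrix: 6 "players", 4 "cells", 2 underlying patterns.
toy_weights = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]], dtype=float)
toy_patterns = np.array([[3, 0, 1, 2], [0, 2, 1, 1]], dtype=float)
V = toy_weights @ toy_patterns
W, H = nmf(V, n_components=2)
```

Reshaping a row of `H` back to the grid shape gives a heatmap of that cluster, and a row of `W` is the per-cluster summary of one player for this action type.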

This procedure is applied to each of the four action types, which yields 4 shot clusters, 4 cross clusters, 5 dribble clusters and 5 pass clusters. Finally, we construct a player vector for each player by concatenating his values for all clusters, which gives a vector of length 4 + 4 + 5 + 5 = 18.
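The concatenation step might look as follows. The dictionary layout, the ordering of action types and the function name `build_player_vector` are hypothetical; only the cluster counts per action type come from the text above.

```python
import numpy as np

# Number of clusters per action type, as stated in the post.
COMPONENTS_PER_ACTION = {"shot": 4, "cross": 4, "dribble": 5, "pass": 5}


def build_player_vector(weights_by_action):
    """Concatenate one player's cluster weights for all action types.

    weights_by_action maps an action type to that player's weights for the
    clusters of that action type (an assumed input format).
    """
    parts = []
    for action, n_clusters in COMPONENTS_PER_ACTION.items():
        weights = np.asarray(weights_by_action[action], dtype=float)
        if weights.shape != (n_clusters,):
            raise ValueError(f"expected {n_clusters} weights for {action!r}")
        parts.append(weights)
    return np.concatenate(parts)


# Usage with made-up weights.
example_vector = build_player_vector(
    {
        "shot": [0.2, 0.0, 0.1, 0.4],
        "cross": [0.0, 0.3, 0.1, 0.0],
        "dribble": [0.5, 0.1, 0.0, 0.2, 0.3],
        "pass": [1.0, 0.4, 0.2, 0.0, 0.6],
    }
)
```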

What can you do with this analysis?

Transforming a nebulous concept like playing style into a compact, mathematical yet interpretable representation enables many interesting applications. Below we illustrate how player vectors can be used to scout a replacement for Barcelona’s midfielder Sergio Busquets and to monitor the evolution of Kevin De Bruyne into a deep-lying playmaker, and we outline what makes Messi the GOAT. In our interactive demo you can explore for yourself how these use cases apply to twenty other high-profile players.

Scouting

Sergio Busquets has fulfilled a pivotal role at the heart of Barcelona’s iconic midfield and hasn’t dipped below 31 La Liga appearances since the 2010/11 campaign. However, age is catching up with the Spaniard and his most recent performances have shown that he is not on the level that he once was. Therefore, it is reported that Barcelona will be signing a younger defensive midfielder next summer.

We can quantify the similarity between two players as the Manhattan distance between their player vectors. As such, we identify three young players with a similar playing style to Busquets: Rodrigo Hernández, Wylan Cyprien and Marc Roca. The first one comes as no surprise. Rodrigo is known as “the new Busquets”. Actually, Barcelona allegedly competed for Rodri’s signature, but it was Guardiola’s Manchester City who eventually acquired him last summer.
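As a small sketch of this similarity search, the Manhattan distance between two player vectors is the sum of the absolute differences of their components, and the most similar players are those with the smallest distance to the target. The function names and the placeholder players below are made up for illustration and the vectors are shortened to three components.

```python
import numpy as np


def manhattan_distance(u, v):
    """Sum of absolute component-wise differences between two vectors."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return float(np.abs(u - v).sum())


def most_similar(target, candidates, top_k=3):
    """Return the top_k (name, distance) pairs closest to the target vector.

    candidates maps a player name to his player vector.
    """
    distances = [
        (name, manhattan_distance(target, vector))
        for name, vector in candidates.items()
    ]
    return sorted(distances, key=lambda pair: pair[1])[:top_k]


# Usage with placeholder players and made-up, shortened vectors.
target_vector = [1.0, 0.0, 2.0]
pool = {
    "player_a": [1.0, 0.5, 2.0],
    "player_b": [3.0, 0.0, 0.0],
    "player_c": [1.0, 0.0, 1.0],
}
closest = most_similar(target_vector, pool, top_k=2)
```

A smaller distance means a more similar style, so filtering the candidate pool by age or position before ranking is a natural way to turn this into a scouting shortlist.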

Espanyol midfielder Roca is compared with Busquets regularly too. His passing and dribbling are extremely similar to Busquets’. Perhaps more surprising is Cyprien, who alternates between offensive and defensive midfield positions at Nice. He complements Busquets’s playing style with a great long-distance shooting ability.

Finding a replacement for Sergio Busquets

Player development

Kevin De Bruyne perhaps best exemplifies what is expected of a modern deep-lying playmaker. He drops deep to help build from the back. He drifts wide to find space and swings crosses behind the back line. However, he hasn’t always played like this. As a younger player, he mostly played in a wide position, on the right or the left. When he arrived at Wolfsburg in the winter of 2014, he started to operate more in the number 10 role, allowing him to be the creative hub of the team. It is only since the 2017/18 season that Guardiola has used De Bruyne as a deep-lying midfielder.

This evolution is clearly visible in De Bruyne’s player vectors. Over the years, the vector components that correspond to flank dribbles and backline crosses have decreased consistently. In contrast, De Bruyne now shoots more often from distance and performs more central passes.

Kevin De Bruyne evolved from a 10 to a deep-lying playmaker

Combining player vectors and VAEP

In earlier research, we developed a model to value a player’s impact on the game. The model generates a rating called VAEP (Valuing Actions by Estimating Probabilities), which indicates how a specific player’s actions contribute to the game score. This way, you can evaluate a player’s performance and his contribution to the team in greater detail. Using these ratings, we proved once and for all that Messi is the real GOAT.

Now, we can link these ratings with Messi’s playing style. To do so, we no longer count the number of actions in each grid cell, but instead sum the ratings of all actions in each cell. This way, we can capture the action-location-performance relationship of Messi’s playing style. In the figure below, we again compare Messi against Ronaldo. On the left we show the original player vectors, while the player vectors on the right are based on the summed VAEP ratings.
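Summing ratings instead of counting actions is a small change to the counting step: each action contributes its rating as a weight. The sketch below assumes a 105 by 68 meter pitch, a hypothetical grid size and made-up rating values; it does not compute VAEP itself.

```python
import numpy as np

# Assumed pitch size in meters; the post does not state the coordinate system.
PITCH_LENGTH, PITCH_WIDTH = 105.0, 68.0


def summed_rating_heatmap(xs, ys, ratings, shape=(50, 50)):
    """Sum the rating of every action that falls in each grid cell.

    ratings holds one precomputed value per action (for example a VAEP
    rating); the default grid shape is a hypothetical choice.
    """
    heatmap, _, _ = np.histogram2d(
        xs,
        ys,
        bins=shape,
        range=[[0, PITCH_LENGTH], [0, PITCH_WIDTH]],
        weights=ratings,
    )
    return heatmap


# Usage with made-up locations and ratings.
rated = summed_rating_heatmap(
    [11.0, 12.0, 92.0], [30.0, 30.2, 5.0], [0.10, 0.05, -0.02], shape=(21, 17)
)
```

Because an action's rating can be negative, a cell's sum can be negative as well, which matters if the result is fed into a factorization that requires non-negative input. How that case is handled is not described in this post.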

What makes Messi the GOAT

A first obvious insight is that Ronaldo scores better on components that are associated with the left side of the field, while Messi scores better on components associated with the right side of the field. That is because Ronaldo operates mainly from the left, while Messi operates more from the right. Nevertheless, in terms of crossing, Ronaldo is clearly creating more danger than Messi.

In all other offensive components, Messi excels. Based on the number and quality of center dribbles and passes he performs, Messi has the playing style of a world-class creator. No other player bags more assists or creates more space and threat by dribbling. Yet, what makes Messi truly unique is that he combines this playing style of a creator with superior shooting skills. Relatively speaking, Messi does not shoot on goal that often. Ronaldo, who has the playing style of a finisher, tends to shoot on goal more often, especially from close range. Yet, Messi’s shots are of much higher quality. In short, Messi has the style of a creator with the skills of a finisher.
