Serie A: The Statistical Projection for the Winner of 2017/18 Season

A team’s formation is one of the most important factors determining its attacking and defending effectiveness in contemporary football. This research studies the importance of formation in the Italian Serie A over the last 7 seasons and forecasts the winner of the 2017/18 championship.

ABSTRACT

The research focuses on the battle for the Italian Serie A league title in the current season, 2017/2018. The two teams fighting for the title are Juventus FC and Napoli FC. As a test factor, the research uses the formations deployed by these two teams’ opponents over the last 7 seasons (266 matches for each team). The sample also contains the goals scored and conceded in each match. The data is used to determine the impact of the opponent’s formation on these two clubs.

The research includes four parts:

  • The first part determines the level of association between the goals scored and the formation of each opponent over the last 7 seasons.
  • The second part tests the level of association between the goals conceded and the opponent’s formation on the pitch over the last 7 seasons.
  • The third part uses descriptive statistics to generate the probabilities that are used in the final part of this research.
  • The final part uses these probabilities to simulate the 2017/2018 season. The ultimate goal is to predict the winner of the title based on how effective these two teams were against various formations over the last 7 seasons.

LIMITATIONS

There are some limitations that the researcher took into consideration to prepare this study:

  • Any change of formation that might have been applied during a game is not considered. The sample contains only the starting formations for each game.
  • Napoli FC’s own formation does not affect the results of the research and has been excluded from the test. The club deployed the 4-3-3 formation in 258 out of 266 matches (97%), so it is considered fixed.
  • Juventus FC’s own formation does not affect the results of the research and has been excluded from the test. The club deployed the 3-5-2 formation in 231 out of 266 matches (87%), so it is considered fixed.
  • The simulation of this season is based exclusively on the results that these two teams achieved over the last 7 seasons. Consequently, the generated probabilities are positive only for goal counts that were actually scored or conceded in these seasons. For example, if Napoli FC never scored 5 goals against a 4-5-1 in these 7 seasons, the probability of Napoli FC scoring 5 goals against a 4-5-1 in the simulated season is 0.

PART I: FORMATION – SCORING GOALS ASSOCIATION

This part examines whether the opponent’s formation on the pitch affected the scoring capability of either Napoli FC or Juventus FC over the last 7 seasons.

The graph above illustrates each team’s average scoring efficiency against each formation it faced over the last 7 seasons. Napoli scored more goals on average against the 4-4-2, 4-5-1 and 4-3-3 formations, while Juventus scored significantly more goals against the 3-4-3 formation. The “Bianconeri” also scored slightly more goals against the 5-4-1 formation.

Napoli’s favorite formation to play against was the 5-4-1, with 2.24 goals per match on average (47 goals in 21 matches), and its least favorite was the 4-5-1, with a scoring efficiency of 1.63 goals per match (129 goals in 79 matches).

Juventus was most productive against the 3-5-2 formation, with 2.33 goals per match (28 goals in 12 matches), and least productive against the 4-5-1, with only 1.40 goals per match (108 goals in 77 matches).
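The per-formation averages quoted above are simply goals divided by matches. A minimal Python sketch that recomputes three of them from the figures in the text follows; the dictionary layout and the helper name `goals_per_match` are illustrative choices, not part of the original study's code.

```python
# (goals, matches) per opponent formation, as quoted in the text above.
napoli_scored = {"5-4-1": (47, 21), "4-5-1": (129, 79)}
juventus_scored = {"3-5-2": (28, 12)}


def goals_per_match(goals, matches):
    """Average goals per match against one opponent formation."""
    return goals / matches


for team, table in (("Napoli", napoli_scored), ("Juventus", juventus_scored)):
    for formation, (goals, matches) in table.items():
        average = goals_per_match(goals, matches)
        print(f"{team} vs {formation}: {average:.2f} goals per match")
```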

To test whether the opponent’s deployed formation is associated with the goals scored, the researcher used the χ² test of association. The null hypothesis states that the two variables are not associated, and the significance level is 0.05. The result of this test for Napoli FC is the following:

The p-value was estimated at 0.01363, which is lower than 0.05, so the null hypothesis can be rejected. This result means that the opponent’s formation and the goals scored by Napoli FC were statistically associated.

The results of the same test for Juventus FC are the following:

The p-value of the χ² test of association was estimated at 0.03308, which is lower than 0.05, so the null hypothesis is rejected for Juventus too. Consequently, for Juventus FC the goals scored were significantly associated with the opponent’s formation in the sample.

The tests above show that goals scored and the opponent’s formation were significantly associated for both Napoli FC and Juventus FC over the last 7 seasons. This statistical insight is important for the final part, which includes the simulation code.
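For readers who want to reproduce this kind of test, the following is a minimal sketch of a χ² test of association in plain Python. The contingency table is hypothetical (rows are opponent formations, columns are matches with 0, 1, 2 and 3+ goals scored) and is not the study's data. The function names are illustrative, and the p-value is computed with a standard series expansion of the incomplete gamma function rather than with the tool the researcher used.

```python
import math


def chi_square_statistic(observed):
    """Return (statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (count - expected) ** 2 / expected
    dof = (len(observed) - 1) * (len(observed[0]) - 1)
    return stat, dof


def chi_square_p_value(stat, dof):
    """Upper-tail probability of the chi-square distribution.

    Computes 1 - P(a, x), where P is the regularized lower incomplete
    gamma function with a = dof / 2 and x = stat / 2 (series expansion).
    """
    a, x = dof / 2.0, stat / 2.0
    if x <= 0:
        return 1.0
    term = 1.0 / a
    series = term
    n = 0
    while abs(term) > 1e-15 * abs(series):
        n += 1
        term *= x / (a + n)
        series += term
    lower = series * math.exp(-x + a * math.log(x) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


# Hypothetical counts of matches: rows = formations, columns = 0, 1, 2, 3+ goals.
observed = [
    [10, 25, 20, 9],   # 4-4-2
    [22, 30, 18, 9],   # 4-5-1
    [8, 20, 22, 14],   # 4-3-3
]

stat, dof = chi_square_statistic(observed)
p_value = chi_square_p_value(stat, dof)
print(f"chi2 = {stat:.3f}, dof = {dof}, p = {p_value:.5f}")
print("reject null hypothesis" if p_value < 0.05 else "fail to reject null hypothesis")
```

With real data, the null hypothesis of no association is rejected when the printed p-value falls below the 0.05 significance level, as in the two tests reported above.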

PART II: FORMATION – CONCEDING GOALS ASSOCIATION

This part examines the association between the opponent’s deployed formation and the goals that the two clubs (Napoli FC and Juventus FC) conceded in each game over the last 7 seasons.

The graph above shows the average goals conceded by Napoli FC and Juventus FC according to the formation deployed by the opponent. According to the graph, Napoli FC conceded more goals on average against the 3-4-3, 3-5-2 and 5-4-1 formations, while it defended much better against the 4-4-2, 4-5-1 and 4-3-3. Juventus FC was much more efficient against the 4-4-2, 4-5-1 and 4-3-3 formations, while it suffered against the 3-4-3, 3-5-2 and 5-4-1 formations.


Napoli FC defended best against the 5-4-1 formation, conceding only 0.86 goals per match (18 goals in 21 matches). On the other hand, the club suffered when defending against the 3-4-3 formation, conceding 1.70 goals per match (34 goals in 20 matches).

Juventus FC was most efficient when defending against the 4-5-1 formation, conceding only 0.48 goals per match on average (37 goals in 77 matches). Yet when opponents attacked in a 3-4-3 formation, the team conceded 1.23 goals on average (16 goals in 13 matches).

To determine whether the opponent’s deployed formation was associated with the goals conceded, the researcher again used the χ² test of association. The null hypothesis states that the two variables are not associated, and the significance level is 0.05. The result of this test for Napoli FC is the following:


The p-value of the association test for Napoli FC was estimated at 0.02706, which is lower than the significance level of 0.05. Consequently, the null hypothesis was rejected, which means that the goals conceded were associated with the formation that opponents deployed against Napoli FC over the last 7 seasons.

For Juventus FC, the results of the χ² test of association were the following:


The p-value of the test for Juventus FC was calculated at 0.02151, which is lower than 0.05. Consequently, the null hypothesis was rejected, meaning that the goals conceded and the opponent’s formation were significantly associated.

In this part, the researcher concluded that the goals conceded by both Juventus FC and Napoli FC were significantly associated with the opponent’s formation over the last 7 seasons. This statistical insight is also important for the final part, which includes the simulation code.

PART III: THE PROBABILITY GENERATION FOR THE SIMULATION

The simulation code requires two kinds of information: the probabilities of each opponent formation and the probabilities of goals scored and conceded per deployed formation. The data covered 532 matches in total (266 for each team) over the last 7 seasons.

The table below shows the share of matches in which each formation was deployed against Napoli FC and Juventus FC:

Formation   Napoli   Juventus
3-4-3       0.075    0.049
4-4-2       0.241    0.252
4-5-1       0.297    0.289
4-3-3       0.241    0.256
3-5-2       0.068    0.045
5-4-1       0.079    0.109

According to the table above, Napoli FC played most of its matches against teams that used the 4-5-1 formation (79 matches). By contrast, opposing managers rarely deployed their players in the 3-5-2, 5-4-1 and 3-4-3 formations (18, 21 and 20 matches respectively).


Juventus FC also played most of its matches against the 4-5-1 formation (77 matches), while the 3-4-3 and 3-5-2 were rarely used by opposing managers (13 and 12 matches respectively).
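The formation probabilities are match counts divided by the 266 matches per team. The sketch below shows that step in Python. Only some of the counts are quoted in the text (79, 18, 21 and 20 for Napoli; 77, 13 and 12 for Juventus); the remaining counts are inferred here from the published probabilities and should be read as assumptions. The names `FORMATIONS` and `to_probabilities` are illustrative.

```python
FORMATIONS = ["3-4-3", "4-4-2", "4-5-1", "4-3-3", "3-5-2", "5-4-1"]

# Matches faced per formation, in the order of FORMATIONS.
# Counts not quoted in the text are inferred from probability x 266
# and are therefore assumptions.
napoli_counts = [20, 64, 79, 64, 18, 21]
juventus_counts = [13, 67, 77, 68, 12, 29]


def to_probabilities(counts):
    """Convert match counts into empirical probabilities."""
    total = sum(counts)
    return [count / total for count in counts]


for team, counts in (("Napoli", napoli_counts), ("Juventus", juventus_counts)):
    probabilities = to_probabilities(counts)
    print(team, "matches:", sum(counts))
    for formation, probability in zip(FORMATIONS, probabilities):
        print(f"  {formation}: {probability:.3f}")
```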

The same aggregation has been applied to the goals per opponent’s formation. For each formation, the resulting probabilities sum to 1, and they are essential inputs for the simulation code.
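One way to build such per-formation goal probabilities is to count how often each goal total occurred against each formation and normalize within the formation. The sketch below uses a handful of hypothetical match records, not the study's sample, and the function name `goal_distributions` is illustrative. It also shows the limitation noted earlier: a goal total that never occurred against a formation gets probability 0.

```python
from collections import Counter, defaultdict

# Hypothetical match records: (opponent formation, goals scored).
matches = [
    ("4-5-1", 1), ("4-5-1", 2), ("4-5-1", 1), ("4-5-1", 0),
    ("3-4-3", 3), ("3-4-3", 1),
]


def goal_distributions(records):
    """Map each formation to {goals: empirical probability}."""
    counts = defaultdict(Counter)
    for formation, goals in records:
        counts[formation][goals] += 1
    return {
        formation: {
            goals: n / sum(counter.values())
            for goals, n in sorted(counter.items())
        }
        for formation, counter in counts.items()
    }


dists = goal_distributions(matches)
for formation, dist in dists.items():
    print(formation, dist, "sum =", sum(dist.values()))

# A goal total never observed against a formation has probability 0.
print(dists["4-5-1"].get(5, 0.0))
```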

PART IV: THE SIMULATION OF 2017/2018 SEASON

This part describes the simulation code that predicts the winner of the current Italian Serie A season. The probabilities from the previous part serve as the basic input for the projection. Two separate scripts are used: one for Napoli FC and one for Juventus FC.

The code runs the full Serie A season (38 matches) match by match. For every match, it first generates the formation of the opponent. It then estimates the goals scored and the goals conceded. A simple comparison of goals scored and conceded determines the result of each game. The code repeats the 38-match season 1,000,000 times in order to increase the accuracy of the simulation. The simulation code was written in Python.
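The procedure described above can be sketched as a small Monte Carlo simulation. This is not the researcher's script: the formation probabilities are Napoli's values from the table in Part III, but the goal distributions are hypothetical placeholders, and the sketch assumes that goals scored and conceded are drawn independently once the formation is known. It also assumes the usual 3 points for a win and 1 for a draw, and runs fewer seasons than the 1,000,000 used in the study to keep the example fast.

```python
import random

# Opponent-formation probabilities for Napoli FC (table in Part III).
FORMATION_PROBS = {
    "3-4-3": 0.075, "4-4-2": 0.241, "4-5-1": 0.297,
    "4-3-3": 0.241, "3-5-2": 0.068, "5-4-1": 0.079,
}

# Hypothetical {goals: probability} distributions; the study derives a
# separate empirical distribution per formation from 7 seasons of data.
BASE_SCORED = {0: 0.20, 1: 0.30, 2: 0.30, 3: 0.15, 4: 0.05}
BASE_CONCEDED = {0: 0.40, 1: 0.35, 2: 0.20, 3: 0.05}
SCORED = {formation: dict(BASE_SCORED) for formation in FORMATION_PROBS}
CONCEDED = {formation: dict(BASE_CONCEDED) for formation in FORMATION_PROBS}
# Illustrative variation only: weaker defending against a back three.
CONCEDED["3-4-3"] = {0: 0.25, 1: 0.35, 2: 0.25, 3: 0.15}


def draw(rng, distribution):
    """Sample one key from a {value: probability} mapping."""
    values = list(distribution)
    weights = list(distribution.values())
    return rng.choices(values, weights=weights, k=1)[0]


def simulate_season(rng, n_matches=38):
    """Simulate one season and return the points collected."""
    points = 0
    for _ in range(n_matches):
        formation = draw(rng, FORMATION_PROBS)
        scored = draw(rng, SCORED[formation])
        conceded = draw(rng, CONCEDED[formation])
        if scored > conceded:
            points += 3   # win
        elif scored == conceded:
            points += 1   # draw
    return points


def average_points(n_seasons, seed=0):
    """Average points per season over n_seasons simulated seasons."""
    rng = random.Random(seed)
    total = sum(simulate_season(rng) for _ in range(n_seasons))
    return total / n_seasons


print(f"Projected points: {average_points(5_000):.2f}")
```

Averaging over many simulated seasons reduces the sampling noise of the projected points total; the fixed seed only makes the sketch reproducible.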


The code is available here: Napoli FC, Juventus FC


For Napoli FC, the simulation showed that the team will gather approximately 73 points (73.45) by the end of the season. For Juventus FC, the projected total is 85 points (84.85), 12 more than Napoli FC after rounding (11.40 on the unrounded averages). Consequently, the simulation suggests that Juventus is the projected winner of the 2017/2018 season by about 12 points.

RESULTS & DISCUSSION

Juventus FC is projected to finish 12 points ahead of Napoli FC in the final league table this season. Allegri’s team seems to be more consistent when playing against various formations. By contrast, Sarri’s team shows significant discrepancies in attacking and defending efficiency across the formations it has faced in the past. These discrepancies weighed heavily on the projected points that the simulation produced.

This research produced significant statistical insights regarding the two main competitors for this year’s Serie A league title. The simulation showed that Napoli FC faces difficulties when it meets multiple formations. On the other hand, Juventus seems to be more consistent against each possible formation. Consequently, Napoli FC has to work hard on defending effectively against teams that use formations with 3 defenders.

SUMMARY

The research included four major parts. First, the researcher showed that there is a statistically significant association between the opponent’s deployed formation and the goals scored. The second part applied the same test to the goals conceded. For both teams, attack and defense were associated with the opponent’s formation. The third part presented the probabilities and the mechanics used by the simulation code. The final part ran the simulation, which concluded that the discrepancies in Napoli’s attacking and defending will eliminate the team from the race for this season’s Serie A league title.

BIBLIOGRAPHY

ESPN, n.d. 2017-18 SERIE A STANDINGS. [Online]
Available at: https://goo.gl/HnVKeU

Statistics Solutions, n.d. Chi-Square Test of Independence. [Online]
Available at: https://goo.gl/zaTllx

WhoScored, n.d. Napoli Statistics. [Online]
Available at: https://goo.gl/fBrC1W

WhoScored, n.d. Juventus Statistics. [Online]
Available at: https://goo.gl/Ev7C8m
