
The Most Competitive Football League through Data Analysis

The top European football leagues are among the best in the world, thanks to the high level of spectacle, the large budgets and the enormous popularity the clubs gain from worldwide broadcasting. Which of them is the most competitive is a question fans debate all over the globe, and opinions are poles apart.

To argue convincingly about which contemporary league is the most competitive, certain statistics from the top five European leagues (Premier League, La Liga, Bundesliga, Serie A and Ligue 1) need to be processed. The samples used in this research are the point difference between the champion and the runner-up of each league table, the point difference between the champion and the fifth-placed team, and the point difference between the seventeenth and the twentieth team, from 2000 until 2017.
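As a rough illustration of the three samples described above, the sketch below derives them from one season's final points. The function name `point_gaps` and the example table are hypothetical and are not the data used in this research.

```python
def point_gaps(points):
    """Return the three point differences for one season.

    `points` holds the final point totals of a 20-team table, in any order.
    """
    table = sorted(points, reverse=True)  # index 0 is the champion
    if len(table) != 20:
        raise ValueError("expected a 20-team table")
    return {
        "first_vs_second": table[0] - table[1],
        "first_vs_fifth": table[0] - table[4],
        "seventeenth_vs_twentieth": table[16] - table[19],
    }


# Hypothetical final table, for illustration only.
example_table = [90, 84, 80, 75, 70, 66, 62, 58, 55, 52,
                 50, 48, 46, 44, 42, 40, 38, 35, 30, 25]
gaps = point_gaps(example_table)
print(gaps)
```

Sorting by points alone ignores tie-breakers such as goal difference, which does not matter here because only the point totals at each position are compared.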

Then, the following steps are taken:

• The samples are plotted season by season, giving a characteristic picture of each league.

• The standard deviation is used to rank the top leagues by level of competitiveness.

• The samples are divided into two periods, from 2000 to 2008 and from 2009 to 2017. A t-test is then used to examine whether the change between the two periods is statistically significant.

To begin with, one factor complicates the analysis: the Premier League and La Liga have 20 teams, whereas the Bundesliga has 18. Matters are further complicated by the fact that Serie A and Ligue 1 also had 18 teams until 2004 and 2002, respectively. To avoid any statistical confusion, the tables of all leagues need to be normalized. This means that two hypothetical teams, each with the average points of the league table in question, are added to the Bundesliga, and to Serie A and Ligue 1 where necessary, so that all five leagues have the same number of teams.
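The normalization step can be sketched as follows, taking the description above literally. The function name `pad_to_twenty` and the sample table are hypothetical, and keeping the average unrounded is an assumption, since the text does not say how fractional averages are handled.

```python
from statistics import mean


def pad_to_twenty(points, target=20):
    """Add hypothetical teams holding the table's average points
    until the table has `target` entries, then re-sort it."""
    avg = mean(points)  # assumption: the average is kept unrounded
    padded = list(points) + [avg] * (target - len(points))
    return sorted(padded, reverse=True)


# Hypothetical 18-team table, for illustration only.
table_18 = list(range(20, 74, 3))  # 20, 23, ..., 71
table_20 = pad_to_twenty(table_18)
print(len(table_20), table_20)
```

Because the added teams sit at the average, they would normally land mid-table and leave the table's mean unchanged, while shifting the positions of the teams below them by two places.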

Furthermore, it is important to note that point deductions, such as Parma’s in Serie A (2015), the relegation of Juventus as Italian champions because of the Moggi scandal (2006), or any other external factor that affects a league table, are not taken into account. This research aims at a scientific conclusion, which is why the league tables are analyzed based solely on match results.

The first graph shows the point difference between the champion and the runner-up of each league over the years. The Premier League shows a significant drop since 2001, and its average difference is 7.28 points per season. La Liga seems more balanced, with 5.5 points. The Bundesliga has ups and downs, with 8.1 points. The same goes for Serie A, with 7.5 points, and Ligue 1, with 8.5 points. The last three leagues lose ground, as each has at least one season in which the champion won the title by more than 20 points over the runner-up, whereas the largest margin is 18 points in the Premier League and 15 in La Liga.

In addition, the picture looks unstable when the first and the fifth team of each table are compared. In Ligue 1, the difference stayed consistently under 25 points, although it has been growing over the last two seasons. In La Liga, the difference was remarkably high for three years (2009-12), but over the last five years it has returned to regular levels. The Premier League has for over a decade been the league with the consistently lowest point difference, while in Serie A the point difference fluctuates frequently. As for the Bundesliga, it has had the highest average point difference for five straight seasons (2013-2017).

The last graph presents the point difference between the seventeenth team, which avoided relegation, and the last team of the table, which was relegated along with the eighteenth and nineteenth. Fluctuations in the performance of teams fighting relegation are foreseeable, as these teams cannot be expected to have a well-balanced season every year. The Bundesliga has the lowest average per year in this case (10.28), with impressive stability during the last decade. La Liga and Serie A follow (11.33 and 14.61), while Ligue 1, more recently, and the Premier League, in general, show how unpredictable leagues can be at the bottom of the table.

Each of the graphs above represents a distinct category of competitiveness. Using the standard deviation, the leagues can be ranked within each of these three categories. Standard deviation is a statistical measure of the amount of variability or dispersion around an average. The further the data points are from the mean, the greater the standard deviation. In this analysis, a lower standard deviation of the point difference is read as a sign of a more competitive league.
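A minimal sketch of this measurement, using Python's standard `statistics` module, is shown below. The text does not say whether the sample or the population standard deviation was used, so the sample version (`stdev`) is an assumption here, and the list of point differences is hypothetical.

```python
from statistics import mean, stdev


def spread(differences):
    """Return the mean and the sample standard deviation of a series
    of per-season point differences."""
    return mean(differences), stdev(differences)


# Hypothetical point differences over eight seasons, for illustration only.
example_differences = [2, 4, 4, 4, 5, 5, 7, 9]
avg, sd = spread(example_differences)
print(f"mean = {avg:.2f}, standard deviation = {sd:.2f}")
```

Replacing `stdev` with `pstdev` would give the population standard deviation instead; the ranking of the leagues is the same either way as long as every league covers the same number of seasons.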

The analysis of the first and second teams of the league tables leads to some interesting findings. La Liga proves to be the most competitive league in this category (3.96), followed by the Premier League (4.54), Serie A (6.07), the Bundesliga (6.45) and Ligue 1 (7.21). The standard deviation results suggest that a football fan who wants to watch a league with an uncertain ending at the top should make La Liga the first choice.

However, in the comparison between the first and the fifth team things are different. The Premier League is in first place (6.55), Serie A is second (7.48), Ligue 1 is third (8.05) and the Bundesliga is fourth (9.48). La Liga is the least competitive of the top five leagues in this category (9.81), as the level of competitiveness between the champion and the fifth team, or those below it, is distinctly lower.

In the last category, between the seventeenth and the twentieth team, the Bundesliga has the highest level of competitiveness (4.00), and the rest of the ranking is as follows: Serie A (4.80), La Liga (5.10), Ligue 1 (6.20) and the Premier League (7.33). Put simply, teams trying to avoid relegation have a better chance of surviving in the Bundesliga, where the fight may last until the final matchday, than in the Premier League, where the relegation question is settled much sooner, and ingloriously, for many teams.
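The three rankings can be reproduced from the standard deviations quoted above. The sketch below only sorts those published figures; the dictionary keys and the helper name `rank_leagues` are illustrative choices.

```python
# Standard deviations quoted in the text, per category and league.
std_dev = {
    "first_vs_second": {
        "La Liga": 3.96, "Premier League": 4.54, "Serie A": 6.07,
        "Bundesliga": 6.45, "Ligue 1": 7.21,
    },
    "first_vs_fifth": {
        "Premier League": 6.55, "Serie A": 7.48, "Ligue 1": 8.05,
        "Bundesliga": 9.48, "La Liga": 9.81,
    },
    "seventeenth_vs_twentieth": {
        "Bundesliga": 4.00, "Serie A": 4.80, "La Liga": 5.10,
        "Ligue 1": 6.20, "Premier League": 7.33,
    },
}


def rank_leagues(values):
    """Order leagues from lowest to highest standard deviation."""
    return sorted(values, key=values.get)


ranking = {category: rank_leagues(values) for category, values in std_dev.items()}
leaders = {category: order[0] for category, order in ranking.items()}

for category, order in ranking.items():
    print(category, "->", ", ".join(order))
```

Each category has a different leader, which is why no single league comes out on top across the whole table.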

Having shown which of the top five European leagues is the most competitive in each category, there is another question to answer: was the situation always the same, or did it change over time? Testing for statistical significance is the next step, in order to establish how stable the collected data are. The most appropriate method here is the t-test.

The t-test (also called Student’s t-test) compares two averages (means) and determines whether they differ from each other. The variances of the samples are unequal, so the appropriate test for calculating the p-value is the heteroscedastic two-tailed t-test (Statistics: How to, 2016). The p-value measures the evidence against a null hypothesis (H0: m1 = m2). It is used in hypothesis testing and leads to rejecting or failing to reject the null hypothesis. The confidence level is set to 95%, so a p-value below 0.05 is treated as statistically significant. The collected data are divided into two parts, from 2000 to 2008 and from 2009 to 2017, for all three categories analyzed above.
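For readers who want to see how such a test works, here is a minimal standard-library sketch of a two-tailed, unequal-variance (Welch) t-test. It is not necessarily the tool used for this research. The function names, the numerical integration of the t density and the two sample lists are assumptions made for illustration.

```python
import math
from statistics import mean, variance


def _t_area(x, df, steps=2000):
    """Area under the Student t density between 0 and x (Simpson's rule)."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(u):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(u * u / df))

    h = x / steps
    total = pdf(0.0) + pdf(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return total * h / 3


def welch_t_test(a, b):
    """Return (t statistic, degrees of freedom, two-tailed p-value)."""
    va = variance(a) / len(a)  # squared standard error of each mean
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    p = max(0.0, 2.0 * (0.5 - _t_area(abs(t), df)))
    return t, df, p


# Hypothetical point differences for the two periods, for illustration only.
period_2000_2008 = [12, 9, 15, 11, 14, 10, 13, 8, 12]
period_2009_2017 = [20, 25, 18, 27, 22, 30, 19, 24, 26]
t_stat, dof, p_value = welch_t_test(period_2000_2008, period_2009_2017)
print(f"t = {t_stat:.2f}, df = {dof:.1f}, p = {p_value:.4f}")
print("significant at the 95% level" if p_value < 0.05 else "not significant")
```

Where SciPy is available, `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the same unequal-variance test without the hand-written integration.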

In the Premier League, statistical significance appears between the champion and the fifth team (p-value=0.03), as the average point difference has gradually dropped since 2009. In the other two categories there is no statistical significance (p-value=0.08 between the champion and the runner-up, p-value=0.13 between the teams at the bottom of the league table). In La Liga, the difference is not significant either between the champion and the runner-up (0.48) or between the teams at the bottom of the league table (0.30). However, there is statistical significance between the champion and the fifth team (0.0004), as the average point difference has doubled since 2009.

Likewise, in the Bundesliga, statistical significance is detected between the champion and the fifth team (0.04). The average point difference increased by more than 145% from 2000-08 to 2009-17. Ligue 1 and Serie A are the most stable of the top five leagues, as they show no significant difference in any of the categories mentioned.

The findings of this research lead to some really interesting conclusions. The first is that it is not possible to describe one league as the most competitive from top to bottom. Splitting the table into groups helped to show which league is more competitive than the others in each category. The standard deviation then made it easier to rank the leagues by competitiveness per category. Finally, the t-test showed which leagues are well-balanced (stable data) and which have changed significantly over time.


Bibliography

2011. Edupristine. [Online]
Available at: https://goo.gl/JFGPaQ
[Accessed 9 December 2017].

Deutscher Fussball-Bund. [Online]
Available at: https://goo.gl/V6oGkh
[Accessed 6 December 2017].

Khanacademy. [Online]
Available at: https://goo.gl/NNVrN4
[Accessed 10 December 2017].

La Liga. [Online]
Available at: https://goo.gl/7N47Fx
[Accessed 5 December 2017].

Lega Calcio. [Online]
Available at: https://goo.gl/SnVKTW
[Accessed 3 December 2017].

Ligue 1. [Online]
Available at: https://goo.gl/wgzDKV
[Accessed 8 December 2017].

Premier League. [Online]
Available at: https://goo.gl/SVgaUg
[Accessed 3 December 2017].

Soccerway. [Online]
Available at: https://goo.gl/wnUoxw
[Accessed 10 December 2017].

Panini ed., September 2005. Almanacco Illustrato del Calcio. In: La Storia 1898-2004. Modena: Panini Group.

Data-oriented innovation makes up half of the most amazing parts of my everyday life. My love of and obsession with sports make up the other half. At Statathlon, I have found a unique opportunity to combine my passions through research, in order to change the way most people perceive sport as a whole.
