Is there a simple and accurate way to predict whether a team can win a game? Which team statistics play the most important role in a game? The so-called Four Factors of Basketball, a term introduced less than two decades ago, attempt to answer these questions.
ABSTRACT
This research reviews some of the factors that have the biggest impact on a basketball game. The main objective is to identify them and show how they can be used. In the first section, the author examines a theory that has been popular over the last few years. Using a projection model based on those factors, an estimated number of wins is calculated. It is hoped that the results and the accompanying analysis will inform the reader about the factors' importance and their accuracy as a measure of success.
INTRODUCTION
Producing accurate projections of team wins is one of the biggest challenges in basketball and beyond. Every year, coaching staff, analysts, data scientists and other stakeholders with relevant technical backgrounds combine their skills to find out which player or team stats matter most, and in what way. After many decades of collecting and processing data, it is now possible to quantify the relative weight of each stat and build efficient projection models on that basis.
DATA COLLECTION & METHODOLOGY
For the purposes of this research, data for all 30 NBA teams from the 2012/13 to the 2016/17 season was collected and used. Eight categories were taken into account, which can be split into offensive and defensive stats. Only the regular season was examined.
Three mathematical equations, based on two different models, were used to produce the outcome, which was the predicted record for each team from 2012 to 2017 (150 observations in total). The linear least squares method of regression analysis was applied to the second model. The results were then compared with the actual number of wins.
FOUR FACTORS EXPLAINED
Countless hours of analysis have been spent in the NBA trying to identify the key factors in winning a game. Dean Oliver, a data analyst and basketball coach who has worked for the Seattle SuperSonics and the Denver Nuggets among others, developed a theory between 2002 and 2004 that was about to change the way numbers and basketball are associated. According to his “Four Factors” theory, those factors derive from the ways a possession can end: efficiency in shooting, turnovers, rebounding and free throws brings wins to teams. They are measured using four team stats, each assigned a different weight by Oliver: Effective Field Goal% or eFG% (40%), Turnover Rate or TOV% (25%), Offensive Rebound Rate or ORB% (20%) and Free Throw Rate or FTR (15%). The same factors apply to defense as well, giving eight categories in total.
This theory seems to be quite close to reality, because it takes into account the fundamental principles of basketball: score a lot, do not turn the ball over, grab every rebound and draw shooting fouls.
- Effective Field Goal% is an alternative to FG% that gives extra credit for made three-pointers, because they are worth more than two-pointers.
- Turnover Rate is an estimation of turnovers committed by a team per 100 possessions.
- Offensive Rebound Rate is the percentage of available rebounds that a team grabs after its own missed shots.
- Free Throw Rate is the number of free throw attempts per field goal attempt. Apart from providing easy points, free throws contribute to the opponent’s foul trouble.
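The four definitions above can be made concrete with a short sketch. The formulas follow commonly used box-score definitions (for example, possessions estimated as FGA + 0.44 × FTA + TOV); the text does not spell them out, so the function names, the possession estimate and the sample box score below are assumptions for illustration only.

```python
def effective_fg_pct(fgm, fg3m, fga):
    """eFG% = (FGM + 0.5 * 3PM) / FGA, expressed as a percentage."""
    return 100.0 * (fgm + 0.5 * fg3m) / fga


def turnover_pct(tov, fga, fta):
    """Turnovers per 100 possessions, with possessions estimated
    as FGA + 0.44 * FTA + TOV (an assumed, commonly used estimate)."""
    return 100.0 * tov / (fga + 0.44 * fta + tov)


def offensive_rebound_pct(orb, opp_drb):
    """Share of available rebounds on a team's own misses that it grabs."""
    return 100.0 * orb / (orb + opp_drb)


def free_throw_rate(fta, fga):
    """Free throw attempts per field goal attempt, as defined in the text."""
    return fta / fga


# Made-up single-game box score, used only to exercise the functions.
box = {"fgm": 40, "fg3m": 10, "fga": 85, "fta": 25, "tov": 14, "orb": 10, "opp_drb": 32}

efg = effective_fg_pct(box["fgm"], box["fg3m"], box["fga"])
tov = turnover_pct(box["tov"], box["fga"], box["fta"])
orb = offensive_rebound_pct(box["orb"], box["opp_drb"])
ftr = free_throw_rate(box["fta"], box["fga"])

print(f"eFG% {efg:.1f}  TOV% {tov:.1f}  ORB% {orb:.1f}  FTR {ftr:.3f}")
```

The defensive versions of the factors are the same calculations applied to the opponent's box score.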
Their use has become so popular that they have even made their way into broadcasts. For example, Trevor Fleck, who produced telecasts for the Minnesota Timberwolves, used the four factors as part of the halftime show.
PREDICTING WIN RECORD USING OLIVER’S FORMULA
His theory is very often applied to project team wins. Various algorithms can predict each player’s expected output in those four (or eight, as explained) categories and how it contributes to the team’s total, which makes it possible to build models that estimate a team’s record. The Four Factors are the basis for many of them. The following sections examine whether they are actually game changers and whether coaching staff should focus on them.
To do so, an algorithm that predicts the wins of all NBA teams over the past 5 seasons will be run. The simplest version of the equation would be the following one:
using the different weights assigned to each factor by Oliver. Although the majority of the scientific and basketball community agrees with this hierarchy, there has been a debate about the exact weights he came up with. Therefore, the algorithm will be run twice: once with the known coefficients and once with normalized coefficients.
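The equation itself is not reproduced in this text. As a purely illustrative sketch, one way to combine the factors with Oliver's weights is a weighted sum of team-minus-opponent differentials. The function name, the differential form and the sample numbers are all assumptions and should not be read as the author's exact formula.

```python
# Oliver's weights as quoted in the text.
OLIVER_WEIGHTS = {"efg": 0.40, "tov": 0.25, "orb": 0.20, "ftr": 0.15}


def four_factor_score(team, opponent, weights=OLIVER_WEIGHTS):
    """Weighted sum of factor differentials (hypothetical form).

    All inputs are in percentage points. The turnover term is reversed
    because committing fewer turnovers than the opponent is better.
    """
    return (
        weights["efg"] * (team["efg"] - opponent["efg"])
        + weights["tov"] * (opponent["tov"] - team["tov"])
        + weights["orb"] * (team["orb"] - opponent["orb"])
        + weights["ftr"] * (team["ftr"] - opponent["ftr"])
    )


# Made-up season averages for one team and its opponents.
team = {"efg": 52.0, "tov": 13.0, "orb": 25.0, "ftr": 20.0}
opponent = {"efg": 50.0, "tov": 14.0, "orb": 23.0, "ftr": 21.0}

score = four_factor_score(team, opponent)
print(round(score, 2))
```

A raw score of this kind is not expressed in wins, so it would still have to be scaled or calibrated against actual records before it could be read as a win total.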
Using the 40/25/20/15 values, the equation returns the results displayed in the scatter chart below.
At first glance, the claim about their importance seems quite fair: if a team is efficient in those statistical categories, it is more likely to earn wins. However, the projections themselves are very far from reality. All teams were projected to win between 121 and 179 games, while a regular season has only 82. The initial coefficients are therefore not correctly scaled in this case, and the standard error is not taken into consideration either.
A DIFFERENT PROJECTION APPROACH
This leads to the conclusion that even though Oliver’s initial weights are good, they are not perfect. Basketball has been changing over the years and, as a result, many team stats have risen or fallen. Free throw rate is the most typical example: the league average was 0.248 back in 2005, while last season it was 0.196. This is why the algorithm will be run again, this time with different values that are expected to be more representative.
Before transforming the previous equation, it is important to present the methodology. Regression analysis in statistical modelling uses an approach called the method of least squares. It is a mathematical procedure for finding the best-fitting curve to a given set of points by minimizing the sum of the squares of the offsets of the points from the curve. In other words, the goal is to find the best equation of the following form
where only the values of the variables (the Four Factors and the number of wins in this case) are known. The aim is to obtain an accurate estimate of the weights assigned to each factor.
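The equation is not reproduced in this text. A common way to write a linear model of the kind described above, assuming an intercept $\beta_0$, the eight offensive and defensive factors $x_{i1}, \dots, x_{i8}$ of team-season $i$, and actual wins $W_i$ (all symbols here are assumptions), is:

$$
\widehat{W}_i = \beta_0 + \sum_{j=1}^{8} \beta_j \, x_{ij},
\qquad
(\hat\beta_0, \dots, \hat\beta_8) = \arg\min_{\beta} \sum_{i=1}^{150} \left( W_i - \widehat{W}_i \right)^2 .
$$

The first expression is the projection for one team-season; the second states the least squares criterion, which picks the weights that minimize the total squared gap between actual and projected wins over the 150 observations.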
REGRESSION ANALYSIS RESULTS
Performing a regression analysis on 150 observations (the win records of 30 teams from the 2012-13 to the 2016-17 season), the value of R Square, known as the coefficient of determination, is 0.938143 (adjusted R Square: 0.934634). This means that approximately 94% of the variation of the y-values around their mean is explained by the x-values. The standard error of the regression is 3.2, which can be interpreted as the typical difference between projected and actual wins.
Multiple R | R Square | Adjusted R Square | Standard Error | Observations |
---|---|---|---|---|
0.9685779484 | 0.9381432421 | 0.9346336388 | 3.2442038507 | 150 |
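The summary statistics in the table are linked by standard formulas, which gives a quick consistency check. The sketch below assumes eight predictors plus an intercept; the intercept is an assumption, since the coefficient table lists only the eight factors.

```python
import math

# Values taken from the regression summary table.
r_squared = 0.9381432421
n_obs = 150
n_predictors = 8  # four offensive and four defensive factors

# Adjusted R^2 penalizes R^2 for the number of predictors
# (the "- 1" in the denominator assumes a fitted intercept).
adjusted_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)

# Multiple R is the square root of R^2.
multiple_r = math.sqrt(r_squared)

print(round(adjusted_r_squared, 4), round(multiple_r, 4))  # 0.9346 0.9686
```

Both values match the table, which is consistent with a model fitted with an intercept term.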
Moreover, at a significance level of 5%, the p-value returned for each variable is far below 0.05. Therefore, the null hypothesis is rejected for every variable, which means that all of them are statistically significant and should not be omitted.
The coefficients returned for each variable, interpreted as the weights assigned, provide some interesting conclusions. The most important is that they confirm Oliver’s hierarchy. However, they do not fit the 40/25/20/15 theory. In fact, judging by the relative magnitudes of the coefficients (offense and defense combined), the weights seem to be closer to something like 43/39/10/8. That being said, the importance of turnover rate has increased by about 56%, while that of offensive rebound rate and free throw rate has roughly halved. It is also interesting that offense has a greater impact than defense.
Variable | Coefficient | Standard Error | t Stat | P-value |
---|---|---|---|---|
Offensive EFG% | 3.8375898427 | 0.1411352543 | 27.19086639 | 6.10753664347623E-58 |
Offensive TOV% | -3.8734703697 | 0.3131420826 | -12.3696896228 | 2.92500669149445E-24 |
Offensive REB% | 1.0699912542 | 0.1092307378 | 9.7956973972 | 1.33173505105201E-17 |
Offensive FTR | 0.6412191204 | 0.122644431 | 5.2282775126 | 6.04629963874523E-07 |
Defensive OEFG% | -3.9546542049 | 0.1700291715 | -23.2586806716 | 4.12794658311408E-50 |
Defensive OTOV% | 3.1200442895 | 0.2903756209 | 10.7448561973 | 4.80014691110578E-20 |
Defensive OREB% | 0.7989292875 | 0.1635409102 | 4.8851953087 | 2.76823970330772E-06 |
Defensive OFTR | -0.7830838501 | 0.1436094123 | -5.4528727445 | 2.1584030840349E-07 |
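The 43/39/10/8 split can be reproduced from the coefficient table. The sketch below assumes that the weight of each factor is the sum of the absolute offensive and defensive coefficients divided by the total; this is one plausible reading of how the split was obtained, not a documented step of the method.

```python
# (offensive, defensive) coefficients copied from the table above.
coefficients = {
    "eFG%": (3.8375898427, -3.9546542049),
    "TOV%": (-3.8734703697, 3.1200442895),
    "REB%": (1.0699912542, 0.7989292875),
    "FTR": (0.6412191204, -0.7830838501),
}
oliver = {"eFG%": 40, "TOV%": 25, "REB%": 20, "FTR": 15}

# Combined magnitude of each factor across offense and defense.
magnitude = {name: abs(off) + abs(dfn) for name, (off, dfn) in coefficients.items()}
total = sum(magnitude.values())

# Share of each factor, rounded to whole percentage points.
fitted = {name: round(100 * value / total) for name, value in magnitude.items()}

# Relative change of each fitted weight against Oliver's original weight.
change = {name: round(100 * (fitted[name] / oliver[name] - 1)) for name in fitted}

# Total magnitude of offensive versus defensive coefficients.
offense = sum(abs(off) for off, _ in coefficients.values())
defense = sum(abs(dfn) for _, dfn in coefficients.values())

print(fitted)  # {'eFG%': 43, 'TOV%': 39, 'REB%': 10, 'FTR': 8}
print(change["TOV%"], change["REB%"], change["FTR"])  # 56 -50 -47
```

Under this reading, turnover rate gains about 56% in weight, rebounding and free throw rate lose roughly half of theirs, and the offensive coefficients sum to a larger magnitude than the defensive ones.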
THE ACCURACY OF FOUR FACTORS MODEL
Despite not being perfect, this model is very accurate, as demonstrated by the previous figures and this graph. In only 3 out of 150 cases do the projected and actual numbers of wins differ by more than 7. The worst projection was for the Memphis Grizzlies in the 2015-16 season: they won 42 games but the model predicted 34.
Predicting team wins using the Four Factors is considered to be more accurate than using other team stats. For example, the following scatter chart illustrates the results of a projection model based on offensive and defensive efficiency.
It is evidently slightly worse than the previous one, since there are cases with large deviations. It predicted just 15 wins for the Pacers three years ago, when their record was 38-44. This assertion is also confirmed statistically by the values of adjusted R Square and the standard error of the regression.
CORRELATION WITH TEAM PERFORMANCE
Using simulations that process data collected over many decades, it is possible to project a player’s or even a team’s stats in many categories. That is what makes the Four Factors strong: a team can rely on them to identify its strengths and weaknesses. The Thunder’s 2015-16 season is a good example. They had a turnover rate of 14%, the 7th worst in the league. The main reason was Westbrook’s high TOV% (16.8%), one of the worst ratios in the NBA among superstars. Meanwhile, they led the league in Offensive Rebound Rate with 31%. Kanter’s contribution was remarkable: he was the league leader in ORB% that season. Players such as Adams and Westbrook also averaged a high ORB% and DRB% respectively.
Suppose that they had a player who performed exactly like Westbrook but turned the ball over less, reducing the team’s turnover rate to 13%. The four factors projection model, which predicted 57.7 wins for a team that actually won 55, would then predict approximately 61.6 wins. Likewise, if Kanter or Adams were replaced by similar players with a worse ORB%, reducing the team’s total to 29%, the result would be 55.4 wins.
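Because the model is linear, a what-if scenario like the ones above only shifts the projection by the coefficient times the change in the factor. The sketch below uses the regression coefficients and the rounded rates quoted in the text; the helper name is hypothetical.

```python
# Offensive coefficients from the regression table (wins per percentage point).
COEF_TOV = -3.8734703697
COEF_ORB = 1.0699912542

BASELINE_WINS = 57.7  # model projection for the 2015-16 Thunder, per the text


def what_if(baseline, coefficient, old_value, new_value):
    """Shift a linear projection by coefficient * (new_value - old_value)."""
    return baseline + coefficient * (new_value - old_value)


# Turnover rate drops from 14% to 13%.
fewer_turnovers = what_if(BASELINE_WINS, COEF_TOV, 14.0, 13.0)

# Offensive rebound rate drops from 31% to 29%.
fewer_rebounds = what_if(BASELINE_WINS, COEF_ORB, 31.0, 29.0)

print(round(fewer_turnovers, 1))  # 61.6
print(round(fewer_rebounds, 2))
```

The turnover scenario reproduces the 61.6 wins quoted above. The rebounding scenario comes out near 55.6 with these rounded inputs, slightly above the 55.4 in the text; the gap presumably reflects rounding of the rates quoted there.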
THE NEXT BIG THING
As demonstrated earlier, this model has been quite popular, not only due to its simplicity but also due to its accuracy. The Four Factors will likely remain a powerful tool for many years. Nonetheless, there is always room for improvement. Real plus-minus and its initial version, box plus-minus, are two recently introduced indexes that raise new challenges. According to ESPN, which has already built its own projection model, “RPM estimates how many points each player adds or subtracts, on average, to his team’s net scoring margin for each 100 possessions played”. It takes analysis one step further, as it aims to account for all aspects of a player’s performance, even those that cannot be measured directly.
RESULTS & DISCUSSION
The outcomes of the aforementioned method indicate that eFG%, TOV%, ORB% and FTR are associated with team success. Yet, as demonstrated, each of them carries a different weight, in some cases significantly higher or lower than initially assumed. The findings suggest that those four factors can produce projection models that explain approximately 94% of the variation in team wins. 24% of the predictions made for the 150 cases were exactly right and 61.5% were within 2 wins.
CONCLUSION
To sum up, this study analyzed the four factors of basketball. Applying two different projection models, the win record of each NBA team over the last five seasons was predicted. The results showed that those factors have a big impact on basketball in the current era and are directly linked to team success.
BIBLIOGRAPHY
Basketball-Reference.com. (n.d.). Basketball Statistics and History | Basketball-Reference.com. [online] Available at: https://goo.gl/DF6xbo [Accessed 25 Feb. 2018].
Strauss, Factor, Laing & Lyons. (2005). What Wins Basketball Games, a Review of “Basketball on Paper: Rules and Tools for Performance Analysis” By Dean Oliver. [online] Available at: https://goo.gl/w8HxcP [Accessed 23 Feb. 2018].
Keith, C. (2016). You Need to Be Recording these 4 Stats to Win More Games. [online] Coachbase basketball drills and practice planning. Available at: https://goo.gl/nft6cT [Accessed 23 Feb. 2018].
Squared Statistics: Understanding Basketball Analytics. (2017). Introduction to Oliver’s Four Factors. [online] Available at: https://goo.gl/PXJiUw [Accessed 25 Feb. 2018].
Partnow, S. (2016). Q&A: Minnesota Timberwolves Analyst Jim Petersen on Analytics, Broadcasting and Coaching in the WNBA. [online] FanSided. Available at: https://goo.gl/pGWHqU [Accessed 23 Feb. 2018].
Mathworld.wolfram.com. (n.d.). Least Squares Fitting — from Wolfram MathWorld. [online] Available at: https://goo.gl/oNsy9N [Accessed 24 Feb. 2018].
ESPN.com. (2014). Ilardi: How real plus-minus (RPM) gauges players. [online] Available at: https://goo.gl/C3xDwx [Accessed 23 Feb. 2018].
Pelton, K. (2017). Projected NBA standings: New playoff contenders, W-L records for all 30 teams. [online] ESPN.com. Available at: https://goo.gl/NAoLBg [Accessed 24 Feb. 2018].