Editor’s Note: This is the last column of a three-part series exploring where NBA statistical analysis is headed. In Part One, we examined the "old way", so to speak: linear-weights formulas. Part Two looked at better methods for evaluating defense. We wrap things up by considering alternatives to linear weights and exploring how different methods can be used in conjunction with each other.
In Part One of this series, I laid out what I think was a pretty solid case against using linear-weights formulas.
I also, to some extent, shot myself in the foot.
You see, while linear-weights formulas aren't that accurate and miss many nuances of a player's performance, they are valuable. In Basketball on Paper, Dean Oliver paraphrases Bill James to say, "reducing quality to one number has a tendency to end a discussion, rather than open up a world of insight."
But James also wrote an entire book about his "single number", Win Shares.
There is value in having that kind of single number. It can help find similarity or dissimilarity between players in terms of value much more easily than a set of numbers can. If one player has a PER of 20, and another has a PER of 10, you can bet the first player is likely the more productive one. In a setting like this column, single numbers are particularly useful to back up the claims I'm making, and the inability to use my linear-weights work in that role was the only reason I hesitated to write this series of columns. (Single numbers are also useful for studies, and it's unlikely I'll stop using VORP in that role.)
With that in mind, I'm going to introduce the rating system I'll be favoring in this column from now on, which doesn't really have a name other than its summary number, WARP -- Wins Above Replacement Player. Along with Oliver's Individual Win-Loss Records, I think it more closely resembles the summary statistics we'll see in the future.
For more than two years, I've been working on the idea of rating players in a team context. That is, I'd like to consider what would happen when you added a given player to a team of four completely average players (with, presumably, a completely average bench). Baseball Prospectus uses a similar theory for rating hitters, though it's much easier to do in baseball because there is relatively little interaction between teammates on offense.
It took me nearly that entire time to produce workable results. It wasn't until the middle of this season that I finally got results I was comfortable with. I'm not going to explain in complete detail how this rating system works. If you'd like to see that detail, here is a longer explanation.
Offense isn't tremendously complicated. I start with points produced per 100 possessions, where possessions are FGA + (.44*FTA) + TO. To this, I add credit for assists and take away credit for an estimated proportion of assisted baskets, using a regression on actual assisted field-goal data from 2002-03 from 82games.com.
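As a minimal sketch of that starting point, here is the possession formula and the per-100-possession rate in Python. The function names and the stat line in the usage note are made up for illustration, and the assist adjustment is left out because it depends on a regression not reproduced here.

```python
def possessions_used(fga, fta, to):
    """Possessions as defined in the text: FGA + (.44 * FTA) + TO."""
    return fga + 0.44 * fta + to


def points_per_100(points_produced, fga, fta, to):
    """Points produced per 100 possessions used.

    `points_produced` is taken as given; the assist credit and
    assisted-basket debit described in the text are NOT modeled here.
    """
    poss = possessions_used(fga, fta, to)
    if poss <= 0:
        raise ValueError("player used no possessions")
    return 100.0 * points_produced / poss
```

For a hypothetical player with 1,000 field-goal attempts, 500 free-throw attempts and 200 turnovers, `possessions_used(1000, 500, 200)` comes to 1,420 possessions.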
Through this point, my offensive ratings are virtually identical to what John Hollinger calculates as Offensive Percentage in his Pro Basketball Prospectus series. The leaders in this, however, are annually players like Reggie Miller (who did lead the league last season). Well, these players are very efficient offensively, but they're not the best offensive players in the league, because they don't have that big of an impact on their team.
The first step I take to rectify this situation is to put the player in a team context by creating a team rating: the player's percentage of his team's possessions while he's in the game multiplied by his offensive rating, plus the remaining possessions multiplied by league-average offensive efficiency.
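That blend is just a weighted average, which can be sketched as follows. The function and parameter names are hypothetical, and the numbers in the usage note are invented rather than taken from any real player.

```python
def team_offensive_rating(player_rating, usage_share, league_avg):
    """Blend a player's rating with league-average teammates.

    usage_share: fraction (0 to 1) of team possessions the player uses
                 while he is in the game.
    player_rating, league_avg: points per 100 possessions.
    """
    if not 0.0 <= usage_share <= 1.0:
        raise ValueError("usage_share must be between 0 and 1")
    return usage_share * player_rating + (1.0 - usage_share) * league_avg
```

A player rated at 110 who uses 25% of possessions, alongside teammates at a league average of 100, would give a team rating of 102.5.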
Still, this is not enough to account for the fact that opposing defenses don't have to expend as much energy against non-scorers as they do against go-to players. So I alter the league average by .25 points for each percent of possessions used above or below 20% (the inherent league average, since five players share the floor). In practice, this means that Tracy McGrady's teammates are rated at 92.6 points per 100 possessions, Ben Wallace's at 88.0.
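Here is a rough sketch of that adjustment. The direction (teammates of high-usage players are rated higher) is inferred from the McGrady and Wallace figures above; the function name and the league-average value in the usage note are assumptions, not numbers from the actual system.

```python
BASELINE_USAGE_PCT = 20.0   # the inherent league-average share of possessions
POINTS_PER_USAGE_PCT = 0.25  # adjustment per percentage point, from the text


def adjusted_teammate_rating(league_avg, usage_pct):
    """League-average rating applied to a player's teammates.

    usage_pct: percent (0 to 100) of team possessions the player uses.
    Assumption: the adjustment is added for usage above 20% and
    subtracted for usage below it.
    """
    return league_avg + POINTS_PER_USAGE_PCT * (usage_pct - BASELINE_USAGE_PCT)
```

With a hypothetical league average of 100, a player using 30% of possessions would have his teammates rated at 102.5, and one using 10% would have them rated at 97.5.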
Is this arbitrary? Absolutely. But it works well in practice, and I think it's the only fair way to appropriately evaluate a player's role in his team's offense.
Rebounding is primarily evaluated by the percentage of available rebounds a player grabs on offense and defense, with a pair of caveats. One is an adjustment that means I don't literally add four average rebounders to the player. The reason for this is that when a team adds a good rebounder, some of his extra rebounds will come at the expense of his teammates.
Is this appropriate? Let's compare the on-court and off-court statistics of Ben Wallace this season. Despite Wallace being a great rebounder, he only makes the Pistons 1.5% better on the offensive glass and a paltry 0.3% on the defensive boards. If you take out Wallace's own rebounds and pro-rate the other four players to five, the Pistons go from 32.1% to 26.5% on the offensive glass, 68.3% to just 53.9% on the defensive boards.
I also make a positional adjustment for offensive rebounds. Ideally, I'd do this in all categories, because the additional player is theoretically added to average players at each other position, not four completely average players. However, it's very difficult, and doing it for offensive boards seems to be sufficient to overcome the bias towards big men I mentioned earlier.
Defense is and has always been the shortcoming of this system. To create a defensive rating, one has to use some team defensive statistics, which can be dicey.
"I have seen statistical ratings that work around this by assigning a 'team defense' rating to each player," Hollinger wrote in the first Pro Basketball Prospectus. "That approach is incredibly crude; giving as much credit to Keith Van Horn as to Jason Kidd for the Nets' defensive strength just doesn't make any sense. . . . The reason the PER does not consider position defense is because the alternative would be to assign a made-up rating for defense that has no basis in reality."
Harsh, but probably fair. Still, I would respond that giving defense less importance, as linear-weights formulas do, is an equal escape from reality, even if it isn't made up. At the same time, giving players complete credit for their team's performance definitely is silly. At most, a player is only responsible for 20% of his team's defensive effort, less if he isn't a true iron man who plays all his team's minutes.
What I do, then, is create what I call the "Team Defense Factor", which is the player's minutes divided by his team's minutes. The player's rating in the "team" areas of defense -- forcing missed shots and forcing non-steal turnovers -- is his Team Defense Factor multiplied by his team's rating, plus (1 - Team Defense Factor) times the league average.
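A minimal sketch of the Team Defense Factor blend follows. It assumes "team minutes" means total player-minutes (five players for every minute of game time), which is what caps the factor at 20%; the function names and the sample numbers in the usage note are hypothetical.

```python
def team_defense_factor(player_minutes, team_minutes):
    """Share of the team's total player-minutes played by this player.

    Assumption: team_minutes counts all five players on the floor,
    so a player who never sits would come out at 0.20.
    """
    if team_minutes <= 0:
        raise ValueError("team_minutes must be positive")
    return player_minutes / team_minutes


def blended_team_defense(player_minutes, team_minutes, team_rating, league_avg):
    """Blend the team's rating in a 'team' defensive area with league average."""
    tdf = team_defense_factor(player_minutes, team_minutes)
    return tdf * team_rating + (1.0 - tdf) * league_avg
```

For example, a player with 2,000 of his team's 20,000 minutes has a factor of 0.10, so a team rating of 0.42 against a league average of 0.44 blends to 0.438.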
The other three elements of defense -- fouling and sending players to the free-throw line, blocking shots, and stealing the ball -- are rated using solely the player's own performance and that of four league-average teammates.
The way I put this together is to estimate how many of the opposing team's possessions would end in steals, how many in blocked shots, how many in non-steal turnovers, how many in free throws, and how many in field goal attempts. Then I estimate how many points each free-throw attempt and field-goal attempt would result in, the latter based on the combination of the player's team's defense and league average as described above.
Putting it all together means plugging the "team" ratings for offense, defense, offensive rebounding, and defensive rebounding into the regression I produced using team data from 1990 through 2003, which I discussed briefly in this November column.
Beyond this, the other two summary statistics I calculate to take playing time into account are Net Wins -- (Win% - .5)*(Min/48) -- which is essentially how many games "above .500" the theoretical team is; and WARP -- (Win% - .35)*(Min/48) -- which describes by how many games the team improves adding the given player as compared to a replacement-level player.
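Both formulas translate directly into code. This sketch uses the .500 and .350 baselines given above; the function names and the sample player in the usage note are invented for illustration.

```python
AVERAGE_WIN_PCT = 0.5       # baseline for Net Wins
REPLACEMENT_WIN_PCT = 0.35  # baseline for WARP
MINUTES_PER_GAME = 48


def net_wins(win_pct, minutes):
    """Net Wins: (Win% - .5) * (Min / 48)."""
    return (win_pct - AVERAGE_WIN_PCT) * (minutes / MINUTES_PER_GAME)


def warp(win_pct, minutes):
    """WARP: (Win% - .35) * (Min / 48)."""
    return (win_pct - REPLACEMENT_WIN_PCT) * (minutes / MINUTES_PER_GAME)
```

A hypothetical player with a .600 winning percentage in 2,400 minutes (50 games' worth) would come out at 5.0 Net Wins and 12.5 WARP.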
Naturally, I have to put this new system to the same test -- measuring team performance -- that linear weights fail. This system is designed to succeed, in that the team's performance is captured within player statistics. And succeed it does. If we rate each team (by weighting each player's winning percentage by minutes), the correlation with team winning percentage is .927. Change it to point differential, and the correlation is an even better .950. This is despite the fact that trades mean many teams have changed personnel mid-season.
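For anyone who wants to run the same kind of check, here is a sketch of the two pieces involved: a minutes-weighted team rating and a plain Pearson correlation. The function names are hypothetical and no real player or team data is included.

```python
import math


def team_win_pct(players):
    """Minutes-weighted average of player winning percentages.

    players: iterable of (win_pct, minutes) pairs for one team.
    """
    players = list(players)
    total_minutes = sum(minutes for _, minutes in players)
    if total_minutes <= 0:
        raise ValueError("team has no minutes played")
    return sum(wp * minutes for wp, minutes in players) / total_minutes


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (spread_x * spread_y)
```

Computing `team_win_pct` for every team and passing the results to `pearson` along with the actual winning percentages (or point differentials) reproduces the shape of the test, though not necessarily its exact inputs.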
If we are to calculate each team's offensive and defensive rating in the same manner as its projected winning percentages, the correlations with actual offensive and defensive ratings are very strong -- .976 and .960, respectively. I may still be making mistakes by giving credit to the wrong players, but not the wrong teams. This is also valuable because a team that only wins 30 games can't have players whose individual ratings add up to 40 wins, for example, which is a good reality check for any system.
On the other hand, I should point out that while I like what the Team Defense Factor does, it also means that the deviation in offensive ratings (3.17 points per 100 possessions standard deviation) is much larger than that for defensive ratings (1.37). That means that good offensive players are rated better than equally good defensive players, that poor offensive players are rated worse than equally poor defensive players, and that offensive ratings tend to dominate the overall winning percentage more than they should.
That being said, I think my system also passes the "laugh test".