Last modified on June 2, 2006.
The similarity scores were calculated using a method that is basically a hybrid of John Hollinger's similarity scores method in Pro Basketball Forecast, Kevin Pelton's work on Hoopsworld.com, and my own ideas. Let me state up front that this is a work in progress, and any helpful comments you may have would be appreciated.
Below is a list of the thirteen categories I used for the similarity scores:
I used a player pool that included all NBA seasons from 1977-78 to the present. Within each season, I found the standardized score (or z score) for each player in each category. League means and standard deviations were calculated in each season for each category using player minutes played as the weights. Although all players were used to determine the league means and standard deviations, similarity scores were only computed using players who played at least 25 percent of possible minutes played in the given season (in most seasons this will be roughly 1000 minutes played).
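As a rough illustration of the standardization step, the sketch below computes minutes-weighted z scores for one category in one season and applies the 25 percent minutes cutoff. The function names, the sample numbers, and the use of the population form of the weighted variance are assumptions made for the example; the text above does not specify them. The 82 games of 48 minutes used for "possible minutes" is likewise an assumption that ignores overtime.

```python
import math

def weighted_z_scores(values, minutes):
    """Minutes-weighted z scores for one category in one season.

    Assumption: population-style weighted variance (divide by total weight).
    """
    total = sum(minutes)
    mean = sum(v * m for v, m in zip(values, minutes)) / total
    variance = sum(m * (v - mean) ** 2 for v, m in zip(values, minutes)) / total
    sd = math.sqrt(variance)
    return [(v - mean) / sd for v in values]

def qualifies(player_minutes, possible_minutes, share=0.25):
    """True if the player logged at least `share` of the possible minutes."""
    return player_minutes >= share * possible_minutes

# Made-up eFG% values and minutes for four hypothetical players.
efg = [0.50, 0.45, 0.55, 0.48]
minutes = [3000, 500, 2000, 1200]

# Every player contributes to the league mean and standard deviation...
z = weighted_z_scores(efg, minutes)

# ...but only qualifying players would be compared to one another.
possible = 82 * 48  # assumed schedule: 3936 minutes, so the cutoff is 984
eligible = [zi for zi, m in zip(z, minutes) if qualifies(m, possible)]
```

With an 82-game schedule the cutoff works out to 984 minutes, which is consistent with the "roughly 1000 minutes" mentioned above.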
To obtain the similarity score between two seasons, I did the following:
As an example, let me use LeBron James's 2005-06 season and Tracy McGrady's 2004-05 season. Below are their z scores in each of the 13 categories, the absolute differences in their z scores, and the penalties:
              Ht     MPG   Shot%   Poss%    eFG%     FT% FTA/FGA    AST%    TOV%    STL%    BLK%    ORB%    DRB%
James      0.295   1.644   2.616   2.352   0.553   1.938  -0.018   0.726  -0.800   0.633   0.008  -0.777   0.469
McGrady    0.279   1.549   2.220   2.335  -0.185   1.419   0.195   0.049  -1.159   1.049  -0.145  -0.810   0.089
----------------------------------------------------------------------------------------------------------------
Abs Diff   0.016   0.095   0.396   0.017   0.738   0.519   0.213   0.677   0.359   0.416   0.153   0.033   0.380
Penalty    0.160   0.950   3.960   0.170   7.380   5.190   2.130   6.770   3.590   4.160   1.530   0.330   3.800
The weighted sum of the penalties is 52.755. The similarity score for these two seasons is 1000 - 52.755 = 947.245 (this figure actually rounds up to 948 when all of the decimal places are carried). When all player seasons are analyzed, the closest match to James's 2005-06 season is in fact McGrady's 2004-05 season.
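The arithmetic in this example can be retraced with a short sketch. The z scores are copied from the table; the factor of 10 between absolute difference and penalty is inferred from the table's figures rather than stated outright, and the function names are hypothetical. The per-category weights are not given in this text, so `similarity` takes them as an argument instead of guessing them.

```python
# z scores from the example table, in column order:
# Ht, MPG, Shot%, Poss%, eFG%, FT%, FTA/FGA, AST%, TOV%, STL%, BLK%, ORB%, DRB%
james = [0.295, 1.644, 2.616, 2.352, 0.553, 1.938, -0.018,
         0.726, -0.800, 0.633, 0.008, -0.777, 0.469]
mcgrady = [0.279, 1.549, 2.220, 2.335, -0.185, 1.419, 0.195,
           0.049, -1.159, 1.049, -0.145, -0.810, 0.089]

PENALTY_PER_Z = 10.0  # inferred from the table: penalty = 10 * |z difference|

def penalties(a, b, scale=PENALTY_PER_Z):
    """Per-category penalties for two seasons' z-score vectors."""
    return [scale * abs(x - y) for x, y in zip(a, b)]

def similarity(a, b, weights):
    """1000 minus the weighted sum of penalties (weights supplied by caller)."""
    return 1000.0 - sum(w * p for w, p in zip(weights, penalties(a, b)))

pens = penalties(james, mcgrady)
unweighted_sum = sum(pens)  # about 40.12

# With the published weighted sum, the score follows directly.
published_score = 1000.0 - 52.755  # 947.245
```

The unweighted penalties add up to about 40.12, lower than the published weighted sum of 52.755, so at least some categories must carry a weight greater than 1.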
On the player pages, similarity scores are presented for each qualifying season. The first list shows the most similar season at the same age; that is, players are compared only to other players of the same age. The second list shows the most similar season regardless of age; that is, players are compared to all other qualifying players.
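The two lists differ only in which candidate seasons are considered, as the following sketch suggests. The record layout (`name`, `age`, `z`), the `toy_score` placeholder (unweighted penalties at 10 points per unit of z difference), and the choice to exclude only the target season itself are all assumptions for illustration, not a description of how the site builds its lists.

```python
def toy_score(a, b):
    """Placeholder similarity: 1000 minus unweighted penalties."""
    return 1000.0 - sum(10.0 * abs(x - y) for x, y in zip(a["z"], b["z"]))

def most_similar(target, pool, same_age=False, top=10):
    """Rank the other seasons in `pool` by similarity to `target`.

    If same_age is True, only seasons played at the target's age are kept.
    """
    candidates = [
        s for s in pool
        if s is not target and (not same_age or s["age"] == target["age"])
    ]
    return sorted(candidates, key=lambda s: toy_score(target, s), reverse=True)[:top]

# Three hypothetical qualifying seasons with two categories each.
pool = [
    {"name": "A", "age": 21, "z": [1.0, 0.5]},
    {"name": "B", "age": 21, "z": [0.2, 0.1]},
    {"name": "C", "age": 25, "z": [0.9, 0.6]},
]
target = pool[0]

any_age = most_similar(target, pool)                   # C ranks ahead of B
same_age = most_similar(target, pool, same_age=True)   # only B is eligible
```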