December 10, 2010

Slugging regression II

Building on my previous post, this time around we'll look at a bigger group of hitters, those with at least 75 at-bats in both 2007 and 2008. This is a total of 360 players.
In theory, with a lower at-bat threshold we would see more very high SLG values and also more below-average SLG values. But we've already seen hints that player talent gets evaluated early on (in the previous post, I noted that the worst SLG in the 400+ group was not as far below average as the best was above it).
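
The selection step described above can be sketched in Python. This is only an illustration: the record layout, the function names `qualified_hitters` and `top_by_slg_2007`, and the sample numbers are made up for the example and are not taken from the data behind the charts.

```python
# Hypothetical per-player records: (name, AB 2007, SLG 2007, AB 2008, SLG 2008).
# The values are invented for illustration only.
records = [
    ("Player A", 520, 0.565, 540, 0.510),
    ("Player B", 90, 0.590, 430, 0.480),
    ("Player C", 60, 0.610, 300, 0.450),   # too few at-bats in 2007
    ("Player D", 410, 0.310, 80, 0.335),
    ("Player E", 300, 0.420, 70, 0.400),   # too few at-bats in 2008
]


def qualified_hitters(rows, min_ab=75):
    """Keep players with at least min_ab at-bats in both seasons."""
    return [r for r in rows if r[1] >= min_ab and r[3] >= min_ab]


def top_by_slg_2007(rows, n=25):
    """Return the n players with the highest 2007 SLG."""
    return sorted(rows, key=lambda r: r[2], reverse=True)[:n]


qualified = qualified_hitters(records)
top_two = top_by_slg_2007(qualified, n=2)
print([r[0] for r in qualified])
print([r[0] for r in top_two])
```

Sorting with `reverse=False` instead would give the bottom of the slugging table.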

How to read the charts below: each one plots 25 players. Those that fall between the 100% and zero lines regressed toward the league mean, and the closer they are to the zero line, the bigger the regression. As shown in Figure 1, 22 of the top sluggers regressed toward the mean in 2008, 3 improved (led by Albert Pujols), and none fell below the league average.
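
One plausible reading of what the charts plot is the share of a player's 2007 gap from the league average that is still there in 2008. The sketch below assumes that reading. The function names, the league average of .420 and the sample SLG values are placeholders, not figures from this analysis.

```python
def retained_share(slg_prev, slg_next, league_avg):
    """Share of last year's gap from league average that remains this year.

    1.0 means no change; 0.0 means the player landed exactly on the
    league average.
    """
    gap_prev = slg_prev - league_avg
    if gap_prev == 0:
        raise ValueError("player was exactly league average last year")
    return (slg_next - league_avg) / gap_prev


def classify(share):
    """Label a retained share in the terms used to read the charts."""
    if share > 1.0:
        return "moved further from the mean"
    if share >= 0.0:
        return "regressed toward the mean"
    return "crossed to the other side of the mean"


LEAGUE_AVG = 0.420  # placeholder value

# Top slugger who slips from .600 to .540: keeps about two thirds of the gap.
print(classify(retained_share(0.600, 0.540, LEAGUE_AVG)))
# Weak slugger who drops from .320 to .300: ends up further below average.
print(classify(retained_share(0.320, 0.300, LEAGUE_AVG)))
# Weak slugger who jumps from .320 to .450: now above the league average.
print(classify(retained_share(0.320, 0.450, LEAGUE_AVG)))
```

Dividing by the 2007 gap means the same calculation works for both the top and the bottom of the table, since the sign of the gap cancels out.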

For these players, 66% of their 2008 SLG was accounted for by their 2007 SLG (and therefore the league average accounted for the remaining 34%).
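
The percentages can be read as weights in a simple blend of last year's SLG and the league average, with the two weights summing to one. The sketch below assumes that interpretation and estimates the weight as a least-squares slope through the origin on deviations from the league average. How the figure above was actually calculated is not stated, so the method, the function names and the sample numbers are all assumptions.

```python
def estimate_weight(slg_prev, slg_next, league_avg):
    """Least-squares weight w in: next - avg = w * (prev - avg)."""
    dev_prev = [p - league_avg for p in slg_prev]
    dev_next = [n - league_avg for n in slg_next]
    denom = sum(d * d for d in dev_prev)
    if denom == 0:
        raise ValueError("no spread around the league average")
    return sum(a * b for a, b in zip(dev_prev, dev_next)) / denom


def blended_slg(slg_prev, league_avg, weight):
    """Weighted blend of last year's SLG and the league average."""
    return weight * slg_prev + (1 - weight) * league_avg


LEAGUE_AVG = 0.420                 # placeholder value
slg_2007 = [0.620, 0.580, 0.560]   # invented sample
slg_2008 = [0.520, 0.540, 0.490]   # invented sample

# Weight on last year's SLG implied by the invented sample.
w = estimate_weight(slg_2007, slg_2008, LEAGUE_AVG)
print(round(w, 2))

# Projection for a hypothetical .600 slugger using a 0.66 weight.
print(round(blended_slg(0.600, LEAGUE_AVG, 0.66), 3))
```

A weight of 1 would mean last year's SLG carries over unchanged, and a weight of 0 would mean every player is projected at the league average.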

An interesting observation is that these players are by and large the same as the 400+ AB group I dealt with in the previous post. Of the 25, 19 had 400+ ABs in both years. Of the remaining 6, 4 had fewer than 400 ABs in 2007 and more than 400 in 2008. This group includes familiar names -- Josh Hamilton, David Murphy, and Cody Ross. All of them are young sluggers who did well in a short stint in 2007 and were given the opportunity to keep playing in 2008.

Figure 1: Top 25 SLG (2007), minimum 75 at-bats

For hitters at the bottom of the slugging table, we see a similar pattern of regression. Figure 2 shows SLG "improvement" in the opposite direction: the closer the bar gets to the bottom of the chart, the bigger the improvement. Of the 25 players, 17 regressed toward the mean without reaching it, and 3 others exceeded the league average (the ones who fell "below zero"). The remaining 5, on the other hand, started out below average in 2007 and fared worse in 2008.
For this group, the previous year's SLG accounted for only 55% of their 2008 SLG.
This group of 25 is decidedly not the same set of players as the least sluggerly of the 400+ AB group. Only 1 -- Jason Kendall -- appears in both lists.

In short, the "survivor bias" that keeps good players active with opportunities to hit has an inverse impact on the players at the bottom of the stack. Unless they show a huge improvement that brings them much closer to the league average, these players seem to be destined to part-time roles.

Figure 2: Bottom 25 SLG (2007), minimum 75 at-bats

This analysis is best described as "proof of concept". Certainly any conclusions drawn should be tentatively stated, and a more robust analysis over a greater number of seasons is warranted.

