Tuesday, December 26, 2006

Leadoff Hitters, 2006

Last year I did a piece ranking the leadoff performances of each team in a number of categories. I will do the same this year, although without a lot of the comments about each method and how it is calculated. I’ll refer you to last year’s post for that.

In brief, though, I don’t believe that this is a particularly useful activity--for the most part, hitters are hitters, regardless of what slot in the order they bat. Leadoff is probably the most important role, but in general, the best leadoff hitter and the best hitter period would be the same guy. That does not of course mean that leadoff is necessarily the best possible slot for the best hitter, but in general, too much is made of lineup construction among traditional folks anyway. Nevertheless, it is an interesting exercise if not particularly enlightening.

The data comes from the Baseball Direct Scoreboard and is for the team’s #1 slot hitters as a whole. I have listed in parentheses the guys who had the most games played while batting in the #1 spot, which sometimes is less than half of the team’s games. The “ML average” listed in the table is for ML leadoff hitters, not the entire league as a whole. I’ll sometimes discuss the league total in my comments.

The first category I’ll look at is good old runs scored, per 25.5 outs:
1. CLE(Sizemore), 7.3
2. NYA(Damon), 6.7
3. NYN(Reyes), 6.6
ML Average, 5.5
28. ARI(Counsell), 4.8
29. CIN(Freel), 4.6
30. CHN(Pierre), 4.2
Johnny Damon’s Red Sox were number one a year ago, and his Yankees are #2 this go around. Of course, runs scored are heavily influenced by the succeeding batters, and it’s little surprise three of the game’s best offenses are represented in the top 3 spots here. Juan Pierre was seen as a leadoff solution for the Cubs, but as I pointed out last year, this was dubious as he was coming off a year in Florida in which the Marlins were in many of the trailer categories.
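
For anyone who wants to reproduce the rate, here is a minimal sketch in Python. The function name and the sample totals are my own illustration, and it takes the out total as an input rather than assuming how outs are counted:

```python
OUTS_PER_GAME = 25.5  # the per-game out figure used in this post


def runs_per_game(runs, outs, outs_per_game=OUTS_PER_GAME):
    """Scale a run total to a rate per `outs_per_game` outs."""
    if outs <= 0:
        raise ValueError("outs must be positive")
    return runs * outs_per_game / outs


# Hypothetical leadoff slot (not from the 2006 data):
# 110 runs scored while making 510 outs.
print(round(runs_per_game(110, 510), 1))  # 5.5
```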

On Base Average is an obvious criterion to look at:
1. CLE(Sizemore), .369
2. LA(Furcal), .366
3. SEA(Suzuki), .365
ML Average, .339
28. ARI(Counsell), .301
29. MIL(Weeks), .300
30. PIT(Duffy), .298
The average for all players was .333, so the leadoff advantage is only six points; last year it was ten. The Yankees are fourth on the list, so Damon did his job, although none of the OBAs are eye-popping for an individual player.

Runners On Base Average removes HR and CS from OBA, leaving it not as a pure measure of skill but as an accounting for the percentage of PA in which the leadoff man sets the table by remaining on base:
1. SEA(Suzuki), .351
2. LA(Furcal), .330
3. OAK(Kendall), .327
ML Average, .305
28. WAS(Soriano), .271
29. ARI(Counsell), .267
30. MIL(Weeks), .265
The average for all hitters was .293, making a larger leadoff/overall gap in ROBA than in OBA. Last year, when the opposite was true, I presumed it was because of the high number of caught stealings racked up by leadoff hitters. Washington is near the bottom here because of Soriano’s 40 HR season (they ranked ninth in OBA). The usual suspects, Cleveland and New York, come in at seventh (.321) and eleventh (.318) respectively.
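
A rough sketch of the OBA/ROBA relationship in Python follows. It assumes the simple OBA form (H + W)/(AB + W), which is my assumption here rather than something spelled out above, and the batting line is made up for illustration:

```python
def oba(h, w, ab):
    """On Base Average, assuming the simple form (H + W) / (AB + W)."""
    return (h + w) / (ab + w)


def roba(h, w, hr, cs, ab):
    """Runners On Base Average: OBA with HR and CS removed from the numerator."""
    return (h + w - hr - cs) / (ab + w)


# Hypothetical batting line, not taken from the 2006 data.
line = dict(h=180, w=60, hr=20, cs=10, ab=600)

print(round(oba(line["h"], line["w"], line["ab"]), 3))  # 0.364
print(round(roba(**line), 3))                           # 0.318
```

The more homers and caught stealings a hitter has, the further his ROBA falls below his OBA, which is the Soriano effect described above.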

Run Element Ratio from Bill James is not a skill or production measure at all. It is a ratio between offensive elements ideally placed at the beginning of an inning to set it up (walks and steals) versus those ideally placed at the end to clean it up (extra bases):
1. LAA(Figgins), 2.1
2. OAK(Kendall), 1.8
3. MIN(Castillo), 1.7
ML Average, 1.0
28. CLE(Sizemore), .6
29. TOR(Johnson), .6
30. TEX(Matthews), .6
The overall average is .7, and only five teams were below that with their leadoff hitters (the three above as well as Kansas City and Tampa Bay). Sizemore’s power again put Cleveland in the bottom three, while Texas is last for the second year in a row despite changing their primary leadoff hitter from David Dellucci to Gary Matthews. Since those two are now in Cleveland and Los Angeles, we’ll see if they can do it again with a new man in 2007.
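
As a quick illustration, here is RER in Python using the usual statement of the James formula, (W + SB)/(TB - H). The function name and both sample lines are hypothetical:

```python
def run_element_ratio(w, sb, tb, h):
    """Run Element Ratio: (walks + steals) / extra bases, where extra bases = TB - H."""
    extra_bases = tb - h
    if extra_bases <= 0:
        raise ValueError("RER is undefined without extra bases")
    return (w + sb) / extra_bases


# Two made-up hitters with the same hit total:
# a table-setter type and a power type.
print(run_element_ratio(w=60, sb=30, tb=270, h=180))  # 1.0
print(run_element_ratio(w=70, sb=20, tb=330, h=180))  # 0.6
```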

Another Bill James tool was his own method for evaluating leadoff hitters, which I call Leadoff Efficiency. This is the number of expected runs scored per 25.5 outs, which is a (relatively) pure measure of the leadoff man, unlike the actual runs scored figures we looked at first:
1. CLE(Sizemore), 7.1
2. NYN(Reyes), 6.6
3. WAS(Soriano), 6.6
ML Average, 5.6
28. STL(Eckstein), 4.8
29. ARI(Counsell), 4.7
30. MIL(Weeks), 4.6
Damon is again just off the list, fourth at 6.5. Last year the leadoff efficiency formula overestimated actual runs scored for leadoff hitters by a fairly big margin, but this year, the actual was 5.49 and the expected 5.55, not bad at all. Scott Podsednik is fourth to last and Chone Figgins sixth to last.

One can always just look at a leadoff hitter just like we would any other. So here is the list by good old Runs Created per Game:
1. CLE(Sizemore), 6.9
2. NYN(Reyes), 6.4
3. WAS(Soriano), 6.3
ML Average, 4.9
28. LAA(Figgins), 3.9
29. ARI(Counsell), 3.9
30. MIL(Weeks), 3.7
The average for all hitters was 5.0, so once again the average leadoff hitter was worse than the average hitter. Damon is fourth at 6.2.

Last year I included what I called Pure Leadoff RAA. Basically, it is the linear weight RAA total a player would get if he always batted with nobody on base and nobody out (the ideal leadoff situation). I based it off of Pete Palmer’s Run Expectancy table from The Hidden Game for simplicity’s sake, which means it is not fully adapted to the run environment of today’s game, but the values should not be too far off. One assumption that the formula makes that I did not mention last year is it assumes that all stolen base attempts occur during the next batter’s PA (or in other words, in a runner at first, no out situation). Here are the figures in this category:
1. CLE(Sizemore), +37
2. NYN(Reyes), +32
3. WAS(Soriano), +29
ML Average, +1
28. ARI(Counsell), -9
29. STL(Eckstein), -10
30. MIL(Weeks), -12
NYA is again fourth at +27. The top three are the same as the RG list for overall hitting, with Soriano’s homers only worth 1 run instead of 1.46 there.

Last year, David Smyth suggested that I look at a modified OPS, 2*OBA + SLG, which I will call 2OPS. Since the optimal OPS construction is something like 1.7 or 1.8*OBA + SLG, using 2 is a way to give a bit more credit to the on base side of things while still having a decent overall measure of production. Since the OPS units are meaningless anyway, I scaled these back so that the league 2OPS ~ league OPS. So these figures are for (2*OBA + SLG)*.7:
1. CLE(Sizemore), 893
2. WAS(Soriano), 864
3. TEX(Matthews), 847
ML Average, 767
28. ARI(Counsell), 686
29. PIT(Duffy), 680
30. MIL(Weeks), 677
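
The scaled 2OPS figure is simple enough to sketch in Python. The function name and the sample OBA/SLG inputs are illustrative only, not actual 2006 figures:

```python
def two_ops(oba, slg, scale=0.7):
    """Scaled 2OPS: (2*OBA + SLG) * .7, so that league 2OPS is close to league OPS."""
    return (2 * oba + slg) * scale


# Hypothetical hitter with a .340 OBA and a .420 SLG,
# shown on the same thousand-point scale as the list above.
print(round(two_ops(0.340, 0.420) * 1000))  # 770
```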

Although he may not fit the ideal prototype of a leadoff hitter, Grady Sizemore still comes out on top in most categories as the top leadoff hitter in the game in 2006, with Jose Reyes, Alfonso Soriano, and Johnny Damon close behind in many categories.

Friday, December 22, 2006

Historical Hitting by Position

I’m excessively fond of chiding the National League as the “Neanderthal League” at every available opportunity for their refusal to use the designated hitter. I don’t wish to discuss the DH here per se, but use one argument against it that you’ll occasionally see as a springboard for discussion.

Sometimes the anti-DH argument will include the rhetorical question, “Why stop at pitcher? Why not have a defensive shortstop and a DH for him, or a defensive catcher and a DH for him?” While it is true of course that you could put together a better offense by completely ignoring defensive ability, there is absolutely no comparison between the performance of shortstops relative to the population of hitters at large and that of pitchers. In order to believe that circumstances would develop in which there would be popular support for a similar shortstop or catcher DH, one must assume, I would think, that shortstops, like pitchers, have progressively seen their offensive levels decline relative to hitters as a whole.

And this is a useful point for a brief discussion of offensive positional adjustments throughout the decades. While this question is just one part of a tangled web of questions dealing with how to value players at different positions, I’m not going to discuss that issue but just the historical facts.

On my website, there is a chart showing offensive PADJs broken down into the ten decades from 1900-1998. There are a couple issues with this chart; first, it considers a player only at the position he is listed at first in Total Baseball. In other words, the position in which he played the most games in a given season is his position. If a player appeared in 25 games as an outfielder and 24 as a first baseman, he is 100% an outfielder. Secondly, it does not account for the three outfield positions, but lumps them all together. And thirdly, it uses the flawed model of basic Runs Created to evaluate each player’s offense.

With the exception of problem two, in which we lose valuable data on the breakdown between outfield positions, I don’t believe that the other two flaws are particularly consequential when dealing with a large group of aggregated players.

Looking at the chart, one of the most interesting things is that third basemen were worse hitters than second basemen in the 1900-1929 period. It is not until the thirties that third basemen hit better than second basemen. This phenomenon has been noted by other analysts, notably Bill James in Win Shares; I’m just pointing out that this data agrees with the earlier conclusions (which of course it should since the other studies were constructed similarly).

Getting to the issue of offensive balance between the positions and whether or not it has declined historically, if you look at the non-pitching fielding positions (here catcher, first, second, third, short, and outfield), you will see that the standard deviation of position adjustment was higher in the early days than it is today:
1900: .154, 1910: .143, 1920: .156, 1930: .156, 1940: .132, 1950: .122, 1960: .151, 1970: .159, 1980: .134, 1990: .135, 1900-1998: .134

“1900” means the ten-year period starting in 1900 (1900-1909), and so on. In fact, the highest standard deviation came in the 1970s when the DH was adopted. In the 70s, shortstops hit at just 77% of the league average (only aught catchers hit worse, 76%). But this is still a far cry from pitchers’ best showing, 45% in the aughts and the twenties. Pitchers showed a pattern of steady decline to as low as 30% in the 60s and 70s when the DH came of age and 26% and 27% in the eighties and nineties.
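
The decade figures listed above are easy to check in a few lines. This is a small sketch in Python that uses only the standard deviations given in this post (the variable names are mine) and picks out the decade with the widest spread:

```python
# Standard deviation of position adjustment by decade, as listed above.
# Each key is the first year of a ten-year period (1900 means 1900-1909).
sd_by_decade = {
    1900: 0.154, 1910: 0.143, 1920: 0.156, 1930: 0.156, 1940: 0.132,
    1950: 0.122, 1960: 0.151, 1970: 0.159, 1980: 0.134, 1990: 0.135,
}

# The decade in which the fielding positions were furthest apart offensively.
peak = max(sd_by_decade, key=sd_by_decade.get)
print(peak, sd_by_decade[peak])  # 1970 0.159
```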

The best offensive showing by any position is the first sackers of the 1930s, 129%. In an eight-team universe, Lou Gehrig, Jimmie Foxx, and later Hank Greenberg and Johnny Mize are bound to wreak some havoc on the overall figures.

Anyway, you can peruse the chart yourself if you wish for other interesting things. The main point I wanted to make is that at no time in twentieth century major league history has the balance of offensive production between the positions been greater than in the 1980s and 1990s. While the chart does not include 1999-2006, I do not believe that the trend would be significantly different.

It is possible I suppose that the DH itself has had some impact on this. DHs would probably be stuck at 1B or a corner outfield perch in earlier times, and allowing them their own category would allow defensive specialists to sneak in the field at those positions while maintaining overall offensive output. This could cause the balance between the fielding positions to be greater than it would be in absence of the DH. First base offense has been essentially unchanged since the 70s, but outfielders have dropped a little bit. But even if this effect is significant, I don’t believe that it is significant enough to mask a markedly worsening balance or collapse of short, catcher, or other low-offense positions. In fact, shortstops have bounced back relative to the league hitting as a whole from the aforementioned 77% in the 70s; the composite league comparison compares to all hitters, including DHs.

I don’t think that the historical data shows a significant trend, taking a full century view, towards more or less of a balance. But it clearly does not show the widening imbalance that would justify concerns about multiple DHs becoming a possibility. And if you think that I’ve wasted your time with some rudimentary stuff and this whole thing was an excuse to get a post up finally and bash the NL some more, you might be on to something.