Monday, November 28, 2016

Statistical Meanderings, 2016

What follows is an abbreviated version of my annual collection of oddities that jump out at me from the year-end statistical reports I publish on this blog. These tidbits are intended as curiosities rather than as sober sabermetric analysis:

* The top ten teams in MLB in W% were the playoff participants. The top six were the division winners. A rare case in which obvious inequities aren't created by micro-divisions, in stark contrast to 2015's NL Central debacle.

* In the NL, only Washington (.586) had a better overall W% than Chicago's road W% (.575). Of course, the Cubs were a truly great team, and with 103 wins and a world title on the heels of 97 wins a year ago, they belong in any discussion of the greatest teams of all-time. In Baseball Dynasties, Eddie Epstein and Rob Neyer used three years as their base time period for ranking the greatest dynasties. Another comparable regular season in 2017, regardless of playoff result, would in my opinion place the Cubs prominently on a similarly-premised list.

Most impressive about the Cubs is that despite winning 103, their EW% (.667) and PW% (.660) outpaced their actual W% of .640.

* It is an annual tradition to run a chart in this space that compares the offensive and defensive runs above average for each of the playoff teams. RAA is figured very simply here by comparing park adjusted runs or runs allowed per game to the league average. Often I enjoy showing that the playoff teams were stronger offensively than defensively, but that was not the case in 2016:



This is another way to show just how great the Cubs were--only two other playoff teams were as many as 80 RAA on either side of the scorecard and the Cubs were +101 offensively and +153 defensively.
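As a rough illustration of the RAA calculation described above, here is a minimal sketch. The function name, the way the park factor is applied (dividing the team's per-game figure by it), and the sample inputs are all assumptions for illustration; they are not taken from the actual stat reports.

```python
def runs_above_average(team_per_game, league_per_game, games,
                       park_factor=1.0, defense=False):
    """Runs above average from per-game scoring (hypothetical helper).

    team_per_game   -- team runs scored (or allowed) per game
    league_per_game -- league average runs per game
    park_factor     -- assumed multiplicative park factor (1.0 = neutral)
    defense         -- True when team_per_game is runs ALLOWED, so that
                       allowing fewer runs than average comes out positive
    """
    adjusted = team_per_game / park_factor   # park-adjust the team figure
    diff = adjusted - league_per_game        # margin per game vs. league
    if defense:
        diff = -diff                         # fewer runs allowed is good
    return diff * games


# Made-up inputs: a team scoring 5.0 R/G in a neutral park, league at 4.5
print(runs_above_average(5.0, 4.5, 162))
# Made-up inputs: a team allowing 4.0 R/G in a neutral park
print(runs_above_average(4.0, 4.5, 162, defense=True))
```

With this sign convention a strong offense and a strong defense both show up as positive numbers, which matches how the Cubs' +101 and +153 are quoted.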

* The Twins have a multi-year run of horrible starting pitching, and 2016 only added to the misery. Only the Angels managed a worse eRA from their starters (5.61 to 5.58); only A's starters logged fewer innings per start among AL teams (5.39 to 5.40); and the Twins were dead last in the majors in QS% (36%). In their surprising contention blip of 2015, the Twins were only in the bottom third of the AL in starting pitching performance, but in 2014 they were last in the majors in eRA and second-last in IP/S (ahead of only Colorado) and QS%; in 2013 they were last in all three categories; and in 2012 they were last in the majors in eRA and second-last in IP/S and QS%.

* There were a lot of great things from my perspective about the 2016 season from a team performance perspective, chiefly the Indians winning the pennant and playoffs in which the lesser participants did not advance their way through. Both were helped along by the comeuppance finally delivered to the Royals. It wasn't quite as glorious as it might have been, as they still managed to scrape out a .500 record, but the fundamental problems with their vaunted contact offense were laid bare. KC was easily the lowest scoring team in the AL at 4.05 R/G, with the Yankees of all teams second-worst with 4.19. They were last in the majors with .075 walks/at bat (COL, at .084, was second worst). They were last in the AL in isolated power by 12 points (.137) and beat out only Atlanta and Miami in the majors, edging out the 30th-ranked Braves by just .007. Combining those two, their .212 secondary average was sixteen points lower than the Marlins for last in the majors. But they were at the AL average in batting average at .257, so that's something.
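The secondary average figure for the Royals is just the sum of the two rates quoted in that note. A minimal sketch, assuming the version used here is isolated power plus walks per at bat with no stolen base term (the function name is made up):

```python
def secondary_average(isolated_power, walks_per_at_bat):
    """Secondary average as the sum of ISO and walks per at bat.

    Assumption: no stolen-base component, which is what the Royals
    figures quoted in the text imply.
    """
    return isolated_power + walks_per_at_bat


# Royals figures from the text: .137 ISO and .075 W/AB
royals = secondary_average(0.137, 0.075)
print(f"{royals:.3f}")  # 0.212
```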

* Andrew Miller averaged 17.1 strikeouts and 1.3 walks per 37.2 plate appearances (I use the league average of PA/G to restate K and W rates per PA on the familiar scale of per nine innings while still using the proper denominator of PA). If you halve his K rate and double his walk rate, that's 8.6 and 2.6, which is still a pretty solid reliever. A comparable but slightly inferior performer this year was Tony Watson (8.2 and 2.8).
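The scaling in the Miller note can be sketched as follows. The 37.2 PA/G figure is the league average quoted in the text; the function names and the raw counts in the first example are invented for illustration only.

```python
LEAGUE_PA_PER_GAME = 37.2  # league average PA per game, from the text


def per_game_rate(events, plate_appearances, pa_per_game=LEAGUE_PA_PER_GAME):
    """Events per PA, restated on a per-game (roughly per-nine-innings) scale."""
    return events / plate_appearances * pa_per_game


def halve_k_double_w(k_rate, w_rate):
    """The thought experiment from the text: half the strikeouts, twice the walks."""
    return k_rate / 2, w_rate * 2


# Hypothetical counts: 100 strikeouts in 372 PA works out to about 10 per game
print(per_game_rate(100, 372))

# Miller's quoted rates, 17.1 K and 1.3 W, after the halve/double adjustment
k, w = halve_k_double_w(17.1, 1.3)
print(k, w)
```

The per-PA denominator is the design point: two pitchers with the same strikeouts per inning can differ in strikeouts per batter faced if one of them allows more baserunners.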

* Boston's bullpen was built (or at least considered by some preseason) to be a lockdown unit with Tazawa, Uehara, and Kimbrel. Tazawa had a poor season with 0 RAR; Uehara and Kimbrel missed some time with injuries and were just okay when they pitched for 10 RAR each. Combined they had 20 RAR. Dan Otero, a non-roster invitee to spring training with Cleveland, had 26 RAR.

* Matt Albers (-18) had the lowest RAR of anyone who qualified for any of my individual stat reports. I don't think that save is very likely at this point.

* Just using your impression of Toronto's starters, their talent/stuff/age/etc., just try to associate each to their strikeout and walk rates (the five pitchers are RA Dickey, Marco Estrada, JA Happ, Aaron Sanchez, and Marcus Stroman):



The correct answer from A to E is Dickey, Sanchez, Stroman, Estrada, Happ. I never got a chance to play this game without being spoiled, but I'm certain that I would have at least said that Aaron Sanchez was pitcher D.

* Jameson Taillon made it to the majors at age 25, and the thing that jumped out at me from his stat line was his very low walk rate (1.5, lower than any NL starter with 15 starts save Clayton Kershaw and Bartolo Colon; note that Taillon just cleared the bar for inclusion).

John Lackey, at age 38, chipped in 49 RAR to Chicago (granted, fielding support contributed to his performance). Taillon and Lackey are always linked in my head thanks to a Fangraphs prospect post from several years ago that I will endeavor to find. I believe the Fangraphs writer offered Lackey as a comp for Taillon. A commenter, perhaps a Pittsburgh partisan, responded by saying it was a ridiculous comparison, essentially an insult to Taillon.

My thought at the time was that if I had any pitching prospect in the minors, and you told me that if I signed on the dotted line he would wind up having John Lackey's career, I would take it every time. That's not to say that there aren't pitchers in the minors who will exceed Lackey's career, but to think that it's less than the median likely outcome for any pitching prospect is pretty aggressive. And this was before Lackey's late career performance which has further bolstered his standing. What odds would you place now on Jameson Taillon having a better career than John Lackey?

* Jeff Francoeur had exactly 0 RAR. Ryan Howard had 1, before fielding/baserunning which would push him negative.

* I mentioned in my MVP post how unique it was that Kyle and Corey Seager were both worthy of being on the MVP ballot. They performed fairly comparably across the board:



Chase and Travis d'Arnaud also had pretty similar numbers. Not good numbers, but similar nonetheless (which in Chase's case was probably a triumph whilst a disappointment for Travis):



* It wouldn't be a meanderings post without some Indians-specific comments. It has actually been harder than usual to move on to writing the year-end posts because of the disappointment of seeing the Indians lose their second, third, and fourth-consecutive games with a chance to close out the World Series. Three of those losses have come by one run and two in Game 7 in extra innings. The Indians have now gone 68 seasons without winning the World Series, losing four consecutive World Series after winning the first two in franchise history. That now matches the record of the Red Sox from 1918 - 1986, which if Ken Burns' "Baseball" and plagiarist/self-proclaimed patron saint of sad sack franchises Doris Kearns Goodwin are to be believed was a level of baseball fan suffering unmatched and possibly comparable to the Battle of Stalingrad. Well, except for the initial two World Series winning streak--Boston won their first four World Series.

The two Cleveland notes I have are negative, which is only because I have been thinking about them in conjunction with Game 7. One is how bad Yan Gomes was this season, creating just 1.9 runs per game over 262 PA, dead last in the AL among players with 250 or more PA. I did not understand Terry Francona's decision to pinch-run for Roberto Perez with the Indians down multiple runs in the seventh inning. He must have felt that a basestealing threat would distract Jon Lester, but given the inning and the extent of Cleveland's deficit, it basically ensured that Gomes would have to bat at some point. And bat he did, with the go-ahead run on first and two outs in the eighth against a laboring Chapman who had just coughed up the lead.

Also costly was the decision to bring Michael Martinez in to play outfield in the ninth. That move made more sense given Coco Crisp's noodle arm, but to see Martinez make the last out was a tough pill to swallow (and had Martinez somehow reached base, Gomes would have followed). And don't even get me started on the intentional walks in the tenth inning.

Also, it must be noted that Mike Napoli, who struggled in the postseason, was a very average performer in the regular season, creating 5.2 runs per game as a first baseman. This is not intended as a criticism of Napoli, especially since I have been kvetching for years about the Indians' inability to get even average production out of the corners. Napoli fit that need perfectly. But it felt as if the fans and media evaluated his performance as better than that (even limited strictly to production in the batter's box and not alleged leadership/veteran presence/etc.)

* For various reasons, a few of the players who were in the thick of the NL MVP race a year ago and were surely considered favorites coming into this season had disappointing seasons. These three outfielders (Bryce Harper, Andrew McCutchen, Giancarlo Stanton) all wound up fairly close in 2016 RAR (28, 27, 23 respectively), yielding the MVP center stage to youngsters (Kris Bryant and Corey Seager), first basemen (Freddie Freeman, Anthony Rizzo, Joey Votto) and a guy having a career year (Daniel Murphy).

More interestingly, those big three outfielders combined for 78 RAR--five fewer than Mike Trout.

Wednesday, November 16, 2016

Hypothetical Ballot: Cy Young

There are no particular standout candidates for the Cy Young in either league, and I was tempted to open up this post by saying something like “Maybe it is a harbinger of things to come, as starting pitchers’ workloads continue to decrease and more managers consider times through the order in making the decision to go to the bullpen…we can expect more seasons like this, where no Cy Young contender really distinguishes himself.”

And then I stopped and concluded, “You idiot, don’t you dare write that.” This is exactly the kind of banal over-extrapolation of heavily selected data that I rail against constantly. In the long run, is it possible that those factors could contribute to a dilution of clear Cy Young candidates, leaving voters to comb over a pack of indistinguishable guys pitching 180 innings a year? Entirely possible. Does that make 2016 the new normal? Of course not. Just last year, there was an epic three-way NL Cy Young race. This year, only an injury to Clayton Kershaw seems to have stood in the way of a historic season and Cy Young landslide.

In the AL race, Justin Verlander had a 70 to 61 RAR lead over Chris Sale, with a pack of pitchers right behind them (Rick Porcello 59, Corey Kluber 58, Jose Quintana 57, Aaron Sanchez/JA Happ/Masahiro Tanaka 56). Conveniently, the first four in RAR also are the only pitchers who would also have 50 or more RAR based on eRA or dRA, with one exception. Verlander allowed a BABIP of just .261 and so his dRA is 3.80, significantly higher than his 3.04 RRA. However, none of the others look better using dRA--all three are five to eight runs worse. So I go with Verlander for the top spot and Porcello second over Sale (he led the AL with a 3.14 eRA, and since we are talking about one run differences here, Bill James would at least want us to consider his 22-4 W-L record). I didn’t actually consider the W-L record, but he does rank just ahead of Sale if you weight RAR from actual/eRA/dRA at 50%/30%/20%, which has no scientific basis but seems reasonable enough. Again, there’s only a one RAR difference between Sale and Porcello, so using W-L or flipping a coin to order them is just as reasonable. I gave the fifth spot to Jose Quintana over Aaron Sanchez, and would not have guessed that Quintana had a better strikeout rate (8.1 to 7.8).
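For anyone who wants to play with the 50%/30%/20% weighting, here is a minimal sketch. The function name is made up, and the inputs in the example are placeholder numbers, not any pitcher's actual eRA-based or dRA-based RAR (those figures are not listed in this post).

```python
# Weights on RAR from actual runs allowed, eRA, and dRA, as stated in the text
WEIGHTS = (0.5, 0.3, 0.2)


def blended_rar(actual, era_based, dra_based, weights=WEIGHTS):
    """Weighted average of three RAR estimates for one pitcher."""
    w_actual, w_era, w_dra = weights
    return w_actual * actual + w_era * era_based + w_dra * dra_based


# Placeholder inputs only: 60 RAR on actual runs, 50 via eRA, 40 via dRA
print(blended_rar(60, 50, 40))
```

Because the weights sum to one, the blend always lands between the lowest and highest of the three inputs, which is why a pitcher with a low BABIP allowed gets pulled down toward his dRA-based figure.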

This leaves out Zach Britton, who I credit with just 35 RAR. I remain thoroughly unconvinced that leverage bonuses are appropriate. Each run allowed and out recorded is worth the same to the final outcome regardless of what inning it comes in. The difference between starters and relief aces is that some of the games the former pitch could have been won or lost with worse or better performances, while relief aces generally are limited to pitching in close games. But the fact that Britton pitches the ninth doesn’t make his shutout inning any more valuable than the one Chris Tillman pitched in the fourth within the context of that single game. To the extent that Britton contributes more value on a per inning basis, it’s because he pitched in a greater proportion of games in which one run might have made a difference, not because that is more apparent for any particular game at the point at which Britton appears in it than it was when the starter was pitching. I have alluded to this viewpoint many times, but have never written it up satisfactorily because I’ve not figured out how to propose a leverage adjustment that captures it, without going to the extreme that value can only be generated by pitching in games your team wins.

1. Justin Verlander, DET
2. Rick Porcello, BOS
3. Chris Sale, CHA
4. Corey Kluber, CLE
5. Jose Quintana, CHA

In the NL, there were seven starters with 60 RAR and then a gap of four to Jake Arrieta, which makes a good cohort to consider for the ballot. Of this group, Tanner Roark and Madison Bumgarner were at the bottom in terms of RAR and had high dRAs (4.17 and 3.87) which justify dropping them.

That leaves Jon Lester (71 RAR), Kyle Hendricks (70), Max Scherzer (70), Johnny Cueto (65), and Clayton Kershaw (64). If you weight 50/30/20 as for the AL, all five are clustered between 60 and 64 RAR. This makes it tempting to just pick Kershaw as he was much the best in every rate and narrowly missed leading the league in RAA despite pitching only 149 innings.

Among the four who pitched full seasons, Scherzer ranks first in innings and third in RRA, eRA, and dRA. However, he pitched significantly more innings than the Cubs candidates--25 more than Lester and 38 more than Hendricks. Comparing him to Cueto, who pitched nine fewer innings, Scherzer leads in RRA by .09 runs, eRA by .13 runs, and trails in dRA by .09 runs. So for my money Scherzer provided the best mix of effectiveness and durability.

All that’s left is a direct comparison of Scherzer to Kershaw, in which I think the innings gap is just too great without giving excessive weight to peripherals. The difference between Scherzer and Kershaw is 79 innings with a 3.62 RRA. To put it in 2016 performance terms, that makes Scherzer equivalent to Kershaw plus a solid reliever like Felipe Rivero or Travis Wood. That’s too much value for me to ignore looking at the gaudy (and they are gaudy!) rate stats:

1. Max Scherzer, WAS
2. Jon Lester, CHN
3. Kyle Hendricks, CHN
4. Clayton Kershaw, LA
5. Johnny Cueto, SF
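The "Kershaw plus a solid reliever" framing in the Scherzer comparison can be reproduced with a small helper. This is a sketch under the assumption that the 79 innings at a 3.62 RRA come from subtracting one pitcher's innings and runs from the other's; the function name and the inputs in the example are hypothetical, since the individual RRA figures are not quoted in this post.

```python
def marginal_workload(ip_a, ra_a, ip_b, ra_b):
    """Treat pitcher A as pitcher B plus some extra innings.

    Returns (extra innings, run average over those extra innings).
    ra_a and ra_b are runs per nine innings; A must have more innings than B.
    """
    runs_a = ra_a * ip_a / 9          # total runs charged to A
    runs_b = ra_b * ip_b / 9          # total runs charged to B
    extra_ip = ip_a - ip_b            # the innings B did not pitch
    extra_runs = runs_a - runs_b      # runs allowed in those innings
    return extra_ip, extra_runs * 9 / extra_ip


# Hypothetical inputs: 200 IP at 3.00 versus 100 IP at 2.00
extra_ip, extra_ra = marginal_workload(200, 3.0, 100, 2.0)
print(extra_ip, extra_ra)
```

If the run average over the extra innings is better than replacement level for that role, the higher-workload pitcher has added value that the rate stats alone do not show.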

Hypothetical Ballot: MVP

You could basically copy and paste the same thing for AL MVP every year, so I’ll try to keep it brief. My position is that wins are value, and 8 wins don’t count for more because the rest of your teammates were worth 50 than if the rest of your teammates were only worth 30.

But the debate over the definition of value is not what I find most obnoxious about the Mike Trout-era MVP discussions. It’s easy enough to disagree on that point and move on. What is most bothersome is the way that people attempt to co-opt terms that sound sabermetric, like “error bars”, to push their own narratives.

Let’s suppose that Player A is estimated to have contributed 87 RAR and player B is estimated to have contributed 80 RAR, and that the standard error is something like 10 runs. In this case, it certainly is inconclusive that player A was truly more valuable than player B. I would grant that player B would be a reasonable choice as MVP.

But if you’re filling out your MVP ballot, *should* you put Player B ahead of Player A? It’s still quite likely that Player A was more valuable than Player B. To me, you need to have a good reason to put Player B ahead, particularly when the margin is “significant” but not beyond the “error bar”.
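To put a rough number on "quite likely", here is a minimal sketch that assumes the errors in the two estimates are normally distributed and independent. Those are modeling assumptions of mine, as is the function name; the text only says the standard error is "something like 10 runs" without specifying whether that applies to each player or to the difference.

```python
import math


def prob_a_ahead(est_a, est_b, se_a, se_b):
    """P(true value of A > true value of B) under independent normal errors."""
    se_diff = math.hypot(se_a, se_b)              # SE of the difference
    z = (est_a - est_b) / se_diff                 # margin in SE units
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))  # standard normal CDF


# Reading 1: each estimate carries its own 10-run standard error
print(prob_a_ahead(87, 80, 10, 10))

# Reading 2: the 10 runs is the standard error of the difference itself
print(prob_a_ahead(87, 80, 10, 0))
```

Under the first reading the probability comes out near 0.69 and under the second near 0.76. Either way the gap is inside the error bar, yet Player A is still the better bet by a wide margin.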

Worse yet, though, is the attempt to twist oneself into a pretzel to make up those good reasons. The real gem going around, which you will see in comment sections and message boards, is that the error bars must be larger for Player A. Because you see, Player A’s park became a strong pitcher’s park right around when he arrived, and parks don’t change character like that (says someone who has never examined historical park factors). Because you see, Player A always leads the league in RAR, and by a wide margin--that just can’t be right. Player A is so consistently great in the metrics that the metrics must be wrong.

The world is not worthy of Player A. Every week of Player A’s career is scrutinized by pseudo-sabermetricians who have deadlines to fill with their micro-analytical pablum, and who when they aren’t vulturing over Player A are busy extrapolating trends from blips in thirty-team samples to blame metrics for their own arrogance. Player A can’t win with the people who should be appreciating him--not in the sense that a fan might but exactly in the sense that a detached analyst would.

I’m sure you’ve deduced by now that Player A is Mike Trout, and you may have guessed that Player B is Mookie Betts. Except those aren’t even my true estimates of their RAR, they’re what I would come up with for their RAR if I took my hitting/position RAR + BP’s baserunning runs (for non-steals, since steals are incorporated in the first piece) + the average of each player’s BP FRAA, BIS DRS, and MGL UZR. In other words, if I didn’t regress fielding at all, which I don’t think is the correct position. When adding components together, if one (hitting) is more reliable than another (fielding), it doesn’t make sense to ignore that. In actually estimating RAR for the purpose of filling out a fake MVP ballot, I used 50% FRAA, 25% DRS, 25% UZR, and halved it. Then Trout is at 86 RAR, Betts 68, and Jose Altuve slides in between them at 71, which explains the top of my ballot.

If anything, I think I may be generous to Betts, who needs all of his 8 baserunning runs and 11 “regressed” fielding runs to overcome 49 hitting RAR, which ranked just ninth in the league. Kyle Seager also made it onto my ballot on the strength of 8 fielding runs, and Francisco Lindor came close with 5 from baserunning and 10 from fielding. David Ortiz and Miguel Cabrera gave up 5 runs from non-hitting activities (or in Ortiz’s case, non-activity), which pushed them just off the ballot. Last year’s Player B, Josh Donaldson, was only a hair behind Betts, having another excellent season with 65 RAR and good-average fielding except in FRAA, which didn’t like his performance at all (-12).

The AL starting pitchers lacked any standout Cy Young candidates, but made up for it by being tightly bunched, so four of the final six spots go to them:

1. CF Mike Trout, LAA
2. 2B Jose Altuve, HOU
3. RF Mookie Betts, BOS
4. 3B Josh Donaldson, TOR
5. SP Justin Verlander, DET
6. 2B Robinson Cano, SEA
7. SP Rick Porcello, BOS
8. SP Chris Sale, CHA
9. SP Corey Kluber, CLE
10. 3B Kyle Seager, SEA

In the NL, I think Kris Bryant is a pretty clear pick for the top spot. He was second in the league in RAR by just one run to Joey Votto, which he makes up with baserunning alone and pads with strong fielding runs (2, 10, 12). Anthony Rizzo seems to be the other top candidate in mainstream opinion, but he only ranks third among first basemen on my ballot. Rizzo, Freddie Freeman, and Joey Votto all had similar playing time, but the other two significantly outhit him (Rizzo 6.9 RG, Votto 8.2, Freeman 7.6). Rizzo makes up much of the ground on Votto with his glove, but Freeman is no slouch himself.

Corey Seager got mixed reviews as a fielder (-8, 0, 11) so he falls just behind Freeman on my ballot. I’m quite certain I’ve never had brothers on both of my MVP top 10s in the same year, or any year. Daniel Murphy was third to Votto and Bryant in RAR, but his fielding reviews aren’t so mixed (-5, -11, -6), and even before considering that he was actually just behind Max Scherzer in RAR. From there, it’s just a matter of mixing in the pitchers and noting that four Cubs are on the ballot:

1. 3B Kris Bryant, CHN
2. 1B Freddie Freeman, ATL
3. SS Corey Seager, LA
4. SP Max Scherzer, WAS
5. 2B Daniel Murphy, WAS
6. SP Jon Lester, CHN
7. 1B Joey Votto, CIN
8. 1B Anthony Rizzo, CHN
9. SP Kyle Hendricks, CHN
10. SP Clayton Kershaw, LA

Wednesday, November 09, 2016

Hypothetical Ballot: Rookie of the Year

It was a bad year for rookies in the AL, made more interesting by the very late arrival of Gary Sanchez. Most of the discussion about the award seems to center around whether it is appropriate to give it to Sanchez based on his brilliant 227 PA, and whether ROY should be a value award, a future prospect award, or some kind of ungodly hybrid of the two. My own approach is that it should be a value award--anyone who is a rookie should be eligible and my primary criterion is how productive they were in 2016, not how old they are, their prospect pedigree, how their team held down their service time, or the like. Only in a very close decision would I factor in those criteria. I understand why others might consider those factors, and why it makes a lot more sense to deviate from a value approach for ROY than for Cy Young or MVP.

As such, I don’t consider Sanchez’s case to be particularly compelling. Yes, Sanchez was more productive on a rate basis than any AL hitter other than Mike Trout. Yes, the lack of a standout candidate in the rest of the league makes Sanchez all the more appealing. But Sanchez’s performance far outpaced both his prospect status and his minor league numbers (807 OPS in 313 PA at AAA this year, 815 across AA and AAA last year). If I was going to consider a shooting star exception, it would be for someone who checked all the boxes. I would much rather have Sanchez’s future than any of the other four players on my ballot, but in 2016 he fell in the middle in terms of value.

With Sanchez out, the top of the ballot comes down to Michael Fulmer, who is the top non-Sanchez candidate in the popular discussion, and Chris Devenski. I watched a game in which Devenski pitched this year and was vaguely aware of his existence in subsequent box scores, but how effectively he was pitching completely escaped my attention until I put together my annual stat reports. Devenski pitched extremely well for Houston, mostly in relief (48 games, 5 starts) with a 1.80 RRA over 108 innings. His peripherals were strong as well (2.39 eRA and 2.79 dRA).

Fulmer pitched 159 innings with a 3.41 RRA for 42 RAR versus Devenski’s 39. Fulmer’s peripherals were also reasonably strong (3.46 eRA, 4.02 dRA), and since this was a curious case I also checked Baseball Prospectus’ DRA, which attempts to normalize for any number of relevant variables (park, umpires, defensive support, framing, quality of opposition, etc.). Using DRA, Fulmer has a clear edge considering his quantity advantage (3.49 to 3.72).

One thing my RAR figures oversimplify is pitchers’ roles--it is a binary choice of reliever (with replacement level at 111% of league average) or starter (replacement level 128% of league average). If I figured RAR using Devenski’s inning split to set his replacement level (83 innings in relief to 24 starting works out to 115% of league as the replacement level), his RAR would edge up to 41. It should be noted too that Devenski pitched decently in his five starts, averaging just under 5 innings with a 4.01 RA.
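The 115% figure is an innings-weighted average of the two replacement levels. A minimal sketch, with the constant and function names being my own labels for the numbers quoted above:

```python
RELIEF_REPLACEMENT = 1.11   # reliever replacement level, as a share of league RA
STARTER_REPLACEMENT = 1.28  # starter replacement level, as a share of league RA


def blended_replacement_level(relief_ip, start_ip):
    """Innings-weighted replacement level for a pitcher who did both jobs."""
    total_ip = relief_ip + start_ip
    weighted = relief_ip * RELIEF_REPLACEMENT + start_ip * STARTER_REPLACEMENT
    return weighted / total_ip


# Devenski's split from the text: 83 innings in relief, 24 as a starter
level = blended_replacement_level(83, 24)
print(f"{level:.2f}")  # 1.15
```

Turning the blended level into the revised RAR of 41 also requires the league run average, which is not quoted in this post, so the sketch stops at the replacement level.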

I think the two are very close; this is a case where Fulmer’s status as a starter and a younger, better regarded prospect leave him just ahead for me. Even so, I assume Devenski will rank higher on my ballot than almost any submitted even for the IBAs.

Filling out the bottom of the ballot, the only other legitimate hitting candidate, Tyler Naquin and his 26 RAR, was heavily platooned and fares poorly in defensive metrics. That leaves two A’s pitchers, one a starter and one a reliever. If I strictly followed RAR, I would actually have the latter (Ryan Dull) ahead of the former (Sean Manaea), and the peripherals don’t really help either’s case, but since they were so close I will vote here for prospect status.

1. SP Michael Fulmer, DET
2. RP Chris Devenski, HOU
3. C Gary Sanchez, NYA
4. SP Sean Manaea, OAK
5. RP Ryan Dull, OAK

The top of the NL ballot is easy, as Corey Seager is a legitimate MVP candidate and far outshines the rest of the rookies. There is a cluster of qualified candidates in the 30-40 RAR range who make up the rest of my ballot. Kenta Maeda gets the nod over Junior Guerra as top pitcher based on stronger peripherals, with apologies to Zach Davies, Tyler Anderson, and Steven Matz. Among hitters, Aledmys Diaz led in RAR with 37 to Trea Turner’s 34, but Diaz’s fielding metrics are bad (-9 FRAA, -3 DRS, -8 UZR) while Turner’s are…not as bad (-3, -2, -5). Both are credited with baserunning value beyond their steals by BP (2 runs for Diaz, 4 for Turner); when you add it up it’s very close, but I consider Turner’s age and the fact that he did it in 130 PA to put him ahead:

1. SS Corey Seager, LA
2. SP Kenta Maeda, LA
3. SP Junior Guerra, MIL
4. CF Trea Turner, WAS
5. SS Aledmys Diaz, STL
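The Diaz/Turner comparison can be added up explicitly. This sketch assumes the fielding regression described in the MVP post (50% FRAA, 25% DRS, 25% UZR, then halved) is also what was used here, and that the quoted 37 and 34 RAR are hitting/position figures before baserunning and fielding; both are assumptions, as are the function names.

```python
def regressed_fielding(fraa, drs, uzr):
    """Blend the three fielding metrics 50/25/25, then halve the result."""
    blend = 0.5 * fraa + 0.25 * drs + 0.25 * uzr
    return blend / 2


def total_value(hitting_rar, baserunning, fraa, drs, uzr):
    """Hitting/position RAR plus non-steal baserunning plus regressed fielding."""
    return hitting_rar + baserunning + regressed_fielding(fraa, drs, uzr)


# Figures quoted in the text (fielding order: FRAA, DRS, UZR)
diaz = total_value(37, 2, -9, -3, -8)
turner = total_value(34, 4, -3, -2, -5)
print(diaz, turner)  # 35.375 36.375
```

Under those assumptions the two land about one run apart, which fits the "very close" description and leaves room for age and playing time to break the tie.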