Sunday, August 02, 2015

Great Moments in Yahoo! Box Scores

Am I surprised that Yahoo! box scores did not know how to handle SEA surrendering their DH position when Logan Morrison moved to first base? Of course not. It's still funny.

Thursday, July 23, 2015

1883 NL

The NL now had to share the major league stage with the AA, and it began to act accordingly. Some combination of league “persuasion” and financial difficulty chased Troy and Worcester from the league. In their place, the NL returned to the two cities whose clubs were expelled after 1876, New York and Philadelphia. The two circuits would thus have their first head-to-head city battles.

Abraham Mills became the fourth president of the NL. According to Harold Seymour, one of his actions was to remove the team names from the official letterhead, as they tended to change so often. More significant changes were the expansion of the schedule to 98 games, switching to the Reach ball, eliminating first bound foul outs once and for all, and allowing the umpire to call for a new ball to be put in play at any time. These innovations were not copied by the upstart Association, with the exception of the expanded schedule.

Notable individual feats during the season included Monte Ward, now of New York, becoming the first pitcher to hit two homers in one game on May 3; Hoss Radbourn’s (PRO) 8-0 no-hitter against Cleveland on July 25; and One Arm Daily’s (CLE) 1-0 no-hitter versus hapless Philadelphia on September 13.

On May 30 (Memorial Day or Decoration Day or whatever it was called at the time), the NL tried a couple of two-city doubleheaders; Cleveland lost 3-1 at Boston, then won 5-2 at Providence. Taking their place was Buffalo, which started with a 4-2 loss in Providence and took the broom for the day as they lost 2-1 in Boston.

The pennant race was very competitive, with four teams in the mix. On July 7, Providence led the way at 33-16; Cleveland was two games back, but actually led by one in the loss column (30-15). Boston and defending champ Chicago were running third and fourth respectively. By August 20, Cleveland had the lead at 45-27 with Providence second. However, Cleveland lost two to Chicago and Providence two to Boston, bringing them right back into the hunt. The Reds went on to win six straight, but the Grays ended their streak and added another win in the first two of a series in Rhode Island. The third game on September 8 matched up the aces--Hoss Radbourn for the Grays and Jim Whitney for the Reds. Providence took the lead in the top of the eleventh, but Boston countered with two for a 4-3 win.

From that point on, Boston was in command, taking thirteen of fourteen and wrapping up the flag with a 4-1 victory over Cleveland on September 27. The final margin was four over the White Stockings, five over the Grays, and seven and a half over the Blues.

Perhaps Mills should have ordered some more stationery, as all eight clubs would be back. Peace with the AA came as well, but the tenuous new alliance would be tested immediately.


The NL continued to boast fine competitive balance with the exception of Philadelphia. The actual records were pretty close to what the runs scored and allowed would suggest.


The Reds (who according to Nemec were also being referred to as the Beaneaters around this time, all informally of course) won their third NL pennant, pulling to within one of Chicago for the lead. It was a mild surprise as they were third and ten back in 1882, needing to pass both the White Stockings and their New England rivals, the Grays.

Rookies Mike Hines and Paul Radford were serviceable, while Edgar Smith was also a rookie but really was only the nominal regular in center field. He played thirty games in the outfield but Jim Whitney played forty, being used as a regular when not pitching for the first time. While his ARG was down a bit, the extra playing time made his bat more valuable and he had a better year pitching (his best so far in fact).

Rookie hurler Charlie Buffinton gave Boston a pair of good pitchers, something they had not had since two pitchers had become the norm. The duo’s combined WAR of over ten paced the circuit.

John Morrill was replaced by Jack Burdock as manager mid-season. Meanwhile, Arthur Soden was operating behind the scenes. He had claimed for a few years that the team’s profits needed to be reinvested to improve the club, but after he continued to sing that tune after the pennant, many of the shareholders gave in and sold out.


The White Stockings kept their three-time pennant winning lineup largely intact; they cast off Hugh Nicol, shifted King Kelly back to the outfield, slid Tom Burns over to shortstop, and brought in Fred Pfeffer from Troy to play second.

If the second place finish must be laid at the feet of anyone, the previously brilliant pitching duo of Corcoran and Goldsmith would be the prime culprits. While they were still above average, they were now overshadowed by the Whitney/Buffinton duo in Boston and the Radbourn/replacement level #2 in Providence.

The White Stockings hammered Detroit 26-6 on September 6 on the strength of a record setting 18 runs in the seventh inning.


The Grays continued to close the gap on Chicago; since winning the 1879 flag, they had finished 15 behind Chicago, then nine, three, and finally one. Unfortunately for them, that meant a third place finish in 1883.

The team brought in three regulars from the defunct franchises; Arthur Irwin and Lee Richmond (the team’s #3 pitcher with 94 innings) from Worcester and John Cassidy from Troy. Rookie Cliff Carroll shared left with Richmond, and rookie Charlie Sweeney turned in a replacement-level performance as the #2 pitcher. With Radbourn earning his “Hoss” moniker by tossing 632 innings with the league’s best ARA, that was adequate enough.

There were rumors late in the summer that the club’s board of directors was going to close shop and instead shift their sporting interests to harness racing. Instead, perhaps due to displeasure over the supposed scheme, they voted to distribute the profits, then resigned. Providence would continue as a member of the National League.


The Blues picked up Bushong and Evans from Worcester, Daily from Buffalo, York from Providence, and Hotaling from Boston (he had played for Cleveland in 1880 as a rookie). They managed to get into contention largely on the strength of their brilliant keystone combo of Dunlap and Glasscock, both of whom I see as the all-star at their position for 1881-1883.

Had they combined the pitching they had boasted in previous years with this offense, they may have done more than just contend. Jim McCormick’s workload plummeted from 596 to 342 innings despite the expanded schedule (although his ARA held steady at 86). Neither One Arm Daily nor nineteen year old Will Sawyer, in his only big league campaign, contributed any value.

The Blues visited the White House in April. President Chester Arthur spoke to them, advising that “good ballplayers make good citizens”.


The Bisons finished with the same W% (.536) that they had in 1882. This is not too surprising as they returned all their regulars except Blondie Purcell (now with Philly) and One Arm Daily (Cleveland). However, Curry Foley was sidelined with various maladies of the joints, and only got 115 PA. Rookie Jim Lillie took his place in center field (see Brian McKenna’s thread on Foley at Baseball-Fever for more).

A profit of $5,000 was turned, which they put towards a new ballpark that, according to Phil Lowry in Green Cathedrals, cost $6,000. The team also wore new blue uniforms this season; ace Pud Galvin objected on the grounds that they made him look fat. No, Pud, the fat made you look fat.


The Gothams were controlled by John Day and Jim Mutrie, who also controlled the Metropolitans, the formerly independent and now AA club. Some players were shifted from the Mets to the Gothams, but around half of the New York regulars had played in the NL in 1882.

The top source of players was Troy, from whence came Ewing, Connor, Gillespie, and Welch (Hankinson and Caskin had last seen NL action with the Trojans in 1881). Dasher Troy came from Detroit (Dorgan had last played in the league with the Wolverines in 1881), Monte Ward from Providence, and Tip O’Neill was the lone rookie on the team. Given their wealth of experience for a first-year club, it is not surprising that they finished a respectable 46-50.

John Clapp was the manager of the team; he was also a reserve catcher (78 PA). This would mark the end of Clapp’s remarkable major league career in which he managed six different teams all for one season each: the Mansfields of Middletown, CT in 1872; then, for four straight years beginning in 1878, Indianapolis, Buffalo, Cincinnati, and Cleveland; and finally the 1883 New Yorks.

The team’s debut, which was the first NL game ever played in New York City (the 1876 Mutuals actually played in Brooklyn which was a separate municipality at the time), was a 7-5 victory over eventual pennant winner Boston. The crowd included ex-President Ulysses S. Grant.


The Wolverines were mired near the bottom of the heap again; they had slipped to sixth and now seventh after an encouraging fourth in their inaugural campaign. Sam Trott, a reserve in 1882, moved into the lineup, and they regained the services of Sadie Houck who had not played in 1882. Tom Mansell split right field duties with rookie Dick Burns after having not played in the NL since 1879 with Syracuse. Burns and another rookie, Dupee Shaw, teamed up with Stump Wiedman as the primary pitchers.


It was a tough first year for the Phillies. The NL, seeking to get into Philadelphia where the AA’s Athletics had been a success, recruited Al Reach to back the team; he and partner Ben Shibe went in with lawyer John Rogers.

It may not be historically kosher, but I like to think of this club as the first expansion team in the modern sense. Most previous “expansions” had simply brought strong independent clubs into the circuit, but the Phillies included a group of castoffs and minor leaguers that more closely resembles modern expansion team composition. Their pathetic 17-81 record, worst in the NL since the league’s inaugural season (Cincinnati, 9-56) certainly doesn’t hurt the argument. They also used 29 players, a huge number for the time--the next highest was Buffalo with 18 and the median was 16.

Of the regulars, Ringo, Farrar, Coleman, and Hagan were rookies. The rest of the players (and last ML experience; 1882 if not listed): Gross (PRO, 1881), Ferguson (TRO), Warner (CLE, 1879), McClellan (PRO, 1881), Purcell (BUF), Harbidge (TRO), and Manning (BUF, 1881).

On August 21, the Phillies were hammered 28-0 at Providence. Art Hagan was a Rhode Island native, and was left in to absorb the shellacking as the team did not want to disappoint the hometown fans that had come out to see him.

It was not a good year at the box office either, as the Athletics were reported to draw twice as many fans as any NL team and four times as many as the Phillies. Bob Ferguson quit as manager mid-season and was replaced by Blondie Purcell.

Leaders and trailers:
1. Dan Brouthers, BUF (.374)
2. Roger Connor, NYN (.357)
3. George Gore, CHN (.334)
Trailer: Doc Bushong, CLE (.172)
1. Dan Brouthers, BUF (.397)
2. Roger Connor, NYN (.394)
3. George Gore, CHN (.377)
Trailer: Stump Wiedman, DET (.196)
Trailing non-pitcher: Doc Bushong, CLE (.198)
1. Dan Brouthers, BUF (.572)
2. John Morrill, BSN (.525)
3. Roger Connor, NYN (.506)
Trailer: Doc Bushong, CLE (.195)
1. John Morrill, BSN (.243)
2. Charlie Bennett, DET (.240)
3. Dan Brouthers, BUF (.235)
Trailer: Stump Wiedman, DET (.048)
Trailing non-pitcher: Doc Bushong, CLE (.056)
1. Dan Brouthers, BUF (112)
2. Roger Connor, NYN (102)
3. Ezra Sutton, BSN (92)
4. George Gore, CHN (92)
5. Jim O’Rourke, BUF (91)
The expanded schedule enabled Brouthers and Connor to become the first National Leaguers to create an estimated 100 runs in a season.
1. Fred Dunlap, CLE (194)
2. Dan Brouthers, BUF (191)
3. Roger Connor, NYN (189)
4. Ezra Sutton, BSN (159)
5. George Gore, CHN (157)
Trailer: Stump Wiedman, DET (45)
Trailing non-pitcher: Frank Ringo, PHI (53)
1. Fred Dunlap, CLE (+4.5)
2. Dan Brouthers, BUF (+4.4)
3. Roger Connor, NYN (+4.2)
4. Ezra Sutton, BSN (+3.0)
5. George Gore, CHN (+2.7)
Trailer: Stump Wiedman, DET (-2.6)
Trailing non-pitcher: Davy Force, BUF (-2.0)
1. Fred Dunlap, CLE (+6.4)
2. Dan Brouthers, BUF (+5.4)
3. Roger Connor, NYN (+5.3)
4. Ezra Sutton, BSN (+4.8)
5. Buck Ewing, NYN (+4.4)
Trailer: John Humphries, NYN (-.9)
Humphries was the Gothams’ #2 catcher behind Ewing. He made 85 outs in 108 PA, hitting .112/.120/.121 and creating just one run. Another Gotham reserve, outfielder Gracie Pierce, is next on the trailing list, making 50 outs in 63 PA, hitting .081/.095/.113 and creating just one run.
1. Hoss Radbourn, PRO (72)
2. Jim Whitney, BSN (74)
3. Charlie Buffinton, BSN (83)
4. Larry Corcoran, CHN (86)
5. Jim McCormick, CLE (86)
Trailer: Art Hagan, PHI (178)
1. Hoss Radbourn, PRO (+4.7)
2. Jim Whitney, BSN (+3.4)
3. Pud Galvin, BUF (+2.1)
4. Larry Corcoran, CHN (+1.7)
5. Charlie Buffinton, BSN (+1.5)
Trailer: John Coleman, PHI (-7.5)
1. Hoss Radbourn, PRO (+8.1)
2. Jim Whitney, BSN (+7.8)
3. Charlie Buffinton, BSN (+2.8)
4. Pud Galvin, BUF (+2.2)
5. Monte Ward, NYN (+2.1)
Trailer: John Coleman, PHI (-5.6)

My all-star team:
C: Buck Ewing, NYN
1B: Dan Brouthers, BUF
2B: Fred Dunlap, CLE
3B: Ezra Sutton, BSN
SS: Jack Glasscock, CLE
LF: George Wood, DET
CF: George Gore, CHN
RF: Orator Shaffer, BUF
P: Hoss Radbourn, PRO
P: Jim Whitney, BSN
P: Charlie Buffinton, BSN
MVP: 2B Fred Dunlap, CLE
Rookie Hitter: C Mike Hines, BSN
Rookie Pitcher: Charlie Buffinton, BSN

I gave George Wood the nod in left field as he was close to the top in WAR (Tom York and Jim O’Rourke were in the mix as well), but Pete Palmer has him at +7 in the field, easily the best of those three. Again, the right fielders were a weak crop; Paul Hines, the runner-up in center field, was 1.7 WAR better than Shaffer.

The pitchers were the same pair as last season, but in reverse order. Radbourn led the league in ARA (meaning he was the most effective on a per inning basis) and tossed 632 innings, second only to Pud Galvin’s 656, an unbeatable combination even for Whitney’s bat. I also added a third spot as the AA averaged close to three 100 IP pitchers per team.

It was another very weak year for rookie hitters; while Mike Hines only put up a 70 ARG, I felt that being the catcher for a pennant-winning team should probably count for something--and it’s not as if he’s taking it away from a guy who had a great year (the top rookie hitter in WAR is Cliff Carroll, +.9 and also a below-average hitter).

Saturday, July 11, 2015

Collapse, pt. 2

For the second time in three years, OSU baseball entered May in excellent position to secure the program's first NCAA tournament berth since 2009, projected as a #2 seed with an outside shot of earning a #1 and hosting a regional. For the second time in three years, it all came crashing down around hapless coach Greg Beals. This one was even tougher to swallow. In 2013, the crash mostly came in non-conference games against tough national opponents (Georgia Tech, Louisville, Oregon), and the Buckeyes still came close to a share of the Big Ten title. In 2015, OSU tumbled down the Big Ten standings, losing eight of their final nine conference games to fall to seventh in the league.

Despite this catastrophic failure, the stagnation of the program under Beals' stewardship, and the expiration of his initial contract, it appears as if Beals will be invited back for a sixth season leading the OSU baseball program. Under previous coach Bob Todd, the Buckeyes and Minnesota duked it out for conference supremacy, a veritable big two and little eight on the diamond. While Todd's program slipped a bit near the end of his tenure, he still made semi-annual NCAA appearances and captured a final regular season crown in 2009. Under Beals, OSU has clearly fallen behind at least Indiana, Illinois, and newcomers Nebraska and Maryland in the Big Ten pecking order.

The Big Ten got five teams into the NCAA Tournament: Indiana, Iowa, the forces of evil, Maryland, and Illinois, with the latter two winning their regionals. But the overall performance of conference teams only adds to the frustration for OSU supporters, as the Bucks finished 35-19 (.648), third in the Big Ten (Illinois led at 50-10, .833). In EW%, OSU was fifth at .636 (Illinois led at .783), and in PW% OSU was third at .643 (again, the Illini led at .748). OSU tied for fourth with 5.63 runs/game against a conference average of 5.36 and sixth with 4.22 RA/game against an average of 4.87.
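For readers who want to check the percentages above, here is a minimal Python sketch. It assumes EW% is a Pythagenpat-style estimate built from runs scored and allowed per game; the function names and the 0.29 exponent constant are my assumptions for illustration, not something spelled out in this post.

```python
def win_pct(wins, losses):
    """Plain winning percentage: wins divided by decisions."""
    return wins / (wins + losses)


def pythagenpat_win_pct(runs_per_game, runs_allowed_per_game, z=0.29):
    """Estimated W% from run rates with a scoring-dependent exponent.

    The exponent is (R/G + RA/G) ** z; z = 0.29 is an assumed constant.
    """
    exponent = (runs_per_game + runs_allowed_per_game) ** z
    ratio = runs_per_game / runs_allowed_per_game
    return ratio ** exponent / (ratio ** exponent + 1)


# Records and run rates quoted in the paragraph above.
osu_w_pct = win_pct(35, 19)
illinois_w_pct = win_pct(50, 10)
osu_ew_pct = pythagenpat_win_pct(5.63, 4.22)

print(f"OSU W%:      {osu_w_pct:.3f}")
print(f"Illinois W%: {illinois_w_pct:.3f}")
print(f"OSU EW%:     {osu_ew_pct:.3f}")
```

Under those assumptions the estimate for OSU lands within a point or so of the .636 quoted above, which suggests (but does not prove) that a calculation along these lines is behind the figure.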

OSU's offense was paced by its outfield, with the three primary starters ranking 1-2-3 on the team in RAA. Sophomore left fielder Ronnie Dawson took a step back from his debut campaign, but still hit .279/.357/.465, ranked second on the team with 7 longballs, and created 6.3 RG for +9 RAA. Classmate leadoff man and center fielder Troy Montgomery broke out in a big way, hitting .317/.424/.493 for 8.7 RG and +22 RAA. And senior right fielder Pat Porter played himself into being a fifteenth round pick of Houston with a bounceback .338/.414/.576, eleven homer, 9.4 RG, +25 RAA season that also saw him set the school's career triples record.

The platoon of senior catchers Aaron Gretz and Conor Sabanosh was fairly effective. In 176 and 189 PA respectively, Gretz created 6.1 runs/game for +5 RAA and Sabanosh 5.0 for +1, although it once again mystified this observer that Beals showed a degree of favoritism towards Sabanosh in doling out playing time. First base was a major weakness. Jacob Bosiokovic missed most of the season, taking one option out of play. Junior Ryan Leffel struggled at the plate, hitting .211/.304/.267 for 3.3 RG and -4 RAA in 109 PA, while classmate Zach Ratcliff hit well (7.1 RG) in limited opportunities (64 PA). Eventually junior Troy Kuhn saw time at first after starting the season at the hot corner, and was Ohio's most productive infielder with 5.4 RG for +2 RAA in 187 PA.

After Kuhn shifted across the diamond, junior second baseman Nick Sergakis moved to third. Sergakis did not reprise his strong 2014 campaign with a .250/.320/.330, 4.3 RG, -3 RAA season. Sophomore L Grant Davis plugged in at second, hitting .282/.320/.353, 4.7 RG, -1 RAA over 102 PA. And an early hot streak kept junior Craig Nennig from a third straight dismal offensive performance, but he still only hit .266/.327/.330 for 4.3 RG and -3 RAA.

There was no regular DH, so the only other Buckeye who got more than 50 PA was freshman outfielder Tre' Gantt, who showed promise with speed and a .311/.378/.351, 5.6 RG, +2 RAA performance over 85 PA--he should be a shoo-in for the vacant outfield spot. Freshman catcher/DH/pinch-hitter Jordan McDonough showed some gap power with six doubles in 41 PA (4.7 RG). Sophomore IF/C Jalen Washington and junior OF Jake Brobst served mostly in pinch-running roles, combining for just 37 PA.

OSU's starting pitching was solid, but that's about the strongest praise that can be offered. Sophomore lefty Tanner Tully was slotted as the #1 but not surprisingly took a big step back from his Freshman of the Year campaign with a 5.04 RA (and even more distressing 5.75 eRA) and -3 RAA over 75 innings. His strikeout rate ticked up from 5.1 to 5.3, which tells much of the story. Sophomore Travis Lakins moved into the rotation but was just average (4.78 RA for -1 RAA with a similar eRA in 96 innings), but clearly was the best mound prospect on the team and was draft eligible, signing with Boston after being a sixth round pick. Senior Ryan Riga was OSU's best pitcher and was drafted in the thirteenth round by the White Sox after posting a 3.38 RA for +12 RAA over 97 innings.

Freshman Jacob Niggemeyer got the most mid-week starting assignments with seven, but will need to improve on a 4.09 RA, 5.35 eRA, and 4.1 K/9 performance to be a strong candidate for weekend starts in 2016. Redshirt freshman Adam Niemeyer worked in something of a long relief role, pitching 33 innings over 12 appearances (4 starts) with a 2.16 RA and 3.79 eRA for +9 RAA to rank second to Riga on the team. Junior lefty John Havird (3.58 RA, 4.17 eRA in 27 innings) will also be in that mix.

The Buckeye bullpen struggled immensely, particularly down the stretch and in the Big Ten Tournament. In the last regular season weekend at Indiana, the bullpen failed to hold a 4-3 eighth inning lead in the opener, then surrendered two runs with the game tied in the eighth in the finale. In the Tournament, closer Trace Dempsey yielded a two-out, two-strike homer in the opener against Iowa that turned a 2-1 lead into a 3-2 loss. Scarred by the experience, Beals stuck with his ace Riga in the eighth inning of a 2-2 elimination game against the Hoosiers; they struck for three and ended OSU's season, 5-3.

Dempsey's senior season saw him once again unable to catch his sophomore lightning in a bottle--he was average but not brilliant (4.46 RA, 3.98 eRA, +1 RAA, with 8.4 K/9 against 2.4 W/9, the latter a marked improvement from 4.9 in 2014). The rest of the pen was weakened by a late season injury to junior Jake Post, who was the best Buckeye reliever with a 3.03 RA, 4.05 eRA, +6 RAA over 29 innings. Freshman Seth Kinker pitched well (2.82 RA with similar eRA for +5 RAA, 7.7 K/1.2 W over 22 innings) and marks a continuation of one of the few positives of Beals' style--a fondness for relievers with less than over-the-top deliveries. As the season progressed Kinker took some high leverage work away from redshirt freshman Kyle Michalik, who pitched better than his traditional stats might indicate, albeit in only 19 innings (5.21 RA but 2.92 eRA). Senior Michael Horejsei really only had his left-handedness to offer him as a key reliever, ranking second to Dempsey with 19 appearances but tossing just 14 1/3 innings with a 6.91 RA and 5.57 eRA. Redshirt sophomore Shea Murray came back from an arm injury to throw seven ineffective but exciting innings (7.04 RA, 4.7 W, 14.1 K); his stuff was good enough for Texas to take a flier on him in the 39th round.

As far as the state of the program goes, there's really nothing to be said that I haven't already. Perhaps one could credit Beals for some apparent development by offensive players (Montgomery in particular comes to mind), but his track record in that regard is still quite sketchy. The dreadful baserunning and bizarre infatuation with the runners at the corners, two out double steal show no sign of abating, the late season collapse is fast becoming a staple, and while a 35-20 record doesn't look so bad, Beals' five-year body of work is 159-125 (.560), the program's worst stretch of that length since 1986-1990 (.500), except for the overlapping 2010-2014 period (.543). The same holds true for conference play where Beals is 49-47 (.510). The six-year NCAA Tournament drought is the longest since OSU went eight years without qualifying from 1983-1990. It's time for OSU to once again have a baseball program its football and basketball programs can be proud of, and it seems likely that someone other than Beals would be best positioned to make that happen.

Wednesday, June 17, 2015

Great Moments in Yahoo! Box Scores

I think this means I've won. They've finally just given up.

Monday, May 04, 2015


Continuing the tradition of haphazard “book reviews” appearing on this blog well past the time that such a review would be relevant, I recently read The Sabermetric Revolution by Benjamin Baumer and Andrew Zimbalist and have a few thoughts on the book.

On the whole, I am not a fan of the book. While I am not personally very familiar with Baumer’s work, Zimbalist is a seminal figure in baseball economics (starting over twenty years ago with his Baseball and Billions). Unfortunately, The Sabermetric Revolution is too short (153 pages of prose not counting footnotes) and too unfocused to really showcase the authors’ knowledge.

In many respects it appears that the book was intended to be something of a rejoinder to Moneyball, pointing out areas in which Michael Lewis either played fast and loose with the facts or omitted key details. The preface is clear about this motivation, as the authors write: “This book will attempt to set the record straight on Moneyball and the role of ‘analytics’ in baseball.”

There’s no doubt from reading the book that this is a major goal of the authors, as the first chapter is devoted to “Revisiting Moneyball”. I found some of the criticism to be fair (for example, Lewis’ tendency to gloss over the contribution of young talent the A’s had produced that contributed to the team’s success, such as Eric Chavez, Miguel Tejada, Tim Hudson, Mark Mulder, and Barry Zito). Some, though, strikes me as 20/20 hindsight (such as a review of the infamous 2002 amateur draft) or nit-picking (such as the fact the A’s OBA decreased in 2002 despite Beane’s emphasis on OBA). In other places, I would contend the authors are guilty of some of the same offenses they accuse Lewis of (for example, they state that Lewis gives short shrift to the work of Bill James and other sabermetric pioneers; however, their own discussion of the internet sabermetric community begins at Baseball Prospectus).

The fundamental issue I had with the book is that it is not clear what it is intended to be (aside from a Moneyball response) or who the intended audience is. The book is not detailed enough to serve as a technical introduction to sabermetrics for newcomers (for instance, I’m not sure park factors are ever discussed outside of brief allusions), but neither is it detailed or advanced enough to strongly appeal to the smaller audience of practicing sabermetricians. There is even a chapter on statistical analysis in other sports, a topic on which I am closer to the novice group, but it also is short on details, even more glaring of an omission since at least there is a quick overview of sabermetric theory.

At the cost of myself falling into the trap of nit-picking this book, I think listing a number of my issues with the book might be the easiest way to write it up:

* There are also a number of incorrect acronyms used in the book, some of which were surprising to me. OPS is said to be an acronym for “Offensive Performance Statistic”; DER an acronym for “Defensive Efficiency Rating”.

* The authors state that the formula for Isolated Power weights doubles and triples equally and is roughly the difference between SLG and BA, “or sometimes” is (D + 2T + 3HR)/AB. While I understand the argument for treating doubles and triples equally in a power metric, Isolated Power is not “sometimes” defined as SLG minus BA. That formulation has been used in conjunction with the term “Isolated Power” since Branch Rickey linked the two (but did not set them equal) in his 1954 Life magazine article and it was used in that manner by Bill James. While this or the meaning of the OPS acronym may seem like insignificant details, they suggest something less than a full command of sabermetric history.
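To make the point concrete, the two expressions are the same quantity, since subtracting hits from total bases leaves exactly D + 2T + 3HR. A quick Python check follows; the stat line is hypothetical and the function names are mine, invented purely for illustration.

```python
from fractions import Fraction


def batting_average(ab, h):
    """Hits per at bat, kept as an exact fraction."""
    return Fraction(h, ab)


def slugging(ab, h, doubles, triples, hr):
    """Total bases per at bat, kept as an exact fraction."""
    singles = h - doubles - triples - hr
    total_bases = singles + 2 * doubles + 3 * triples + 4 * hr
    return Fraction(total_bases, ab)


def isolated_power(ab, doubles, triples, hr):
    """Extra bases per at bat: (D + 2T + 3HR)/AB."""
    return Fraction(doubles + 2 * triples + 3 * hr, ab)


# Hypothetical stat line, not taken from any real player.
line = dict(ab=500, h=150, doubles=30, triples=5, hr=20)

slg_minus_ba = slugging(**line) - batting_average(line["ab"], line["h"])
iso = isolated_power(line["ab"], line["doubles"], line["triples"], line["hr"])

print(float(slg_minus_ba), float(iso))
```

Exact fractions are used here only so the equality holds without floating-point noise.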

* The authors state that in economic terms, WAR measures “marginal physical product” and state that this is a good idea, but are not fans of the methodology used to calculate current WAR implementations. Their concerns include fair ones, such as failure to report error bars and the use of black box methodologies. But while their reasoning behind these criticisms is clearly laid out, they sometimes engage in what might be called “drive-by” criticisms, in which issues are alluded to but not fully fleshed out to the point where the creators and users of these metrics could offer a defense. In this manner, Baumer and Zimbalist reflect the attitude of another “insider” who has criticized replacement-level metrics, Christopher Long.

One such comment is “It is not clear that there exists a pool of replacement players with the productivity that is ascribed to them”. This basically questions the entire concept of replacement level, but is not supported other than with a footnote to cite the work of JC Bradbury. This does nothing to forward the discussion of replacement-level, nor does it alert the readers to the well-reasoned and spirited rejoinders sabermetricians have issued to Bradbury’s contentions.

The authors then use a single example to question what is one of the least controversial and most similar steps in any WAR methodology--the run to win conversion. The authors simply write: “The use of James’ Pythagorean Expectation to convert runs to wins is less than robust. One need only reflect on the 2012 Baltimore Orioles, who outperformed their expected win total by 11 games, to see how inaccurate the runs to wins conversion can be.”

If I may be impolitic and a bit unhinged for a moment, the authors should be ashamed of themselves for this statement. It is the type of statistically illiterate cherry-picking that one might expect from a Bill Madden rather than from respected professionals familiar with statistical methods. While it is without question true that win estimators (like every other statistical estimator known to man) produce poor estimates in certain individual cases, a reasoned discussion of their error bars does not begin and end with a single poor estimate. Any regression equation presented by Zimbalist in Baseball and Billions or in this work could be easily impugned by similar rhetoric, and likely more effectively given that win estimators are among the more accurate and stable estimates one will find in baseball analytics.

It might also be pointed out that the run to win converters actually used in WAR calculations are likely more robust (in the true meaning of the term, rather than denoting a single outlier) than Pythagorean by recognizing that the shape of the relationship between runs and wins changes as the scoring environment changes. While the authors are surely aware of this, one could never tell from the discussion of run/win estimators in the book, as only Pythagorean constructs with fixed exponents are discussed, with no reference to alternative exponent constructions like Pythagenport/pat or dynamic linear run to win estimators.
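As a rough illustration of the difference between a fixed exponent and one that moves with the scoring environment, here is a minimal Python sketch. The two teams are hypothetical, the function names are mine, and the 0.29 constant is an assumed value for a Pythagenpat-style exponent rather than a figure taken from the book or from any particular WAR implementation.

```python
def pythagorean(runs, runs_allowed, exponent=2.0):
    """Fixed-exponent Pythagorean W% estimate."""
    return runs ** exponent / (runs ** exponent + runs_allowed ** exponent)


def pythagenpat(runs_per_game, runs_allowed_per_game, z=0.29):
    """Pythagorean estimate whose exponent is (R/G + RA/G) ** z.

    z = 0.29 is an assumed value; published versions use constants near it.
    """
    exponent = (runs_per_game + runs_allowed_per_game) ** z
    return pythagorean(runs_per_game, runs_allowed_per_game, exponent)


# Two hypothetical teams with the same 1.1 run ratio in different environments.
low = (3.3, 3.0)
high = (6.6, 6.0)

fixed_low, fixed_high = pythagorean(*low), pythagorean(*high)
dynamic_low, dynamic_high = pythagenpat(*low), pythagenpat(*high)

print(f"fixed exponent: {fixed_low:.3f} vs {fixed_high:.3f}")
print(f"Pythagenpat:    {dynamic_low:.3f} vs {dynamic_high:.3f}")
```

With a fixed exponent the two teams get identical estimates because only the run ratio matters; with the dynamic exponent the estimates differ, which is the sense in which the shape of the run/win relationship changes with scoring.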

* My sense, and it may be unfair, from reading the book, is that Baumer and Zimbalist are eager to emphasize areas and issues in which sabermetric findings have been wrong and/or incomplete. An example is the discussion on sacrifice bunts, which points out that the initial sabermetric analysis (they do not reference Palmer and Thorn by name in this section, but The Hidden Game of Baseball is the usual source of the classical argument) was incomplete in not considering the other outcomes that may occur on a sacrifice bunt attempt, such as bunt hits and errors.

This is without question a valid criticism. However, neither Baumer/Zimbalist nor other present day critics of the conclusion acknowledge that the conventional wisdom that was pushed back against was not that the bunt was a good play because of those outcomes, but that the sacrifice if successfully executed was a good play. I still find myself as one of the few patrons clapping when I attend a game and the team for which I am rooting successfully records the out at first base on a sacrifice. This play was seen, and still is seen by casual fans and presumably a non-negligible portion of major league managers, as a success for the offense, even without the benefit of the error or hit that make the play a palatable strategy in certain situations. Sabermetricians have moved to a more “nuanced understanding” of the sacrifice, but they have also forced the conventional wisdom to tack on a bunch of addendums and hypotheticals that had rarely been discussed before.

* In other cases it is unclear how deep of a literature review of the field the authors have performed. For instance, the authors criticize FIP due to using an ERA scale (a criticism with which I agree but also note can be relatively easily corrected) but state that “What this field needs is a simple, illustrative, but effective model to evaluate pitchers. Until a model can be constructed with interpretable coefficients (a la linear weights), or with meaningful interaction of terms (a la Runs Created), no real insight will be gained, and there is unlikely to be any consensus about which metric is best.”

In all, The Sabermetric Revolution is a book that I think might have been better conceived as a couple of separate journal articles on the topics on which Baumer and Zimbalist have something new to say, because the rest of the book feels like filler and does not establish a consistent purpose or tone.

Wednesday, April 15, 2015

Reinventing the Wheel, Now With Win Estimators!

It is in my nature to snark about bad baseball analysis. Maybe more of it is nurture, as much of my early sabermetric reading was the younger Bill James, with later exposure to early BP and other derivatives, where snark was an integral part of the culture.

That is not really intended to be an excuse, although it may well read that way. As I have grown older I believe that I have generally become more aware of how little I actually know and, more consequentially for snarking, less interested in engaging. I have lost almost any desire I ever had to evangelize about sabermetrics to the “unwashed masses” (now there’s a snarky, loaded term). Instead I am content to write to my very small audience, which even so is almost entirely based on what I want to write rather than what I think anyone might want to read, and take passive-aggressive potshots on Twitter. This probably still tilts me more towards the jackass side of the scale than the average sabermetrician, but so be it.

Every once in a while, though, I run across something that irks me so much that I have to respond to it in full. Against my better judgment, I feel compelled to draft a polemic in response, even though I know there’s nothing good that can possibly come of it. That is the case with an article that appeared in the Fall 2014 issue of SABR’s The Baseball Research Journal entitled “A New Formula to Predict a Team’s Winning Percentage” and written by Stanley Rothman, Ph.D.

Historically, the quality of sabermetric articles in the BRJ has been a mixed bag. Early BRJ editions included seminal research by pioneers of the field like Pete Palmer and Dick Cramer. Eventually the quality of such articles significantly dropped off, and BRJ was a leading purveyor of the rehashing of bases/X metrics that I rail against, and other equally banal statistical pieces with notable but rare exceptions. (That is particularly amusing since in the heyday of BRJ as a place where sabermetric research was published, Barry Codell introduced Base-Out Percentage, one of only a few times that metric could legitimately have been said to have been “invented”).

In recent years, the quality of the statistical pieces in BRJ has been significantly improved, so I hope that my mockery of this particular piece is not taken as an indictment of the entirety of the body of work the editors (now Cecilia Tan) have been doing on this front. In fact, the Fall 2014 issue features a couple of sabermetric pieces I enjoyed greatly, both based on Log5 and other predictors of head-to-head matchups (John A. Richards’ piece “Probabilities of Victory in Head-to-Head Matchups” covered the theoretical basis for Log5 and a comparison of Log5 estimates to empirical results, and Matt Haechrel did likewise for individual batter-pitcher matchups in “Matchup Probabilities in Major League Baseball”).

Dr. Rothman’s piece is an unfortunate exception. And since I consider myself (perhaps incorrectly so) to be something of a subject-matter expert in winning percentage estimators, I feel compelled to point out areas in which Rothman’s findings bury obvious, well-established principles in a barrage of linear regressions.

Rothman opens his paper by discussing Bill James’ ubiquitous and groundbreaking Pythagorean method, and then asks “Why not just use the quantity (RS-RA) to calculate EXP(W%)”? Why not indeed? This question is never satisfactorily answered in the paper. Nor is it even addressed henceforth.

Rothman proceeds to set up a W% estimator that he christens the Linear Formula as:

EXP(W%) = m*(RS-RA) + b

Note that Rothman’s terms RS and RA are just that--runs scored and runs allowed by a team. Not per game, per inning, or on any other sensible rate basis--raw, unadulterated seasonal totals.

Next, he provides the standard equations for m and b, and makes some simplifying assumptions. His regressions are run separately for each MLB season, so each team’s number of games is 162 (obviously there are some limited and non-material exceptions) and there are 30 observations in each regression (Rothman uses 1998-2012 data in his analysis). After these substitutions, the intercept b is equal to .5 and the slope m is:

m = SUM[(RS - RA)*W%]/SUM[(RS - RA)^2]
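The simplification can be checked numerically. The sketch below is a minimal illustration, not Rothman's code: the four-team league and the helper name `ols_fit` are hypothetical, chosen so that run differentials sum to zero and W% averages .500, as they do across a full league. Under those two conditions the textbook least-squares fit reduces to an intercept of .5 and the shortcut slope above.

```python
def ols_fit(xs, ys):
    """Ordinary least squares fit of y = m*x + b using the textbook formulas."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b

# Hypothetical league: run differentials sum to zero, W% averages .500.
run_diff = [120, 40, -60, -100]
win_pct = [0.580, 0.530, 0.460, 0.430]

m_full, b_full = ols_fit(run_diff, win_pct)

# Shortcut slope from the text: SUM[(RS - RA)*W%] / SUM[(RS - RA)^2]
m_short = (sum(rd * w for rd, w in zip(run_diff, win_pct))
           / sum(rd ** 2 for rd in run_diff))

print(m_full, b_full, m_short)
```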

Rothman notes that for major league seasons viewed in aggregate, there is a strong correlation between SUM[(RS - RA)*W%] and SUM[(RS - RA)^2], and so he develops a formula to predict the latter from the former:

EXP{SUM[(RS - RA)^2]} = 1464.4*SUM[(RS - RA)*W%] + 32710

This is substituted into the regression formula for expected W%, with the intercept of this second regression (32710) dropped since it has little impact, to get the following equation:

EXP(W%) = SUM[(RS - RA)*W%]/{1464.4*SUM[(RS - RA)*W%]}*(RS - RA) + .5

= .000683*(RS - RA) + .5

This is the final formula that Rothman refers to as the Linear Formula. At this point, I will offer a few of my own comments:

1) There is nothing novel about presenting a W% estimator based on some relationship between run differential and W%. The rule of thumb that ten runs equals one win is just that. One of the earliest published W% estimators, from Arnold Soolman, was based on a regression that used RS/G and RA/G as separate variables but could have just as easily used the difference (and the insignificant difference in regression coefficients for the terms back that up).

2) The author’s choice to express this equation on a team-seasonal basis is, frankly, bizarre. It results in the formula being much less easy to apply to anything other than team seasonal totals, and it obscures the nature of the relationship between runs and wins, hiding the fact that this is little different than assuming ten runs per win. If you divide 1464.4 by 162 games/season, you find that the formula implies 9.04 runs per win and would be more conveniently expressed as .1106*(RS - RA)/G + .5.

3) I don’t understand the rationale for using a separate equation for each league-season, then developing a single slope by running another regression of various league quantities. It would be much more straightforward to combine all teams from the data set together and run a regression. Such an approach would also result in a higher R^2 for the team W% estimates. I don’t think that maximizing R^2 should be paramount in constructing a W% estimator, but in this case I fail to see the advantage of not studying the relationship between runs and wins directly at the team level rather than aggregating team-level regressions across multiple seasons.
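The per-game conversion in comment 2 is simple arithmetic. A short sketch, assuming a 162-game schedule; the function names and the 750/650 example team are illustrative and do not come from Rothman's paper:

```python
SEASON_GAMES = 162
REGRESSION_COEFF = 1464.4  # coefficient from Rothman's league-level regression

slope_per_season_run = 1 / REGRESSION_COEFF      # about .000683
runs_per_win = REGRESSION_COEFF / SEASON_GAMES   # about 9.04
slope_per_game = 1 / runs_per_win                # about .1106

def linear_formula(rs, ra):
    """Linear Formula applied to seasonal run totals."""
    return slope_per_season_run * (rs - ra) + 0.5

def linear_formula_per_game(rs, ra, games):
    """The same estimate expressed on a per-game basis."""
    return slope_per_game * (rs - ra) / games + 0.5

# Hypothetical team: 750 runs scored, 650 allowed over 162 games.
print(linear_formula(750, 650))
print(linear_formula_per_game(750, 650, SEASON_GAMES))
```

Over a 162-game season the two functions return the same estimate; the per-game form simply keeps working when the number of games is something other than 162.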

Returning to the article, Rothman uses a Chi-Square test on 2013 data to compare the Linear Formula to Pythagorean. Setting aside the silliness of using thirty data points for an accuracy test when hundreds are available, I must give Rothman credit for not using the Linear Formula’s better test statistic to trumpet its superiority--instead he writes that “there is no reason to believe that both of these formulas cannot be used.”

The article then includes a digression on applying this approach to the NBA and NFL. The conclusion and “additional points” sections of the article provide a handful of interesting contentions:

* Rothman suggests that one of the chief advantages of the Linear Formula is that it is “easier for a general manager to understand and use”. The premise is that GMs can use the Linear Formula to calculate the marginal wins from player transactions.

While there is certainly nothing wrong with this type of back-of-the-envelope estimate, this comment would have been less bizarre twenty years ago. Now it seems incredibly naïve to suggest that the majority of major league front offices could improve their planning by using a dumbed down win estimator. It’s hard to determine which is sillier--the notion that front offices that would entertain such analysis would not be using more advanced models (the outcome suggested by which would depend much more on the projection of player performance than how that performance is translated into wins), or the notion that front offices who were so inclined and needed to do back-of-the-envelope calculations would not be able to grasp Pythagorean.

* Apparently referring to the approximation used to derive the multi-year version of the formula above, Rothman asks “Why is there a strong positive correlation between SUM[(RS - RA)^2] and SUM[W%*(RS - RA)] in MLB?”

I might be accused of under-thinking this, but my response is “Why wouldn’t there be?” The key quantity in each sum is run differential. We know that run differential is positively correlated with W% (if it were not, this article would never have been written), so it should follow that the square of run differential (or the square root, the cube, the logarithm, any defined function) should have some relationship to the winning percentage times the run differential. And since the quantities Rothman is comparing are sums on the league level, both should increase as the differences between teams increase (i.e. if all teams were .500 and had zero run differentials, both quantities would be zero. As teams move away from the mean, both quantities increase).
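To make that concrete, here is a minimal sketch under a deliberately idealized assumption: every team's W% falls exactly on the line W% = m*(RS - RA) + .5. The league values and the helper `league_sums` are hypothetical. Because run differentials sum to zero across a league, SUM[W%*(RS - RA)] then collapses to m*SUM[(RS - RA)^2], so the two sums move in lockstep however spread out the league is.

```python
M = 1 / 1464.4  # slope of the Linear Formula

def league_sums(run_diffs):
    """Return (SUM[(RS-RA)^2], SUM[W%*(RS-RA)]) when W% is exactly linear in RD."""
    win_pcts = [M * rd + 0.5 for rd in run_diffs]
    sum_sq = sum(rd ** 2 for rd in run_diffs)
    sum_cross = sum(w * rd for w, rd in zip(win_pcts, run_diffs))
    return sum_sq, sum_cross

base = [150, 50, -80, -120]  # hypothetical run differentials, summing to zero

# Stretch or compress the league and watch both sums scale together.
for spread in (0.5, 1.0, 2.0):
    diffs = [spread * rd for rd in base]
    sum_sq, sum_cross = league_sums(diffs)
    print(spread, sum_sq, sum_cross, sum_sq / sum_cross)
```

In this idealized league the ratio of the two sums is always 1/m, whatever the spread. Real teams scatter around the line, which is why the actual correlation is strong rather than perfect.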

* Rothman notes that if a team’s run differential is greater than 732, then the linear formula will produce an estimated W% in excess of 1.00. “However, this is not a problem because for the years 1998-2012 the maximum value for (RS - RA) is 300.”

Note that Rothman does not discuss the opposite problem, which is that a run differential below -732 will produce an equally implausible negative W%. But the hand-waving away of this as a potential issue coupled with the posed but unaddressed question “Why not just use the quantity (RS-RA) to calculate EXP(W%)?” is why this article got under my skin.
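The break points are easy to locate. A quick sketch (the function and variable names are illustrative) that solves the Linear Formula for the run differentials at which the estimate reaches 1.000 and .000:

```python
SLOPE = 0.000683   # Linear Formula slope per seasonal run
INTERCEPT = 0.5

def linear_formula(run_diff):
    """Linear Formula estimate for a seasonal run differential."""
    return SLOPE * run_diff + INTERCEPT

upper_break = (1 - INTERCEPT) / SLOPE   # estimate exceeds 1.000 above this
lower_break = (0 - INTERCEPT) / SLOPE   # estimate drops below .000 under this

print(round(upper_break), round(lower_break))     # 732 -732
print(linear_formula(300), linear_formula(-300))  # extremes of the observed range
```

Within a range of plus or minus 300 runs the estimates stay between roughly .295 and .705, so the unbounded behavior only appears when the formula is extrapolated well beyond normal team seasons.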

If Dr. Rothman has taken five seconds to consider the advantages and disadvantages of how to construct a W% estimator, scant evidence of it has manifested itself in his paper (and given that this is a commentary on the paper and not Dr. Rothman himself or whatever unpublished consideration he gave to these matters, that is all I have to go on). There is certainly nothing wrong with experimenting with different estimators, but these experiments should not rise to the level of publication in a printed research journal unless they yield new insight in some way. Nothing in Rothman’s piece did--in fact, given the bizarre manner in which he chose to express the equation, I would suggest that if anything the piece regresses the field’s knowledge on W% estimators.

So allow me the liberty of answering Rothman’s question and the hand-waved problem for him.

Q: Why not use run differential to estimate W%?

A: Because doing so, at least through the simple linear regression approach, does not bound W% between zero and one, does not recognize that the marginal value of runs is variable, and does not recognize that the value of a run is dependent on the scoring environment.

Other than that, it’s great!

“Why not?” is a great reason to experiment, but it’s not a great reason to formally propose a new method (well, really, recycle existing methods, but I’m piling on as it is). There is also nothing wrong with using a model with certain deficiencies that other models avoid, whether due to computation restrictions, ease of use, a lack of deleterious effect for the task at hand, etc. But it should be incumbent on the analyst and the publisher to acknowledge them.

Finally, anyone publishing sabermetric research in this day and age should recognize that whatever new approach you believe you have developed for a common problem (like win estimation, or measuring offensive performance), it’s probably not new at all. This is certainly the case here given the work of Soolman, the rule of thumb that ten runs equals one win, the dynamic runs per win formula used in The Hidden Game of Baseball and Total Baseball by Pete Palmer, and other related approaches. All of these are based on the basic construct W% = m*run differential + b.

Personal anecdote: I don’t remember when this was exactly, maybe when I was in the eighth grade, but in our math class we were learning about linear equations of the form y = mx + b and there was an example in the textbook that showed how one could eyeball a line through a scatterplot and develop the equation for that line. In other words, a manual, poor man’s linear regression.

So I did just that with a few years of team data, plotting run differential per game against W% (I want to say I used 1972-74 data), and came up with W% = .1067*RD + .5. Foolishly, I actually used this for W% estimates for a period of time. Thankfully, I was cognizant that it was not a new approach but rather just a specific implementation of one developed by others, and I did not attempt to/no one permitted me to publish it as if it was. Years later, W% = .1106*RD + .5 appeared in the pages of the Baseball Research Journal.

So that this post might have some smidgeon of lasting value, I will close by reiterating the three conditions of an ideal win estimator that such linear constructs fail to satisfy. I have written plenty about win estimators in the past (and will doubtlessly rehash much of it again in the future), but I don’t believe I’ve explicitly singled out those properties. An ideal W% estimator would satisfy all three, which is not to say there is no use for an estimator that satisfies only two or even zero. The Linear Formula satisfies none. I will discuss how three of the common approaches perform: Pythagorean (with fixed exponent), Pythagenpat, and Palmer (RPW = 10*sqrt(runs per inning by both teams)). Palmer can serve as a stand-in for any method that allows RPW to vary as the scoring level varies, and of course there are other constructs that I am not discussing.

1. The estimate should fall in the range [0,1]

The reason for this is self-explanatory. Pythagorean and Pythagenpat pass, while Palmer does not. Obviously this is not really an issue when you apply the method to normal major league teams. It can become an issue when extrapolating to individual/extreme performances, though.

2. The formula should recognize that the marginal value of runs is variable.

This is somewhat related to #1--the construct of Pythagorean results in it passing both tests. However, there are other constructs that are bounded but fail here. Palmer fails here, which is inevitable for a linear formula. The gist here is that each additional run scored is less valuable in terms of buying wins and each additional run prevented is more valuable. This is also the hardest to articulate and the hardest to prove if one has not bought into a Pythagorean-based approach (or examined other W% models such as those based on run distributions).

3. The formula should recognize that as more runs are scored, the number of marginal runs needed to earn a win increases.

This could be confused with #2, but #2 is true regardless of the scoring level in question--it's true in 1930 and in 1968. In this case, the relationship between runs and wins changes as the run environment changes. This is where a fixed exponent Pythagorean approach falls short, while both Pythagenpat and Palmer take this into account.
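The three approaches are short enough to write out. The following is a sketch under stated assumptions rather than a definitive implementation: inputs are runs per game, the fixed Pythagorean exponent is set to 2, the Pythagenpat exponent is taken as (runs per game by both teams)^z with z = 0.29 (a value not given in this post), and Palmer's runs per inning assume nine-inning games. It checks condition #1 on an extreme team and shows one symptom of the fixed-exponent limitation in #3: a fixed exponent depends only on the ratio of runs scored to runs allowed, so it cannot distinguish the same ratio in different scoring environments.

```python
import math

def pythagorean(rs_pg, ra_pg, exponent=2.0):
    """Fixed-exponent Pythagorean (exponent 2 is an assumed choice)."""
    return rs_pg ** exponent / (rs_pg ** exponent + ra_pg ** exponent)

def pythagenpat(rs_pg, ra_pg, z=0.29):
    """Pythagorean whose exponent grows with scoring; z = 0.29 is assumed."""
    exponent = (rs_pg + ra_pg) ** z
    return rs_pg ** exponent / (rs_pg ** exponent + ra_pg ** exponent)

def palmer(rs_pg, ra_pg, innings_per_game=9):
    """Linear estimate with RPW = 10*sqrt(runs per inning by both teams)."""
    rpw = 10 * math.sqrt((rs_pg + ra_pg) / innings_per_game)
    return 0.5 + (rs_pg - ra_pg) / rpw

estimators = (pythagorean, pythagenpat, palmer)

# Condition 1: an absurdly dominant team, 12 runs scored and 1 allowed per game.
for estimator in estimators:
    print(estimator.__name__, estimator(12, 1))

# Condition 3: the same 1.2 run ratio in a low and a high scoring environment.
for estimator in estimators:
    print(estimator.__name__, estimator(3.6, 3.0), estimator(6.0, 5.0))
```

On the extreme team, both Pythagorean variants stay below 1 while Palmer does not. On the second check, the fixed-exponent estimate is identical in both environments, while Pythagenpat and Palmer return different values for the two.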