Monday, June 20, 2016

Cleveland

I have a love-hate relationship with Cleveland.

I was born and raised in the exurbs. I left for college and eventually came back to the suburbs, but only because I had to, not because that’s where I wanted to live. When it comes to sports, I always rooted for the Cleveland teams, but in all honesty I have never been a diehard Cavs fan; that description is much more apt for how I follow the Indians and the Browns. Even in those cases it might not fully apply, since I pride myself on striving to be too rational about baseball to fall into the sheer emotion of fandom (post-childhood), and the Browns have been too bad for too long to not laugh at rather than lament the losses. My sheer sports fan emotions (assigning good v. evil to every game and opponent, living and dying with the team) were transferred to my future alma mater around adolescence and will never be directed elsewhere.

Yet there's no question that my baseball team is the Indians, my football team is the Browns, and my basketball team is the Cavaliers. At times this has been embarrassing. Not due to the failure to win championships, but more due to the Cleveland fan culture. Cleveland fans have taken pride in their victimhood, with the heartbreaks (sometimes more real than imagined) a perverse source of pride. Where else does a 35 year distant divisional playoff game (i.e. two games removed from the championship) have a name that every young Browns fan learns ("Red Right 88")? Cleveland had some bad breaks, but more often than not they just had bad teams. Bad management, a little bad luck, and on the rare occasions when the teams had a chance to win it all, the dice rolls were not kind. But the only way one can reasonably expect to win championships is to put multiple championship-caliber teams on the field and let the chips fall where they may. Cleveland's three largely failed in that regard.

Of course, franchise ineptitude is largely not the fault of the fanbase, but there's an important distinction to be made between losing because you're bad and losing because fate didn't look kindly upon you on a given day. Cleveland fans too often conflated the two, resulting in a fatalistic feedback loop that took the former as evidence of the latter.

The other maddening element of the sports culture is the unique grip that the Browns have on the city. For all of the elation that the Cavs victory has brought, it will pale in comparison to the day the Browns win or even make a Super Bowl. The Browns still rule the landscape, and benefit from a remarkable double standard. The Indians, a franchise that has achieved more in any one of nine seasons in the last twenty-five than the Browns have in any, are struggling with attendance. There is an overwhelming cynicism towards the Indians, rooted in a lack of understanding of the economics and nature of baseball. Every trade of a free-agent-to-be furthers the downward spiral of the relationship between the city and the team, even as those trades bring back the future objects of lament (e.g. as Bartolo Colon becomes Cliff Lee becomes Carlos Carrasco). This is not to absolve the Indians of their very real failures in drafting/international signings that only now appear to be reversing, but the Indians have run rings around the Browns and yet it is the former that I will be pleasantly surprised to see take the field in Cleveland rather than Montreal or Portland or San Antonio in, say, 2030.

Cleveland fans have also had the opportunity to root for a winner, but many have passed it up, and I cannot feel too sorry for them. In the last twenty years, OSU has won two national titles in football and been to three Final Fours in basketball. A college team might always belong to the students and alumni most dearly, but surely the flagship state university is as much an available rooting interest as a private entity that can be moved to Baltimore at the owner's whim.

Sports irrationality aside, one thing I will say for Cleveland and Northeast Ohio is that there is a real pride in their hometown among people here. I'm not well-traveled enough to declare that this feeling is unique, but I can contrast it to my other hometown, Columbus. People in Columbus don't generally exhibit the same pride in their city that Clevelanders do in theirs. Columbus residents might be proud of OSU or proud of Ohio, but they aren't as proud of Columbus per se. Some of this may be due to sports teams; minus the recent (and so far unsuccessful) addition of the Blue Jackets, Columbus' sporting identity ties to OSU and thus more to the state than the city.

This is why it is so appropriate that Cleveland's title drought was ended about as single-handedly as one could be by one man, LeBron James. LeBron was a native son, and that meant something here. Everyone probably feels some sort of connection to LeBron, however tenuous or forced. Mine is that LeBron and I are the same age. LeBron was a first-name basis celebrity by the time we were freshmen in high school. The night of the 2003 NBA Draft Lottery, I was in a cabin at a state park in Southeast Ohio on our mandatory "senior trip" watching as the Cavs came up with the #1 pick (and a parent chaperone insisted that they had to draft Carmelo Anthony).

LeBron was asked to shoulder the burden of the city himself, an unfair ask for anyone but especially for a rookie. And when he dragged a ragtag bunch to the team's first ever Finals appearance, he simultaneously hurt both his "legacy" in the ridiculous media environment when the Cavs were swept and obliterated any remaining thought of patiently building a worthy team around him. For the next three seasons the Cavs chased in vain, leaving him with the impossible choice of staying to try to drag this motley crew to the promised land, or going to chase titles with a group of stars.

Of course, the way “the Decision” went down was an extra gut punch, but while many Cleveland fans condemned LeBron, I’d like to think that I was fairly level-headed (this comment on The Book Blog is as intemperate towards LeBron as I got, and I still didn't lose sight of who the real villain in sports is):

The real villain in this whole thing, IMO, is ESPN. They cannot try to pass themselves off as a news outlet of any sort when they are willing to whore themselves out for an hour as a player’s personal press corps.

It has been apparent for years that ESPN wants to be part of the stories it reports on, but it has never been more plain to see than it was last night.

On another note, I do not support the childishness of the Cleveland fans, but it is worth noting where they are coming from. Cleveland has not won a major sports title since 1964 despite fielding three teams (at least since 1970)--and that one isn’t even celebrated by anyone outside of Cleveland because of the NFL’s whitewash of its pre-Super Bowl history.

Yet here, nearly miraculously, was a local player who just so happened to be the best basketball prospect in anyone’s memory. He was not called the Chosen One for no reason. He was the one who was destined to finally break through the wall, to give Cleveland its championship. Twice the team has looked like the NBA’s best team in the regular season, only to fold in the playoffs.

I’m not saying that those expectations and hopes were fair, that they all should have been thrown on LeBron’s shoulder. They plainly weren’t. Still, I may be a little biased, but I think that all things considered, this has to be one of the biggest kicks in the gut that an athlete leaving a team as a free agent has ever delivered. That doesn’t excuse Dan Gilbert or Cavs fans, but this is not just ARod leaving the Mariners.


The truth of the matter is that I was still a LeBron fan. The most endearing quality of LeBron in a sports sense to me was his support of OSU. He was on that bandwagon prior to the 2002 national championship (an article on his senior season of high school included an account of his enthusiastic celebration of this victory with his teammates), and he remained a friend of the program even after going to Miami, which he didn’t have to do. Sure, OSU is an important college asset of Nike, but Nike has contracts with a million other colleges, there was no need to keep up appearances.

It seemed like a crazy pipe dream in the spring of 2014, but was realistic by summer, and then remarkably came true. LeBron was coming back, to try to lead a new supporting cast, rebuilt largely through the fruits of the lottery picks that never would have come had he stayed (Kyrie Irving, Tristan Thompson, Dion Waiters, Kevin Love via trade). For all of the fury that surrounded The Decision, had LeBron’s goal all along been to win a title in Cleveland, he couldn’t have done any better.

However, nothing is assured, and this being Cleveland, the natural insecurities were ratcheted up a few levels. Had the moment for LeBron passed? Was he just far enough past his prime that he could not deliver? Was the supporting cast good enough (or in the case of Irving, healthy enough) to provide him support? Unexpectedly a new question emerged--would the suddenly dominant Warriors serve as an insurmountable foil?

Hopefully the answers to those questions will reduce some of the irrationality of Cleveland sports observers (of course, the probability of the Cavs winning when down 3-1 was greater than the probability of a collective of sports fans becoming more rational). Whatever small change to the city’s sports mindset might result, Cleveland’s overall inferiority complex is not going to change. On the very morning after the Cavs won the world championship, a new banner appeared on the side of a building bragging that the first traffic light was installed in Cleveland in 1914. This will certainly make the political hacks in town next month to do political hack-y things real impressed. Cleveland is a weird place.

And yet I can watch a ridiculously hokey, wildly overproduced local news commercial from 1995 and find it tugging at my inner childhood Indians partisan in a manner I can’t rationally describe. "Give me a reason for believing in Cleveland." Cleveland can surprise you.

Wednesday, May 25, 2016

Great Moments in Yahoo! PBP

Today Yahoo! unveiled a completely new design for their MLB page. Alas, the PBP now reads backwards and is as ill-equipped to deal with unusual plays as ever:

Wednesday, May 18, 2016

The Only Rule Is It Has to Work

Note: The following is a rare (for this blog) timely book review.

The premise of The Only Rule Is It Has to Work is that respected sabermetrically-inclined authors and podcasters Ben Lindbergh (Baseball Prospectus, Grantland, FiveThirtyEight) and Sam Miller (Orange County Register, Baseball Prospectus) were given the opportunity to act as the baseball operations department throughout the 2015 campaign of the Sonoma Stompers, member of the four-team Pacific Association, a low-level indy circuit in northern California. Lindbergh and Miller were granted wide latitude to put their mark on player acquisition, roster construction, and in-game strategy, and also to attempt to bring modern data collection tools (PITCHf/x, video scouting, etc.) to the bush leagues.

Lindbergh and Miller are embedded deep within the team--in the front office, the clubhouse, the dugout, and even (for a moment at least) kangaroo court. Thus the book serves as one of the most revealing examinations of daily life in baseball from an outsider's perspective. Most books that have provided similar access to the inner workings of a team have been written by insiders, even if they might not fully fit into the world in which they have spent many years (think your Jim Boutons). While the life of an indy-league player is certainly less lavish than that of a big leaguer and perhaps less structured than that of an affiliated minor leaguer, it's hard to imagine that the basic human impulses of (largely) twentysomething, athletically-gifted ballplayers vary much between Sonoma and San Jose, San Jose and San Francisco. The authors are able to observe the scene with some combination of bemusement, paternal-ish concern, and comradery to give the audience a different perspective on the people who play the game. Certainly the majority of the audience members can better relate to the authors' stations in life and can now imagine how they might fit in (or not) if thrust into the life of a ballclub.

While it should hardly be necessary at this point for sabermetricians to defend themselves against scurrilous charges of not watching the games, one thing that the authors don’t reflect too closely upon but that is obvious to the reader is just how much low-level baseball they watch over the course of the summer, and just how devoted to their cause they are. Granted, Lindbergh and Miller are aided by a small network of volunteer scouts that earns the derisive nickname "The Corduroy Crew" around the league, but one or the other personally does advance scouting of nearly every game the Stompers' opponents play. This in addition to the hours spent researching potential players with their proverbial noses buried in a spreadsheet. While it would be wrong to hold Lindbergh and Miller's labors (which of course were performed with at least the secondary intent of providing fodder for a book) up as a pure representation of baseball love to be extrapolated to all of their sabermetric compatriots, it would be less wrong to do so than to brandish the common stereotype.

One of the disappointments of the project is that many of the radical ideas the authors dreamed about being able to test are never put into play. While shifts and flexible usage of the relief ace take hold in the second half of Sonoma's season, batting orders largely remain tethered to convention, starting pitchers still generally work in rotation, and the manager holds on to ultimate in-game command. While this may be disappointing to the reader longing for sabermetric red meat, the implications raise questions worth considering. Is it necessary for change in baseball tactics to come one easily digestible piece at a time? Why can a grizzled bench veteran and former pennant-winning manager of a major league team (Clint Hurdle) pivot to the approach his superiors desire with more aplomb than a 37-year-old pot-smoking player-manager who goes by Feh and dabbles in 9/11 conspiracy theories? Do the high stakes of the majors actually make them a more suitable laboratory for experimentation, as players and managers can count on their million dollar checks regardless of whether they may look unconventional on the field? While these questions can't be answered by the book, it provides some entertaining anecdotal evidence to consider.

Along the way, the Stompers inadvertently break ground in the social realm of baseball as well, as one of the authors' hand-picked college signees, relief ace Sean Conroy, comes out as the first openly gay player in professional baseball. The authors do an excellent job of relating this part of the story without falling into self-congratulations or allowing it to swamp the baseball portion of the narrative. Lesser authors with a less interesting baseball story to tell (and perhaps less respect for their subject) could have easily allowed Conroy's story (which includes being one of the Pacific Association's most valuable pitchers) to crowd out other aspects of the Stompers' season in the narrative, and could hardly have been blamed for it.

The authors alternate chapters, and if you are a regular listener (as I am) of their Effectively Wild podcast, you will likely be able to pick out which voice you are reading after a couple of pages even if you forget for a moment whether it is an odd or even chapter. Lindbergh's earnest verbosity and Miller's cheerful nihilism carry through to the written page in book format yet complement each other well, imbuing a diversity of style to the writing while still making you feel as if you are reading the same book.

As luck (or the residue of design) might have it, the story has a dramatic conclusion that I will not spoil here, except to say that I'm very glad the majors have resisted the allure of the half-season format, except for every ninety years when unusual circumstances take hold (if I live to see baseball in 2071 I promise to be grateful and not complain about it too much). Were it ever turned into a movie, the scriptwriter would even have something of a "pick your own adventure" opportunity to affect the outcome with only the proverbial flap of a butterfly's wing.

And maybe that's one of the lasting lessons to take away from The Only Rule Is It Has to Work. That despite the careful planning, the on-the-fly adjustments due to injuries or player poaching (at this level), the dedication of the players and support staff, the superstitious rituals, and the motivational speeches that are poured into baseball clubs, not to mention the attempts to drag baseball kicking and screaming into the sabermetric age, we will never be able to escape what seem from our imperfect perspective to be random rolls of the die.

Tuesday, May 10, 2016

LWR Component Deflators and Replacement Hitters’ Batting Lines

Last time, I explained how we could use Linear Weight Ratio, an offensive metric developed by Tango Tiger, as a shortcut in finding a variable which I call the “component deflator” and symbolize as “a”.

That post focused on its application to park adjustments, but the restriction is not necessary. In fact, the component deflator can be applied generally to any situation in which you’d like to know what across-the-board percentage change in mutually exclusive offensive events you would have to see in order to alter run scoring by some scalar (assuming you are willing to accept that the linear weight values stay constant, which is certainly an assumption that must be applied with care).

So there are any number of questions that this kind of approach can address. One that I will consider in this piece is “What would a replacement-level hitter’s batting line look like?" First a few caveats, though. As Tango has pointed out, there really is no such thing as a replacement-level hitter. A replacement player is a replacement player because his overall contribution, offense and defense, is at a level so as to have no marginal value. Thinking about it in terms of a replacement level hitter only confuses the issue.

However, the analytical structure of assuming that replacement level players are average in the field, and thus calculating their value as their offensive contribution above a “replacement” performer specific to their position plus their defensive contribution above an average performer at their position can be a useful approximation. It is the structure used by a number of approaches, including Pete Palmer’s TPR (which is above average, but the same principle holds), Keith Woolner’s VORP, and the RAR figures I post here at the end of each season. I am not claiming that this approach is optimal or superior to the others, only that if applied with caution it can be a useful model of player value.

For the sake of this post, let’s just assume that we are going to use a model where a replacement level player hits at some percentage of the league average. Then, if we’d like to know what his batting line might look like, we can use the component deflator approach. The good thing is that we don’t have to worry so much about the fact that we have static linear weights, since we are now applying the process to individuals for whom we’d like to hold the weights constant (ignoring the Theoretical Team arguments). So that caveat is loosened in this application.

Of course, this approach carries some of its own caveats with it: one is that we are again developing a model in which all events are equally deflated. It might actually be that replacement level hitters tend to not be as deficient in BA as one might expect. Or maybe teams are willing to trade BA for power in a replacement level hitter. This is a specific model with specific assumptions, and it is not necessarily reality.

Anyway, if we define R as the percentage of league average (or positional average or anything else if you’d like), then we can just plug it into one of the formulas from last time, and carry out the rest of the calculations as explained in that post:

New LWR = ((LWR/s' + x)*R - x)*s'

In my RAR estimates, I assume that a replacement player’s R/O is 73% of the positional average, where the positional average is figured by taking the overall league average times a long-term offensive positional adjustment. The positional adjustments I use are (note: you can tell how long ago I wrote this by the use of 2008 league totals):

C = .89, 1B/DH = 1.19, 2B = .93, 3B = 1.01, SS = .86, LF/RF = 1.12, CF = 1.02
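As a rough illustration of how the 73% figure and the positional adjustments combine, here is a minimal Python sketch. The function name and the dictionary are mine, introduced only for illustration; the sketch assumes R is expressed relative to the overall league average R/O, so that R is simply .73 times the positional adjustment.

```python
# Long-term offensive positional adjustments quoted above.
POSITIONAL_ADJUSTMENT = {
    "C": 0.89, "1B/DH": 1.19, "2B": 0.93, "3B": 1.01,
    "SS": 0.86, "LF/RF": 1.12, "CF": 1.02,
}

# Replacement R/O as a share of the positional average.
REPLACEMENT_SHARE = 0.73


def replacement_multiplier(position):
    """Return R for a position, relative to the overall league average R/O."""
    return REPLACEMENT_SHARE * POSITIONAL_ADJUSTMENT[position]


if __name__ == "__main__":
    for pos in POSITIONAL_ADJUSTMENT:
        print(f"{pos:6s} R = {replacement_multiplier(pos):.3f}")
```

Each R can then be plugged into the New LWR formula in place of a park factor.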

Combining these adjustments, the LWR component deflator procedure, and the overall 2008 MLB offensive averages, here is the offensive output expected from a replacement player at each position:



How do these numbers look to you? My impression is that the batting averages are too low; teams may resort to replacement level players at 73% of league R/O, but they may be those that trade secondary skills for BA points of equivalent value (assuming that players of this profile even exist in reality).

Anyway, you don’t have to take any of this too seriously, and I’ve already stated the assumptions and admitted they may not model reality, so I’m not going to spend too much time justifying the results. Instead, I have another potentially amusing if not completely realistic application.

Namely, it is to take the initial statistics of a real hitter, and maintaining the proportional relationships between his positive events, projecting what his line would look like at a different level of productivity. For example, what would a replacement level hitter with Barry Bonds’ bizarre 2004 proportional relationships look like?

In this case, I’ll assume that a replacement player would have a 3.50 RG. Bonds’ 2004 line comes out to an 18.26 RG, so our “R” will be 3.5/18.26 = 19.2%. This line results:



The bizarro Bonds would hit just .140, but would still manage to put up a .416 secondary average. Of course, such a player would never really exist, but if he did, his offensive value would be about the same as the other replacement level guys above.

Let’s look at Tony Gwynn, 1994 to see what this would look like for a very good singles-type hitter:




And we could try going the other way. What would Mario Mendoza, superstar look like? Here’s the transformation from Mendoza’s career line to an 8 RG:



In order to turn Mendoza’s no-secondary skills profile into an all-time upper echelon great, you have to allow him to hit .400, and increase all of his positive rates by 81%.

This translation approach falls squarely under the category of "toy"; please don’t get the impression that I’m elevating it to any greater pedestal.

Monday, April 25, 2016

LWR and Component Deflators

If I tell you that a hitter has a line of .260/.330/.400, and I tell you that he plays in a park that inflates run scoring by 5%, what would you estimate his batting line would be in a neutral park? Suppose we know nothing at all about the park other than its effect on runs scored; we don’t know how it changes the rates of home runs, or strikeouts, or hits, or any other statistical category. We don’t know the dimensions or the altitude or the fence height or the type of playing surface, so we can’t make any estimates based on those factors either. How are you going to answer this?

One path you could take is to assume that the park influences the rates of all events equally. You could assume that the park will increase the number of singles by X%, the number of walks by X%, the number of home runs by X%, etc. Will this necessarily be the best approach? No; after all we expect the maximum real world park factor for walks to be much less extreme than for home runs, for instance. It may be an acceptable approximation, but it would be a stretch to say it was the optimum approach.

However, what if we are only concerned about value, and don’t wish to adjust individual components? Personally, this is the question that interests me the most. I don’t care if the park benefits a certain type of hitter more than another; I just want to know what the effect was on runs scored. If a batter’s style or approach enables him to take more advantage of a park than the average batter, I don’t wish to take that credit away from him, as it generates real value for his team.

So if I want to adjust a player’s slash line for park, I don’t care about the precise park effect on doubles, walks, or strikeouts. I want to answer “What batting line would provide equivalent value in a neutral park, assuming that the proportional relationships between the components of this man’s line were a constant?” In other words, if this batter had a 2:1 ratio of singles to extra base hits in reality, I want him to still have a 2:1 ratio when we’ve adjusted his line for park.

This is where the approach mentioned above comes into play; assume the park has an equal effect on each mutually exclusive component of the batting line, and go from there. In order to start this process, we will make the assumption that the linear weight values will remain constant as we move between environments. Obviously this is a faulty assumption; linear weight values are dependent on context. However, linear weight values are also fairly stable over similar run environments, particularly in the case of the out value when we are using the -.1 type. A park increasing run scoring by 5% shouldn’t have too dramatic of an effect on the coefficients so as to render our conclusions invalid. Nonetheless, for more extreme parks, the potential for problems will be larger.

Let us define a new variable, a. a is what I will call the “component deflator” (I am borrowing the term “deflator” from Stephen Tomlinson’s “run deflator” as defined in the Big Bad Baseball Annual). Assuming stable linear weight values, using the definition of terms from the last post, and limiting the scope of our categories to the basic mutually exclusive offensive categories (singles, doubles, triples, homers, walks, batting outs) we can start by saying that:

RC/PF = new RC

New RC/Out = (sS*a + dD*a + tT*a + hHR*a + wW*a + x*(1 - S*a - D*a - T*a - HR*a - W*a))/(1 - S*a - D*a - T*a - HR*a - W*a)

All we have done is assume that the frequency of each event will be equally affected, by a factor of “a”. We can also simplify the out term to be 1 - a*(S + D + T + HR + W).

Just as seen in the last installment, we can cancel out the out terms from the numerator and the denominator, and thus save ourselves a lot of hassle, and write everything in terms of Linear Weight Ratio:

New LWR = (S*a + d'D*a + t'T*a + h'HR*a + w'W*a)/(1 - a*(S + D + T + HR + W))

Where new LWR = (New RC/O - x)*s'.

I have kept symbols everywhere to keep this as general as possible, but let’s remember what they actually are. d', t', etc. are all known values, all fixed coefficients. S, D, T, etc. are simply the frequencies of a set of mutually exclusive events. The only unknown variable in this equation is a, the common deflator.
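To make the dependence on a concrete, here is a small Python sketch of the New LWR expression above. It is only an illustration: the function name is mine, and the weights and frequencies are borrowed from the ERP-based worked example further down this post (a league-average 1990 NL hitter), not something the formula itself requires.

```python
# ERP-based weights relative to a single (d', t', h', w'), from the worked example.
REL_WEIGHTS = {"S": 1.0, "D": 1.67, "T": 2.33, "HR": 3.0, "W": 0.67}


def lwr_after_deflation(freq, a, weights=REL_WEIGHTS):
    """LWR when every non-out per-PA frequency is multiplied by a."""
    numerator = a * sum(weights[event] * freq[event] for event in weights)
    outs = 1.0 - a * sum(freq[event] for event in weights)
    return numerator / outs


if __name__ == "__main__":
    # Per-PA frequencies of the league-average hitter in the worked example.
    line = {"S": 0.167, "D": 0.041, "T": 0.006, "HR": 0.021, "W": 0.086}
    print(round(lwr_after_deflation(line, 1.0), 3))    # a = 1 leaves the line alone: 0.545
    print(round(lwr_after_deflation(line, 1.083), 3))  # 0.614
```

Going forward from a to LWR is easy; the work in the rest of the post is inverting this relationship.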

Occasionally I find it necessary to include a disclaimer that I am not a mathematician, and this is one of those times. The way I am going to describe solving for a severely overcomplicates the matter and makes the connection between LWR and a seem tenuous at best. And it’s true, you don’t need to convert to LWR in order to do this type of approximation; I just like doing it that way because of the aforementioned canceling out of the out term in the RC/O numerator and denominator.

To solve for a, let’s define the LWR numerator as N:

N = S + d'D + t'T + h'HR + w'W

One way to look at this is that, in effect, we have stated the player’s positive linear weight contributions from all events as an equivalent number of singles, since singles have a weight of one.

Let’s also write the denominator in an equivalent number of singles. Find (S + D + T + HR + W)/S and call this D (sorry for doubling up with doubles here). This is the ratio of all non-out PA outcomes to singles.

Then, the New LWR can be viewed as this:

New LWR = N*a/(1 - D*S*a)

We have reduced all of the events down to an equivalent number of singles, and can solve for the ratio of singles under our new conditions to singles under old conditions that result in the desired new LWR. This is a, and it is the same ratio that will apply to the other events (except outs, which have to be handled differently):

a = New LWR/(N + D*S*New LWR)

Now, our player’s new rate of singles will be S*a. His new rate of doubles will be D*a, his new rate of home runs will be HR*a, and so on for all events except outs. His new rate of outs will be 1 - S*a - D*a - T*a - HR*a - W*a, or substitute “PA” for “1” if you are using the actual count of each event rather than the per-PA frequencies. The outs can also be adjusted as 1 - a*(1 - Outs), or PA - a*(PA - Outs), depending on whether you are using frequencies or counts.
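The solution for a and the adjustment of the line can be sketched in a few lines of Python. This is a minimal sketch, not a definitive implementation: the function and parameter names are mine, and `non_out_rate` stands for S + D + T + HR + W per PA (the D*S product in the formula above), which sidesteps the doubled-up use of "D".

```python
def component_deflator(n, non_out_rate, target_lwr):
    """Solve target_lwr = n*a / (1 - non_out_rate*a) for a.

    n            -- LWR numerator N (positive events expressed in singles)
    non_out_rate -- S + D + T + HR + W per PA
    target_lwr   -- the New LWR we want the adjusted line to have
    """
    return target_lwr / (n + non_out_rate * target_lwr)


def deflate_line(freq, a):
    """Scale every non-out per-PA frequency by a; outs absorb the remainder."""
    new_line = {event: rate * a for event, rate in freq.items()}
    new_line["Outs"] = 1.0 - sum(new_line.values())
    return new_line


if __name__ == "__main__":
    # N = .370, non-out rate = .321, target LWR = .614 (from the worked example below).
    a = component_deflator(0.370, 0.321, 0.614)
    print(round(a, 3))
    line = {"S": 0.167, "D": 0.041, "T": 0.006, "HR": 0.021, "W": 0.086}
    print(deflate_line(line, a))
```

Passing per-PA frequencies keeps the out calculation as 1 minus the scaled events; with raw counts the "1" would need to be replaced by PA, as noted above.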

Those of you who are astute and who are not totally bewildered by the circuitous way I defined terms and got to this point (which should eliminate most of you, since if you are in fact astute you are rightfully thinking “What the heck is wrong with this guy?”) may notice that the execution here is similar to Bill James’ “Willie Davis method”. And that it is. James converts a player’s batting line into an equivalent number of singles, finds the proportion of translated singles to original singles necessary to yield the right new number of Runs Created (which involves the quadratic formula due to the nature of the RC formula), and adjusts the other events accordingly. So the procedure I’m using here is not in any way new or unique; it is just an application of it in the case of Linear Weights.

Let me walk you through an example, since I’ve made this confusing as all get out. Let’s review the ERP-based LWR that I derived last time for example purposes:

LWR = (S + 1.67D + 2.33T + 3HR + .67W)/(1 - S - D - T - HR - W)

Let’s suppose that we want to take a league-average player from the 1990 NL and project his statistics in an extreme park, a mid-90s Coors type park with a 1.20 PF. Here are his statistics:



With some basic algebraic manipulations on the equations in the last post, we can go directly from LWR to New LWR by this formula:

New LWR = ((LWR/s' + x)*adjustment - x)*s'

Where we recall that x is the linear weight value of an out (-.097 in this case), s' is the reciprocal of the linear weight value of a single (2.058 in this case), and adjustment is the scalar effect on runs/out (1.20 in this case). So:

New LWR = ((.545/2.058 - .097)*1.2 + .097)*2.058 = .614
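The arithmetic in that last line can be checked with a short Python sketch. The function name is mine; the inputs are the values stated above (LWR = .545, s' = 2.058, x = -.097, adjustment = 1.20).

```python
def scale_lwr(lwr, s_prime, x, adjustment):
    """Return the LWR that corresponds to multiplying runs per out by adjustment.

    lwr        -- original Linear Weight Ratio
    s_prime    -- reciprocal of the linear weight value of a single
    x          -- linear weight value of an out (negative, e.g. -.097)
    adjustment -- scalar effect on runs/out (park factor, replacement share, ...)
    """
    runs_per_out = lwr / s_prime + x  # undo LWR = (RC/O - x)*s'
    return (runs_per_out * adjustment - x) * s_prime


if __name__ == "__main__":
    print(round(scale_lwr(0.545, 2.058, -0.097, 1.20), 3))  # 0.614
```

Because x is negative, the "+ x" and "- x" in the general formula show up as "- .097" and "+ .097" in the numeric version.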

From here, we need to find “N” and “D”:

N = S + 1.67D + 2.33T + 3HR + .67W = .167 + 1.67(.041) + 2.33(.006) + 3(.021) + .67(.086) = .370

D = (S + D + T + HR + W)/S = (.167 + .041 + .006 + .021 + .086)/.167 = 1.922

And now we can solve for a:

a = New LWR/(N + D*S*New LWR) = .614/(.370 + 1.922*.167*.614) = 1.083

In order to increase this player’s RC/O by 20%, we need to increase his singles, doubles, triples, homers, and walks by 8.3% each. This yields a new batting line of:



So this player has gone from hitting .257/.321/.384 to hitting .281/.348/.419. His park-adjusted value has been held constant, as have his relative frequencies of each positive PA outcome. The key is that he has more of all the positive events, and thus fewer outs.

In case you are curious, from the limited set of frequencies defined here, BA is (S + D + T + HR)/(1 - W); OBA is S + D + T + HR + W; and SLG is (S + 2D + 3T + 4HR)/(1 - W).
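Those three definitions are easy to sketch in Python, which also reproduces the before and after slash lines quoted above. The function name is mine, and the frequencies are the per-PA rates used in the N and D calculations earlier; a = 1.083 is the rounded deflator from the example.

```python
def slash_line(freq):
    """Return (BA, OBA, SLG) from per-PA frequencies of S, D, T, HR and W."""
    s, d, t, hr, w = (freq[k] for k in ("S", "D", "T", "HR", "W"))
    at_bats = 1.0 - w  # per-PA at bats: everything except walks
    ba = (s + d + t + hr) / at_bats
    oba = s + d + t + hr + w
    slg = (s + 2 * d + 3 * t + 4 * hr) / at_bats
    return ba, oba, slg


if __name__ == "__main__":
    original = {"S": 0.167, "D": 0.041, "T": 0.006, "HR": 0.021, "W": 0.086}
    a = 1.083  # rounded component deflator from the example
    adjusted = {event: rate * a for event, rate in original.items()}
    print("/".join(f"{v:.3f}" for v in slash_line(original)))  # 0.257/0.321/0.384
    print("/".join(f"{v:.3f}" for v in slash_line(adjusted)))  # 0.281/0.348/0.419
```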

Monday, April 11, 2016

Linear Weight Ratio

Note: The series of four posts I will be posting over the next month were written a long time ago, apparently in 2009. Since I have not been prolific in producing new material lately, I figured I might as well post some older stuff I’ve written that at the time I didn’t deem good or interesting enough to post. I did not vet all of the material in them, so any inaccuracies are my fault but do not necessarily reflect my current thinking.

Linear Weights Ratio (LWR) is an offensive metric developed by Tango Tiger, based on Linear Weights. Since it was developed and explained by Tango, there is really no need for me to step in and write a post that may just serve to confuse you. And I have not defined everything in exactly the same way he did, which will only add to the confusion.

I have always liked to write descriptions of other people’s research, for a couple of reasons. One is as a sort of critique/peer review, which does not have to be critical--it can also point out the positives about an approach. A second is so that if I use something later (and I have an upcoming post that uses LWR in the vein of Bill James' "Willie Davis method"), my readers can have some degree of confidence that I understand the topic at hand. All too often you will see people use metrics that they don’t really understand. By writing about the ones I’m using, I will be presenting you with sufficient evidence to draw your own conclusions as to whether or not I understand the tools I am using.

Let’s begin by focusing only on the basic, mutually exclusive offensive events: singles, doubles, triples, home runs, walks, and batting outs (AB - H). For now, we will assume that those categories encompass every possible outcome of a plate appearance. Let us also assume that we have some set of linear weights which give the value of each of those events: s is the value of a single, d of a double, t of a triple, h of a home run, w of a walk, and x of an out. Additionally, I am approaching this problem with absolute (total runs scored) weights, so x is something like -.1, not -.3. Tango’s LWR used the -.3 type value.

Given those assumptions, we can of course write:

RC = sS + dD + tT + hHR + wW + xO

Let’s consider “S”, “D”, etc. to be per PA frequencies (again, these events are assumed to encompass all possible PA outcomes, so PA = S + D + T + HR + W + O). If that is the case, we can rewrite O as 1 - S - D - T - HR - W, and write an expression for RC/Out:

RC/O = (sS + dD + tT + hHR + wW + x(1 - S - D - T - HR - W))/(1 - S - D - T - HR - W)

The out term can be canceled out, leaving us with:

RC/O = (sS + dD + tT + hHR + wW)/(1 - S - D - T - HR - W) + x

You can see that there is no need for the out term to be included at all; we are still implicitly including outs, but we don’t need to include them in the equation. The numerator of the expression is the run contribution of each event, excluding outs, while the denominator is outs. This is what I will call rLWR, for run LWR:

rLWR = (sS + dD + tT + hHR + wW)/(1 - S - D - T - HR - W)

In figuring his Linear Weight Ratio, Tango adds an additional wrinkle, and sets the weight of a single equal to 1, with the other weights changing proportionally. We can define s' as 1/s, and use that to define d' = d*s', t' = t*s', etc., and write LWR as:

LWR = (S + d'*D + t'*T + h'*HR + w'*W)/(1 - S - D - T - HR - W)

At this point I’ll plug in some actual numbers from the basic ERP equation I use ((TB + .8H + W - .3AB)*.324). This is not an optimal equation, and that’s okay because my point here is not to present a formula that you should use, just to demonstrate how you can derive your own formula for LWR based on whatever set of linear weights you are using. When that ERP equation is expanded, it becomes:

ERP = .486S + .810D + 1.134T + 1.458HR + .324W - .097(AB - H)

Which yields s' = 2.058 and the following LWR equation:

LWR = (S + 1.67D + 2.33T + 3HR + .67W)/(1 - S - D - T - HR - W)

If you are using the actual counts of each event rather than the per PA frequencies, this could be written the same except PA would replace 1 in the denominator.
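To show how the same steps would work for any set of linear weights, here is a minimal Python sketch that expands the ERP equation into per-event weights and rescales them so that a single equals 1. The function name and its keyword arguments are illustrative assumptions; only the ERP coefficients come from the text.

```python
def expand_erp(multiplier=0.324, hit_bonus=0.8, ab_penalty=0.3):
    """Expand ERP = (TB + .8H + W - .3AB)*.324 into per-event run weights."""
    weights = {}
    for event, bases in (("S", 1), ("D", 2), ("T", 3), ("HR", 4)):
        # A hit adds its total bases, the hit bonus, and one at bat.
        weights[event] = (bases + hit_bonus - ab_penalty) * multiplier
    weights["W"] = multiplier                # a walk only adds the W term
    weights["O"] = -ab_penalty * multiplier  # a batting out only adds an at bat
    return weights


run_weights = expand_erp()
s_prime = 1 / run_weights["S"]
# Rescale every positive event so that the single weight is exactly 1.
lwr_weights = {e: w * s_prime for e, w in run_weights.items() if e != "O"}

print(round(s_prime, 3))
print({e: round(w, 2) for e, w in lwr_weights.items()})
```

Swapping in a different set of run weights only changes `run_weights`; the rescaling step stays the same.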

It is easy to convert between LWR and R/O, and it is a linear process. The equations are:

R/O = rLWR + x
rLWR = R/O - x
R/O = LWR/s' + x
LWR = (R/O - x)*s'
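These conversions are simple enough to sketch as a pair of Python helpers. The names and default constants are hypothetical; the defaults are the ERP-based values used in this post.

```python
X = -0.097       # run value of an out (absolute weights)
S_PRIME = 2.058  # reciprocal of the run value of a single


def runs_per_out_from_lwr(lwr, s_prime=S_PRIME, x=X):
    """R/O = LWR/s' + x"""
    return lwr / s_prime + x


def lwr_from_runs_per_out(runs_per_out, s_prime=S_PRIME, x=X):
    """LWR = (R/O - x)*s'"""
    return (runs_per_out - x) * s_prime


r_per_out = runs_per_out_from_lwr(0.545)
print(round(r_per_out, 3), round(lwr_from_runs_per_out(r_per_out), 3))
```

Because both directions are linear, converting to R/O and back returns the starting LWR, which is a handy check when plugging in a different set of weights.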

What alterations do we have to make to include non-batting outs in our ratio? This can be tricky since we can no longer assume a uniform value for outs across types. But we just need to ensure that the above relationships still hold, and weight the event in the numerator accordingly. (LWR/s' + x)*Outs must equal RC. We can expand that out:

((LWR numerator/Outs)/s' + x)*Outs = RC

which simplifies to:

LWR numerator/s' + x*Outs = RC

For any specific event, x is known (the -.097 value), Outs is known (each out is worth one out), s' is known (2.06 in this case), and the RC weight of the event in question is known (let’s say we have CS at an overall value of -.3), so all we need to do is solve for the needed coefficient in the LWR numerator:

(RC weight - x)*s' = LWR numerator

For the CS example:

(-.3 - (-.097))*2.06 ~= -.42

Friday, April 01, 2016

2016 Predictions

Standard disclaimer applies. Also, I’m giving myself an extra Oreo for every time I can use the phrase "on paper".

AL EAST

1. Boston
2. Toronto (wildcard)
3. New York
4. Tampa Bay
5. Baltimore

I’ve picked the Red Sox to win the AL East in 2015, 2012, 2011, 2009, and 2007 and to win the pennant in 2015, 2012, 2011, 2009, and 2007. I was right in 2007; in 2013, when they won another division and pennant, I picked them to finish third. I guess what I’m trying to say is "David Price, if you’re reading this, don’t put in a pre-order on a duck boat."

To attempt to analyze why I have been so wrong about the Red Sox so frequently would be taking this exercise more seriously than I intend, and would be about me and not baseball, which is of no interest to anyone other than me. So I will leave any questions about whether I qualify for the popular definition of insanity to the reader and instead point out that the crude infrastructure I use to inform these predictions really left me with no choice; I have Boston six wins ahead of anyone else in the AL. Only Toronto projects to score more runs and only Cleveland and New York project to allow fewer. One would think they would have fewer disaster positions and a stronger rotation than in 2015.

At first I was surprised to see Toronto still ranked highly, which is a testament to how easy it would be, without reasoning this out on paper, to overreact to losing a rental pitcher who was only there for two months and to forget that I picked them second last year as well and that there’s little reason to be more bearish on the team now. Of course, reasoning this out on paper is what leads me to pick the Red Sox all the time.

New York should be right in the mix for the wildcard; if Tanaka and Pineda can somehow stay healthy, they have a sneaky good rotation. I’m not feeling the Tampa Bay love, as their rotation has multiple question marks and their offense is lacking (I don’t think one can count on even a healthy Evan Longoria being a star-level performer). Baltimore should serve as a warning as to how quickly special pleading about outperforming Pythagorean and winning one-run games and the like can be forgotten when the team has a bad year. They’re not the cool kids any longer; those guys are in the next division…

AL CENTRAL

1. Detroit
2. Kansas City
3. Cleveland
4. Chicago
5. Minnesota

Everyone, including me, will tell you about how little there is separating most of the AL teams on paper. Since I suspected this would be the case before I even sat down to put anything on paper, I decided that I would pick the AL in exactly the order my numerical exercise suggested with one exception--should Cleveland be in playoff position, I would drop them out of it. With the exception (discussed below) of the second wildcard, that is exactly what I have done.

I wouldn’t be surprised if I pick the 2017 Tigers to finish last, but I think they did enough patching this season that a dead cat bounce (see what I did there?) may be in the offing. You can just picture them getting off to a slow start and roaring (I’ll stop now) back behind interim manager Dave Clark or whoever.

Then there are the Royals. On paper I have them with 80 wins, just ahead of the White Sox, but I’ll dutifully jump them over Cleveland just the same. They have outplayed their PW% (W% estimated from runs created and allowed) by 19 games over the past two seasons, which is the seventh-highest total since 2003 (I have figured PW% for my end of season stats back to 2003, not always by the exact same method but this is a case in which the concept is much more important than the specific implementation of it). Their predecessors have generally done well in outplaying PW% again in year 3:



The average year 3 out-performance is 3.7 wins, so let’s be generous and give the Royals four more wins (the sabermetrically-sharp among you probably noticed that this very crude and unendorsed methodology is assuming that these teams’ RC and RC Allowed are consistent with pre-season expectations). That puts them at 84; I have Detroit with 84 and Cleveland with 83.

Does that placate you if you think last year was “the year that Base Runs failed”? Setting aside the ridiculous nature of hanging errors which are created jointly by Base Runs and Pythagenpat solely around the neck of the former, of course one must objectively acknowledge that PW%, whatever reasonable inputs one might use, had a bad year in 2015. A really bad year. The chart gives the RMSE of (W% - PW%) from 2003-2015:



An RMSE equivalent to 6.66 wins per 162 games was by far the worst over this period. But note that the previous two seasons were the best over this time period, and that the overall trend, if one can divine one, appears to be stable or improving accuracy over time. So which do you believe--that there’s a possibility worth multiple blog posts that in 2015, all of the underpinnings of run and win estimation and the combination thereof suddenly ceased to work? Or that sometimes the dice roll a little bit differently?
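For anyone who wants to reproduce this kind of error figure, here is a rough Python sketch. It assumes a Pythagenpat exponent of (runs per game)^.29, which is one common form and not necessarily the exact implementation behind the PW% figures here, and the two teams are made-up inputs used purely to exercise the functions.

```python
import math


def pythagenpat_wpct(runs, runs_allowed, games, pat_exponent=0.29):
    """Estimated W% using a Pythagenpat exponent of (runs per game)^pat_exponent.

    The 0.29 default is an assumption, not a value taken from this post.
    """
    runs_per_game = (runs + runs_allowed) / games
    x = runs_per_game ** pat_exponent
    return runs ** x / (runs ** x + runs_allowed ** x)


def rmse_per_162(actual_wpcts, predicted_wpcts):
    """RMSE of (W% - PW%), scaled to wins per 162 games."""
    squared = [(a - p) ** 2 for a, p in zip(actual_wpcts, predicted_wpcts)]
    return 162 * math.sqrt(sum(squared) / len(squared))


# Hypothetical teams: (wins, losses, runs created, runs created allowed).
teams = [(95, 67, 700, 650), (68, 94, 640, 700)]
actual = [w / (w + l) for w, l, _, _ in teams]
predicted = [pythagenpat_wpct(rc, rca, w + l) for w, l, rc, rca in teams]
print(round(rmse_per_162(actual, predicted), 2))
```

Feeding in runs created and runs created allowed, rather than actual runs, is what makes the result a PW% error and not a plain Pythagenpat error.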

The general discussion of PW% is not specific to Kansas City, of course; the Royals could this season once again outplay their PW% even if the league-wide error returns to normal levels. But if you feel compelled to hedge late in your post by writing the phrase “this is probably just a blip”, it almost certainly is a blip.

The Indians are my team, which is why I won’t pick them to make the playoffs unless I’m really feeling it in addition to seeing it in the objective projections (almost the opposite of how I approached picking the Indians as a younger human being, in which case feeling was the only thing that mattered). The fact that they couldn’t even do something like bring in Austin Jackson for $5 million to help shore up a dreadful-looking outfield prevents me from believing that this is their year. Seriously, the opening day outfield is Rajai Davis, Tyler Naquin, and Marlon Byrd backed up by Collin Cowgill. Send money. The White Sox could be in the mix, but I still see a below-average offense with good but not great starters and a mediocre bullpen, even if I really like Carlos Rodon. The Twins certainly have some offensive players to watch, but their multi-season run of bad starting pitching doesn’t seem to be coming to an end this year.

AL WEST

1. Houston
2. Seattle (wildcard)
3. Texas
4. Los Angeles
5. Oakland

On paper, the Astros and A’s stand out from the pack in this division; the other three look to be pretty close to me. If I was reading this post, I would stop here, because I have vowed to stop reading any baseball article that uses the term “tank” (all apologies to Dayan Viciedo). But it seems to me that much of the alarmism about the imagined problem of “tanking” stems from the interests and fans of rich teams. Fans of these teams, which could never allow themselves to take a clear step backward to the extent that the Astros or Cubs did, don’t seem to appreciate it when opponents try different approaches to build lasting contenders rather than simply throwing money around trying to reach 85 wins and perpetually hunt for a wildcard berth. I can’t blame them--it would be nice to be a fan of a league in which any clubs that can’t match your financial advantage are forever stuck in the middle. But the easiest thing in the world is to be a fan of a rich team and chastise other teams for winning 60 games every once in a while.

I’m not really sold on the Mariners as a wildcard team, but I have a general rule against picking two wildcards from the same division (even though this is quite possible as the NL Central demonstrated last year) and so I’m not picking the Yankees. On paper, I have the Mariners and the Royals virtually tied, with a slight edge to the former. I can talk myself into believing it--most of the reasons why I and many others liked them last year are still in place, with Jerry Dipoto seemingly doing a nice job of tinkering on the edges of the roster. The same is true of the Rangers, but in the opposite direction. Yes, they now have Hamels, we know Prince Fielder is still alive, and Darvish should be back at some point, but there’s still a reason they were picked last by many in 2015. The Angels are unintentionally going for a stars and scrubs approach, but they only have one star. That he’s the brightest in the firmament is still not enough to make that a winning strategy. The A’s should have been much better last year, but this year may be closer to their actual 2015 record than their predicted one.

NL EAST

1. Washington
2. New York (wildcard)
3. Miami
4. Atlanta
5. Philadelphia

I dislike Dusty Baker’s managing as much as the next guy, probably a bit more since I saw a fair deal of him when he was in Cincinnati. But damned if Matt Williams isn’t one of a very small number of major league managers that I think I’d be willing to replace with Dusty. A player’s manager following the unpopular Williams couldn’t hurt either.

But I don’t think you need to resort to pop psychology in order to think the Nationals are the team to beat in the East. While their roster is not as good on paper as it was last year, they are likely to stay healthier. Even with regression from Harper, significant contributions from Rendon, Ramos, even Daniel Murphy could make this a more productive offense. Their rotation is not Mets-level but it should still be good enough, although the bullpen doesn’t look great. Second half surge and Cespedes re-signing aside, I see the Mets as an average offense. As a team, they are currently closer to the Indians than to the Cubs. If things go right for the Marlins, this could be their wildcard and World Series year, but thankfully that is often true yet it still has only happened twice. I was surprised that I still had the Braves five games ahead of the Phillies on paper for 2016, although I definitely would take their next five years over the Phillies as well.

NL CENTRAL

1. Chicago
2. Pittsburgh
3. St. Louis
4. Cincinnati
5. Milwaukee

My crude estimates have the Cubs at 96 wins, which is one of the highest figures I can remember. They easily have the best offense in the league on paper, while allowing the same number of runs as the Mets. They are really good, which means they might have a 15% chance of being recognized as such when it’s all over and an 85% chance of being cited as another sad chapter, 1908, 1945 blahblahblah. I have the Cardinals and the Pirates as dead even on paper, both a step behind the second-place teams on the coasts in the wildcard hunt, St. Louis with better defense and Pittsburgh with better offense. If you believe in Searage magic, that may be reason to go with the latter; I learned today that Cory Luebke made the Pirates pen and I flipped them from how I’d originally written this. Scientific process right here. As with the NL East, I was surprised that I have the Reds five games ahead of the Brewers. As with the NL East, I don’t think it matters much one way or the other.

NL WEST

1. San Francisco
2. Los Angeles (wildcard)
3. Arizona
4. Colorado
5. San Diego

On paper I have the Dodgers six games ahead of the Giants, so it was probably a foolhardy move to flip them here. But the Dodgers have some serious questions about the health of their (otherwise very solid) rotation and enough nagging injuries to position players that I’m leaning Giants. That might be just as well for the Dodgers if they could supplement from their wealth throughout the season and flip the 2014 script on their rivals. Another reason I’m ignoring that six-game gap is my numbers have San Francisco with an average offense and I expect the team that led the majors in park-adjusted OBA last season will retain a little more production than that (even if I too am skeptical of Matt Duffy, Brandon Crawford, Joe Panik: Super Infield!). A lot of the mainstream prognostications I’ve encountered have stated as a given that the Diamondbacks have an excellent offense and just needed to shore up their pitching. But while they were third in the NL in park-adjusted RC/G in 2015, the two teams they trailed by .34 and .14 runs respectively are the two I’m picking ahead of them in the NL West. I guess maybe people really believe in Patrick Corbin’s elbow and Robbie Ray? This makes it three-for-three NL divisions where I’m surprised to have the fourth-place team so far ahead of the last place team on paper (and, except in the case of the East, surprised to have them ahead at all). But I’ll stick with the Rockies over the Padres despite my misgivings.

WORLD SERIES

Chicago (N) over Boston

Just twelve years ago, such a World Series matchup would have conjured up mixed emotions and platitudes of “at least one of them will finally get to win”. This year, everyone outside of New England and St. Louis would be united in singing “Go Cubs Go”. Remember how interesting this could have been when it’s actually Royals/Marlins or something disgusting.

AL Rookie of the Year: 1B AJ Reed, HOU
AL Cy Young: Carlos Carrasco, CLE
AL MVP: OF Mookie Betts, BOS
NL Rookie of the Year: SS Trevor Story, COL
NL Cy Young: Stephen Strasburg, WAS
NL MVP: C Buster Posey, SF

Wednesday, March 09, 2016

1883 AA

The American Association entered its second season having gotten the attention of the National League if nothing else. Some AA clubs began to sign some NL players away, and as the AA did not have a reserve clause itself, players jumped between association clubs. Some blacklisted NL players (most notably Charley Jones) were signed to AA contracts.

On February 17, NL president Abraham Mills and AA representative O.P. Caylor met in New York and agreed upon a truce of sorts. Each league agreed to a reserve limit of eleven players per team, to honor contracts and blacklists of the other circuit, and to allow exhibitions between NL and AA clubs.

The AA followed the NL’s lead (or perhaps it was the other way) in expanding the schedule to 98 games and in forming an eight team circuit. The New York Metropolitans, a strong independent club that had been flirting with the two majors for a couple of years, joined, as did Columbus’ first major league team.

The AA even followed the NL’s lead in having a two city Memorial Day doubleheader, as the Reds lost 1-0 to the Mets, then beat the Athletics 10-8 in eleven innings. Other noteworthy happenings were Columbus’ 25-10 thrashing of the Alleghenys on June 13 in which they scored in every inning; Cincinnati’s 23-0 rout of Baltimore on July 6; and Cincinnati’s John Reilly hitting for the cycle on September 12 in a 27-5 win over the Alleghenys, then cycling again one week later.

The AA had its first great pennant race, as defending champ Cincinnati, second place Philadelphia, and upstart St. Louis went at it. On September 6, the Athletics completed a three game sweep of the Browns that maintained their lead. With one week to go, they led by 2.5, but promptly lost a pair to Louisville as St. Louis took care of the Alleghenys. On September 28, the Athletics were finally able to clinch when they defeated the Eclipse 7-6 in ten innings on Guy Hecker’s wild pitch. The final margin was just one game over St. Louis, with Cincinnati five back.

After the season, the two pennant winners (Boston and Philadelphia) were going to face each other, but the Athletics dropped two of three to their hapless NL citymates and thought the better of it. While the leagues would not meet on the field, they did meet in a negotiating room. The National League and the minor Northwestern League were set to formalize a pact governing their relationship when NL president Mills reluctantly suggested bringing the AA to the table as well, as the agreement would not be particularly valuable if the other major league was not party to it.

The AA did come to the table, and on October 27, the three circuits agreed to what was called the Tripartite Agreement. It formalized the relationships between the leagues and defined the clubs’ territorial rights. With this agreement, the concept of Organized Baseball as an entity larger than even the major leagues themselves was established for the first time. The Tripartite Agreement would later be expanded to include other leagues and become known as the National Agreement.

Just as the NL and the AA had come to terms, a new challenge was on the horizon, and the strength of their tenuous peace would be put to the test, as would the strength of the circuits themselves.

STANDINGS



The Reds topped the league in EW% and their margin "should" have been around eight games, but instead they finished five back. Meanwhile, Philadelphia and St. Louis were even closer in EW% than they were on the field, and they were only one game apart on the field. New York was a first division team right off the bat, and Columbus was respectable. Baltimore was at least within hailing distance of another club, but that was more due to serious Allegheny decline than any Oriole leap forward.

In 1883, the AA hit .252/.282/.331 for a .121 SEC, 5.72 runs and 24.01 outs per game.

PHILADELPHIA



The Athletics copped the city’s second pennant, as the Athletics (different club) had captured the inaugural National Association title in 1871. The team loaded up with talent from the NL: Stovey and Corey from defunct Worcester, Bradley from Cleveland, Mathews from Boston, and Knight from Detroit.

The team had an interesting third base/pitcher platoon, as George Bradley and Fred Corey played 44 and 39 games there respectively and were second and third on the team in innings. Ace Bobby Mathews was a weak hitter (44 ARG) and generally either pitched or rode the bench.

Shortstop Mike Moynahan posted the team’s highest WAR. According to Jim Charlton’s chronology, on May 27, 1882, Moynahan broke his finger against the Metropolitans and had it amputated at the first joint. At the time, Moynahan was playing for the Philadelphias, the Al Reach-backed independent club that had been passed over for AA membership in favor of the Athletics (thanks to Richard Hershberger for confirming that Moynahan played for the Philadelphias in 1882).

Rookie Jack Jones (5-2 in 65 innings, 6-5 in 92 innings for Detroit earlier in the season) won the pennant-clinching game for the Athletics. He left the game after the season to pursue dental school.

According to The Ball Clubs, a Philadelphia zookeeper thought up the idea of having carrier pigeons relay the score to other points in the city at the conclusion of each inning, and implemented this scheme at least on a limited basis.

ST. LOUIS



The Browns turned over more than half of their regulars in climbing from fifth to second. The new catching platoon of rookie Tom Dolan and Pat Deasley came over from the NL (BUF and BSN respectively). George Streif came from the Alleghenys, rookie Arlie Latham had played with Buffalo in 1880, Tom Mansell had played for Troy in 1879 and then for Detroit in the beginning of 1883, Fred Lewis had played for Boston in 1881, Hugh Nicol came from Chicago, and Tony Mullane was whisked away from Louisville.

Ted Sullivan started as manager, but was replaced by Charlie Comiskey despite boasting a 53-26 record. The second-year first baseman also turned in a much improved season at the plate.

When Mansell was with Detroit this season, he posted an ARG of just 77 in 139 PA. With the Browns, he batted .402 in 119 PA.

CINCINNATI



The Reds did not change their personnel much from their pennant winning 1882 campaign. John Reilly, who had last played in a major league with the old Reds in 1880, took over at first base and emerged as the team’s top hitter. Charley Jones, last with the Boston Reds in 1880 due to being blacklisted, was 33 but vied with Reilly for the team lead in WAR. Rookie Pop Corkhill was the right fielder, and rookie Ren Deagle pitched well.

The Reds apparently believed that they had an agreement with hometown catcher Buck Ewing, then with Troy. If they actually did, Ewing reneged, signing instead with the NL’s New York entry.

NEW YORK



The Mets were controlled by John Day and Jim Mutrie, who also backed the NL’s Gothams. Day was a tobacconist and his independent club had been founded in 1880 and developed into a strong outfit (according to Cliff Blau, they were 101-58-3 overall in 1882 and 5-1 against AA opponents). Mutrie managed the Mets, but that didn't dispel the appearance that they were playing second fiddle. Both teams played at the Polo Grounds, but a canvas fence separated their fields, and the Gothams drew ritzier crowds (the twenty-five cent admission price gap between the two circuits was of course a factor in that). The AA demanded on threat of expulsion that the team have its own park for 1884.

The last big league stops of the players were: Holbert (TRO), Crane (BUF, 1880), Esterbrook (CLE), Nelson (WOR, 1881), O’Rourke (BSN, 1880), Roseman (TRO), Keefe (TRO), and Lynch (BUF, 1881). Lynch, Brady, Nelson, and Kennedy were holdovers from the 1882 roster.

Bill James claims that Keefe had developed “what has been described as” the first modern changeup and began using it this season. Sam Crane was arrested for running off with Hattie Tavenfelter, the wife of a Scranton fruit dealer, and $1,500 of his savings.

The Mets let heavyweight champ John L. Sullivan pitch in two exhibition games; he later lost his title to James J. Corbett in 1892. The Reach Guide (quoted in Bill James’ New Historical Baseball Abstract) wrote: “a number of ball-players lost heavily on the Corbett-Sullivan prize fight. It was a notable fact that five out of every six ball-players were ardent believers in Sullivan’s superior powers.”

John O’Rourke was playing in his final big league season (he played for Boston in 1879 and 1880) despite hitting very well yet again. I could not find any secondary sources that discussed him in detail, so I asked for leads on SABR-L. The information that follows is based on Frank Vaccaro's research, which he shared on the list.

O'Rourke's pro career started in 1877 with the Mansfields of the International Association. In 1878 he hit .376, leading the IA, and signed with Boston after the season. In 1881 he reemerged with the Philadelphia Athletics of the Eastern Championship Association (who would become an inaugural AA member the following season). After playing for the Mets in 1883, he never appeared in the majors again. Vaccaro cites an April 6, 1884 report in the Cincinnati Enquirer that O'Rourke was working as a baggage master for an eastern railroad and was not playing ball because he demanded to be paid the value of "a small city, with town hall and other public buildings".

O'Rourke was apparently considered something of an oaf in comparison to his famous older brother, but it seems as if his real job was the reason that his major league career ended after 1883, not any on-field deficiency.

LOUISVILLE



The Eclipse improved their winning percentage, but dropped out of the first division and slipped two spots in the standings in the expanded eight team association; the defection of ace Tony Mullane to St. Louis was certainly a major factor. He was replaced by the Athletics’ Sam Weaver. The infield was redone, with Joe Gerhardt (last with Detroit, 1881), Jack Gleason (St. Louis), Jack Leary (Alleghenys), and rookie Tom McLaughlin playing significant roles. Catcher Ed Whiting was brought in from the Orioles.

Chris Von der Ahe, owner of the Browns, apparently offered each of the Louisville players a new suit if they could sweep the Athletics late in the season. They won the first two, but Guy Hecker’s wild pitch cost the Browns the pennant and the Eclipse some new threads.

COLUMBUS



The Colts were my favorite city’s first major league outfit. They were managed by Horace Phillips, the former Athletics manager who had sought to organize the AA before being fired by his players. The team featured four rookies: Field, Kuehne, Valentine, and pitcher Ed Dundon, the majors’ first deaf player.

The rest of the regulars had all played in 1882: Kemmler (PIT), Smith (LOU), Richmond (CLE), Wheeler (CIN), Mann (PHA), Brown (BAL), and Mountain (WOR).

PITTSBURGH



Despite playing 21 more games, the Alleghenys came up eight wins short of their 1882 output. Their field also partially flooded in June, which helped facilitate Columbus’ rout described earlier; the soggy conditions hampered outfielders.

John Peters fell off the cliff at age 33, batting just 28 times. He would have an even shorter stint with the team in 1884 and depart the scene for good. Peters was replaced at short by Louisville’s Denny Mack, pretty much a replacement-level player. Jackie Hayes (Worcester) took over primary catching duties as Billy Taylor moved to right, filling the position vacated by Ed Swartwood, who moved to first. George Creamer, also last with Worcester, took over at second; Buttercup Dickerson, last with the same late NL franchise in 1881, played center. Two rookie pitchers, Bob Barr (not the presidential candidate) and Jack Neagle, who also spent time with the Phillies and the Orioles in 1883, filled the innings left available by Harry Salisbury’s departure (Salisbury never pitched in the majors again despite being just 28 and having been above average in 1882).

While they weren’t good, this outfit was certainly interesting. David Nemec describes them as “boozing, brawling, bad-ass”. Billy Taylor, Mike Mansell, and George Creamer were suspended August 20 for drunkenness after a game with Louisville. In November, Taylor married a female baseball player despite apparently being warned that the relationship was not destined for the fairy tales. Shortly thereafter, he was arrested after allegedly robbing her ex-boyfriend.

BALTIMORE



The Orioles, as best as I can tell, were the first team ever to turn over their entire group of regulars in a single season. Cliff Blau has unearthed that this was a different organization than the Baltimore AA club of 1882, and the player turnover is an indication of that.

It didn’t really help much (although some of it was probably not by choice), as they improved from .260 to .292. As with the NL Phillies, Baltimore used an abnormally large number of players (29). The AA average was 18 and the Browns were next with 21.

The team's regulars with big league experience were: Kelly (CLE), Stearns (CIN), O’Brien (WOR), Manning (PRO), Say (LOU), Clinton (WOR), Eggler (BUF, 1879, and he returned to the Bisons mid-season), Rowe (CLE), Fox (BSN, 1879).

Leaders and trailers:

BATTING AVERAGE
1. Ed Swartwood, PIT (.357)
2. Pete Browning, LOU (.338)
3. Jim Clinton, BAL (.313)
Trailer: Hardie Henderson, BAL (.162)
Trailing non-pitcher: Dave Eggler, BAL (.188)
ON BASE AVERAGE
1. Ed Swartwood, PIT (.394)
2. Pete Browning, LOU (.378)
3. Mike Moynahan, PHA (.360)
Trailer: Dave Eggler, BAL (.192)
SLUGGING AVERAGE
1. Harry Stovey, PHA (.506)
2. John Reilly, CIN (.485)
3. Ed Swartwood, PIT (.476)
Trailer: Dave Eggler, BAL (.198)
SECONDARY AVERAGE
1. Harry Stovey, PHA (.266)
2. Charley Jones, CIN (.228)
3. Pop Smith, COL (.202)
Trailer: Dave Eggler, BAL (.015)
RUNS CREATED
1. Ed Swartwood, PIT (98)
2. Harry Stovey, PHA (96)
3. John Reilly, CIN (91)
4. Mike Moynahan, PHA (85)
5. Candy Nelson, NYA (83)
ARG
1. Candy Nelson, NYA (169)
2. Pete Browning, LOU (166)
3. Ed Swartwood, PIT (160)
4. John Reilly, CIN (151)
5. Charley Jones, CIN (149)
Trailer: Dave Eggler, BAL (40)
WAA
1. Candy Nelson, NYA (+3.6)
2. Ed Swartwood, PIT (+2.9)
3. Pete Browning, LOU (+2.9)
4. John Reilly, CIN (+2.8)
5. Charley Jones, CIN (+2.5)
Trailer: Joe Battin, PIT (-2.0)
WAR
1. Candy Nelson, NYA (+5.7)
2. Bill Gleason, STL (+4.4)
3. Pete Browning, LOU (+4.2)
4. Charley Jones, CIN (+4.0)
5. John Reilly, CIN (+4.0)
Trailer: Dave Eggler, BAL (-.9)
ARA
1. Will White, CIN (73)
2. Tim Keefe, NYA (76)
3. Bobby Mathews, PHA (82)
4. Jumbo McGinnis, STL (84)
5. George Bradley, PHA (84)
Trailer: Jack Neagle, PIT (151)
WAA
1. Will White, CIN (+5.0)
2. Tim Keefe, NYA (+4.6)
3. Bobby Mathews, PHA (+2.2)
4. Jumbo McGinnis, STL (+2.0)
5. Tony Mullane, STL (+1.6)
Trailer: Hardie Henderson, BAL (-2.9)
T WAR
1. Tim Keefe, NYA (+6.7)
2. Will White, CIN (+6.3)
3. Tony Mullane, STL (+3.7)
4. Guy Hecker, LOU (+2.4)
5. Jumbo McGinnis, STL (+2.3)
Trailer: Hardie Henderson, BAL (-2.9)

My all-star team:
C: Jack O’Brien, PHA
1B: John Reilly, CIN/Ed Swartwood, PIT
2B: Joe Gerhardt, LOU/Bid McPhee, CIN
3B: Hick Carpenter, CIN
SS: Candy Nelson, NYA
LF: Pete Browning, LOU
CF: Charley Jones, CIN
RF: Hugh Nicol, STL
P: Tim Keefe, NYA
P: Will White, CIN
P: Tony Mullane, STL
MVP: SS Candy Nelson, NYA
Rookie Hitter: 3B Arlie Latham, STL
Rookie Pitcher: Ren Deagle, CIN

There’s not enough to decide at first base; I have Reilly at +2.79 wins versus an average hitter, +3.95 versus a replacement level hitter. Swartwood is at +2.88 and +3.91, and each has -1 Fielding Runs by Pete Palmer’s method. That is simply too close to call, and the two decimal places are the very definition of false precision.

I also decided not to decide at second base. Gerhardt is .2 WAR behind McPhee as I figure it, but Palmer has him at +18 fielding while McPhee is only +4. However, the eighteen runs is out of line with most of Gerhardt’s career, and McPhee also has an excellent defensive reputation. You decide which figures you trust and to what extent.

Candy Nelson’s season is worth taking a longer look at. Nelson had begun his career at age 23 with Troy and the Eckfords of Brooklyn in the National Association. For 1873-75, he played for the Mutuals, then did not reappear in a major league until 1878 with Indianapolis, 1879 with Troy, and 1881 with Worcester. In 1882 he played for the Mets in their last season as an independent club. His best season in terms of Palmer’s Batting Runs was +4 in 1879, and he was never a regular in the NL, although he had been with the Mutuals.

The rest of this is a digression into the whys and wherefores of sabermetric methodology, and anyone who is primarily interested in the history part of this can feel free to stop reading. Palmer’s system is not crazy about Nelson’s 1883 season; he comes in at +14 runs offensively, -7 in the field, for a TPR of .9. TPR picks John Richmond of Columbus as the AA’s top player on the strength of 27 Fielding Runs. Leaving that aside, the top batter in the Association is Swartwood at +41 runs.

By my way of figuring these things, Nelson is +34 runs offensively and Swartwood is +37. As you can see, we essentially agree about Swartwood, but diverge by twenty runs on the question of Nelson’s contributions. What could possibly cause such a large difference?

The answer is park factor. Palmer figures park factors for the nineteenth century; I do not. I have gone with the approach that Bill James used in his (original) Historical Baseball Abstract--use the runs scored and allowed in the player’s team’s games as the context in which to evaluate him. There is an argument to be made that this is ideal, but in this case, the pragmatic argument was all it took to convince me to use this approach. Ballparks in the nineteenth century were transient by today’s standards--they were always burning down, being rebuilt, etc. Even if there is a team with a stable park, the rest of the league is changing around it--not just the parks of the other teams, but the identities of the teams themselves.

Park factors require some minimum degree of continuity to work properly. I do not believe that the nineteenth century meets the standard. Thus, if I wanted to figure PFs, I would have to do it on a single-year basis. Then you have to deal with the fact that the sample data is being drawn from a 100-game season rather than a 162-game season. You would have to regress the PFs very heavily. In the end, I think it is a fool’s errand.

In some cases, it may cause distortions. Nelson’s Metropolitans allowed only 405 runs, best in the association--but they also scored just 498 runs, fifth in the association. Some of this may have been due to the park in which they played, but some of it may have just been great pitching (certainly Keefe was in fact a great pitcher) coupled with an average offense. So perhaps Nelson and his offensive teammates are getting an unfair break, being evaluated against an average expected contribution that is too low.

In fact, Palmer’s park factor for the 1883 Mets is 104, meaning a park that inflated seasonal totals by 4%. It is true that New York and their opponents averaged 9.72 runs in the 47 games played at New York and 8.92 runs in the 50 road games. So in this case, it may have been a lousy offense coupled with a great pitcher producing a low number of runs per game.
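As a rough check on those home and road figures, here is a minimal Python sketch. It is not Palmer’s actual method, which involves further corrections. It simply takes the raw home/road ratio and then cuts the gap in half, on the assumption that a team plays about half its schedule at home (the Mets’ 47/50 split is close enough for this purpose). The function names are my own, purely for illustration.

```python
# Figures quoted in the text for the 1883 Metropolitans.
HOME_GAMES, HOME_RPG = 47, 9.72   # runs per game, both teams combined
ROAD_GAMES, ROAD_RPG = 50, 8.92


def raw_park_ratio(home_rpg, road_rpg):
    """Scoring in the team's home games relative to its road games."""
    return home_rpg / road_rpg


def halved_park_factor(ratio):
    """Assumption: about half of a team's games are at home, so only
    half of the home/road gap shows up in its seasonal totals."""
    return (ratio + 1) / 2


raw = raw_park_ratio(HOME_RPG, ROAD_RPG)
pf = halved_park_factor(raw)

# Sanity check: the per-game averages should rebuild the season's total runs.
total_runs = HOME_GAMES * HOME_RPG + ROAD_GAMES * ROAD_RPG

print(f"raw home/road ratio: {raw:.3f}")
print(f"halved park factor:  {pf:.3f}")
print(f"total runs implied:  {total_runs:.0f}")
```

The raw ratio comes out around 1.09 and the halved figure around 1.045, in the same neighborhood as Palmer’s 104. The implied total of about 903 runs matches the 498 scored plus 405 allowed.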

If I figure Nelson’s runs above average with a park factor of 1.04 coupled with the association average of 5.72 runs/game, he is +20, and thus Pete and I only disagree by six runs.
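To see why the choice of context moves Nelson so much, it helps to put the two baselines side by side. The sketch below is only an illustration under stated assumptions: it treats the 5.72 figure as runs per team per game, takes the 97 games from the 47 home and 50 road games mentioned above, and uses function names of my own invention. It does not attempt to reproduce the +34 or +20 figures, which depend on Nelson’s full batting line.

```python
# Team totals and league figures quoted in the text (1883 Metropolitans).
RUNS_SCORED, RUNS_ALLOWED = 498, 405
GAMES = 47 + 50            # home games plus road games
LEAGUE_RPG = 5.72          # assumed to be runs per team per game
PARK_FACTOR = 1.04         # Palmer's 104, expressed as a multiplier


def team_context_rpg(runs_scored, runs_allowed, games):
    """Runs per team per game in the team's own games: the
    team-based context in which the hitter is evaluated."""
    return (runs_scored + runs_allowed) / (2 * games)


def park_adjusted_rpg(league_rpg, park_factor):
    """League average scaled by the park factor."""
    return league_rpg * park_factor


team_ctx = team_context_rpg(RUNS_SCORED, RUNS_ALLOWED, GAMES)
park_ctx = park_adjusted_rpg(LEAGUE_RPG, PARK_FACTOR)

print(f"team-based context:    {team_ctx:.2f} runs per team per game")
print(f"park-adjusted context: {park_ctx:.2f} runs per team per game")
print(f"ratio of the two:      {park_ctx / team_ctx:.2f}")
```

The team-based baseline is roughly 4.65 runs per game, while the park-adjusted league baseline is roughly 5.95. A hitter measured against the higher bar is credited with fewer runs above average, which is the direction of the drop from +34 to +20.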