Friday, March 12, 2010

Fact or Faith?

I was a philosophy major in college. It was great - I had to write a lot of papers, but there usually wasn't a right answer I had to memorize or compute. Just write what you think, and support it. Pretty good prep for blogging, actually. The other thing I liked about it was the debates in class, and how open-minded the professors were (though I'm sure they'd heard all the same arguments from every class for God knows how many years). I came into college with a fairly open mind, and that experience reinforced it.

One of the courses I took was in the philosophy of science. It was fascinating to see how regularly a new discovery would shock the world, only to have the new paradigm become entrenched until the next discovery shook things up. Astronomy, physics, medicine, etc. have been expanded over and over, right up to the very recent past. I believe that we are presently in the midst of that journey, not at the end. There will be new discoveries, though we can't conceive of what they might be. But they will turn the world upside down again. So I look at scientific understanding as it is in 2010 not as the truth, but rather as something that is a very strong approximation of the truth. Plug the right numbers into the formulae, and you'll get highly functional answers, things you can use to develop technological breakthroughs and carry the world forward. But there is always the possibility of exceptions to the rules.

I carried that attitude into my passion for baseball. The more I learned about the work being done in advanced statistics, the more I wanted to master those new numbers and use them to enhance my understanding of the game. The folks who grew up treasuring RBI, W and Fielding% are going to die out, replaced with a generation that sees deeper value in OPS, VORP and UZR. There's no going back to the pre-sabermetric world.

But the new-school thinkers are starting to fall a little too deeply in love with their numbers. It's leading to some new prejudices. The 2 that weigh on me most prominently are these: A hitter can't be very good unless he has a good walk rate, and a pitcher can't be very good unless he has a good strikeout rate.

I'll focus on the pitching side today. The fixation on K/9 stems from a desire to find consistency across a pitcher's career. ERAs fluctuate wildly from year to year, even for pitchers we understand to be really good. It must be something random that causes the variations: the capriciousness of BABIP. K/9, BB/9 and HR/9 don't have anything to do with defensive range or positioning and happen to correlate very well from season to season. So a formula was created to normalize BABIP, and eventually, HR/flyball: xFIP.

This stat is loaded with assumptions (pitcher skill isn't really inherent in HR/flyball) and arbitrary constants (HR get multiplied by 13, BB by 3, and K by 2, with the strikeout term subtracted; the whole shebang gets divided by innings pitched, and then a constant of about 3.2 is added in order to make it look like something close to ERA). But it gives a much more satisfying correlation in year over year performance, so it seems to work pretty well. And it gets used a lot when evaluating pitchers and projecting what sort of performance they're capable of in the future.
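For anyone who wants to see the arithmetic, here is a rough sketch of the formula as it is commonly published: the weighted sum of HR, BB and K is divided by innings pitched, and a constant is added to put the result on the ERA scale. The 3.2 constant, the 10.5% league HR/flyball rate and the stat line at the bottom are all assumptions for illustration (the real constant and league rate shift from season to season, and some versions fold hit batters in with the walks), so treat this as a sketch rather than anybody's official calculation.

```python
def fip(hr, bb, k, ip, constant=3.2):
    """FIP-style estimate: weighted HR, BB and K per inning, plus a constant.

    The constant (assumed 3.2 here) only shifts the result onto the ERA scale.
    """
    return (13 * hr + 3 * bb - 2 * k) / ip + constant


def xfip(fly_balls, bb, k, ip, league_hr_per_fb=0.105, constant=3.2):
    """Same formula, but with the pitcher's actual HR swapped out for the HR
    he would have allowed at a league-average HR/flyball rate.

    league_hr_per_fb is an assumed placeholder, not a real league figure.
    """
    expected_hr = fly_balls * league_hr_per_fb
    return fip(expected_hr, bb, k, ip, constant)


if __name__ == "__main__":
    # Hypothetical reliever season, not a real pitcher's line.
    print(round(fip(hr=5, bb=20, k=60, ip=70), 2))
    print(round(xfip(fly_balls=60, bb=20, k=60, ip=70), 2))
```

Notice that the pitcher's actual home run total never enters xfip at all. That is exactly the assumption being questioned here: two pitchers with the same flyballs, walks and strikeouts get the same number, no matter how many of those flyballs left the park.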

I recently read a piece by Rob Neyer in which he asserted that none of Jon Rauch, Matt Guerrier, Jesse Crain, Jose Mijares, Clay Condrey or Pat Neshek was a good bet to finish with an ERA much above or below 4.00 this year. They haven't demonstrated the skills necessary to do it. In other words, when you plug their numbers into the xFIP equation, you get something around 4.00, therefore these guys don't have what it takes to put up ERAs around 3.00 on a consistent basis.

Funny thing, though - these guys do put up ERAs around 3.00 on a consistent basis. Here are their ERAs, compared to their xFIPs, for each season they've been full-time relievers (minimum 20 IP):

There are 23 eligible seasons among the 6 relievers. xFIP managed to peg the ERA within half a run in just 5 (22%). In the remaining 18, xFIP is off by 0.61 runs or more. And in only one of those 18 seasons did it underestimate the actual ERA. That means that, with respect to these particular pitchers, xFIP has overstated their actual ERA by at least 0.61 runs in 17 of 23 seasons, or about 3/4 of the time.
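The percentages follow directly from the season counts quoted above. A quick check, using only those counts (the variable names are mine):

```python
# Season counts as stated in the paragraph above.
total_seasons = 23
within_half_run = 5      # xFIP landed within half a run of the actual ERA
underestimated = 1       # xFIP came in below the actual ERA

missed = total_seasons - within_half_run   # seasons off by 0.61 runs or more
overestimated = missed - underestimated    # misses where xFIP was too high

share_close = within_half_run / total_seasons
share_over = overestimated / total_seasons

print(f"within half a run: {within_half_run} of {total_seasons} ({share_close:.0%})")  # 22%
print(f"overstated ERA:    {overestimated} of {total_seasons} ({share_over:.0%})")     # 74%
```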

Coincidence? Too small a sample? Could be. But it could also be that we're missing the trees for the forest: in applying a stat to the whole galaxy of pitchers, we're creating rules that apply to the group better than to particular individuals. I have a hunch that there is a cohort of pitchers out there that balances out the members of the Twins' 'pen. Guys whose ability to prevent runs is consistently overstated by xFIP. I'll try to track a few down and compare them to the Twins group. Maybe it will lead to something important about evaluating pitcher skill that xFIP is missing.

In the meantime, I'd certainly be inclined to be kinda humble about how much xFIP can tell me about a particular pitcher's skills.
