Wednesday, November 17, 2004

Treating Junk Stat Pollution

Stephen Smith, who does a wonderful job with Future Angels, has charged after a windmill that never needed knocking down, namely, the idea that OBP corresponds to runs scored. He gives it quite a try, but in the end fails to prove what he set out to prove. He observes that offensive efficiency could be measured by the number of plate appearances divided by the number of runs scored (TPA/R). Ideally, of course, all your players would hit home runs each time they got to the plate, so the best case is unity, but the worst case -- impossible to occur in the real world -- is infinity (no runs scored but some non-zero number of plate appearances).

Smith doesn't do any correlation analysis on his efficiency rating, so I thought I would do it for him. I start in 1930 -- an arbitrary cutoff, to be sure, but I didn't want to spend the rest of my night crawling MLB.com, not to mention that some major rule changes occurred around that time (such as the end of calling the modern ground rule double a home run). Here are some correlation figures between three different statistics and runs scored:
Stat      r
OBP     .800
Avg     .735
TPA/R  -.115
So, on-base percentage historically correlates well with runs scored, followed by batting average. TPA/R actually has a negative correlation -- as we would expect, considering a good team should theoretically have a lower ratio than a bad team -- but an extremely weak one. (A negative correlation can still be a strong one; what matters is how far it is from zero. The closer to zero it is, in either direction, the less useful it is.) And this is without any park or league adjustments.
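
For anyone who wants to run the same kind of check, here is a minimal Python sketch of the calculation: Pearson's r between a statistic and runs scored, plus the TPA/R ratio itself. The team-season numbers are invented purely for illustration; they are not the 1930-onward data behind the figures above and will not reproduce them. The function names pearson_r and tpa_per_run are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def tpa_per_run(plate_appearances, runs):
    """Plate appearances per run scored; infinite if no runs were scored."""
    if runs == 0:
        return math.inf
    return plate_appearances / runs


# Invented team seasons (NOT real data): (OBP, plate appearances, runs scored)
seasons = [
    (0.310, 6050, 650),
    (0.325, 6150, 720),
    (0.335, 6200, 760),
    (0.345, 6300, 820),
    (0.360, 6400, 900),
]

obp = [s[0] for s in seasons]
runs = [s[2] for s in seasons]
ratio = [tpa_per_run(s[1], s[2]) for s in seasons]

r_obp = pearson_r(obp, runs)
r_ratio = pearson_r(ratio, runs)
print(f"OBP vs. runs:   r = {r_obp:.3f}")
print(f"TPA/R vs. runs: r = {r_ratio:.3f}")
```

With real data, each tuple would be one team season, and park or league adjustments would have to be applied before computing r.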

Smith later brings up dat ol' debbil Productive Outs, introduced earlier this year in a Buster Olney ESPN column, to back up his claims. I won't bother flaying the value of this statistic, as it's already been ably done by Larry Mahnken at Hardball Times. Mahnken's comment about Productive Outs seems equally apropos of Smith's treatment of the subject, namely, that "making productive outs is not an important part of winning ballgames" and that nobody -- neither Olney nor Smith -- has shown otherwise. If indeed this is what the Angels are teaching their prospects in the minors, the club is systematically wrecking the careers of the "waves of talent" from a farm system David Cameron labeled "the best in the game".

Update: Apparently the boys over at Baseball Think Factory have glommed onto this. Some good reading there; one reader asserts Smith's strikeout calculations are wrong, and also notices that

Simply looking at who scored the most runs is misleading, because one team could have sent many more players to the plate than the other. So let's find a common denominator.
But that's exactly it! High OBP teams strive to send more players to the plate because they make outs at a slower rate! The ratio of runs to plate appearances is irrelevant.

Since he missed this obvious fact at the beginning, the rest of the column dangles on a broken branch.
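
The reader's point can be made concrete with a little arithmetic. As a rough sketch, assume a nine-inning game gives a team 27 outs and that every plate appearance either reaches base (with probability equal to OBP) or makes exactly one out, ignoring double plays, outs on the bases, and extra innings. Under those simplifying assumptions, the expected number of plate appearances per game is 27 / (1 - OBP). The function name below is hypothetical.

```python
def expected_pa_per_game(obp, outs=27):
    """Expected plate appearances needed to use up `outs` outs.

    Simplifying assumption: each plate appearance independently reaches
    base with probability `obp`, and otherwise costs exactly one out.
    """
    if not 0 <= obp < 1:
        raise ValueError("obp must be in the range [0, 1)")
    return outs / (1 - obp)


# Compare a low-OBP team with a high-OBP team under the same 27 outs.
for obp in (0.300, 0.350):
    print(f"OBP {obp:.3f}: about {expected_pa_per_game(obp):.1f} PA per game")
```

The higher-OBP team gets more plate appearances out of the same 27 outs, which is why dividing runs by plate appearances penalizes exactly the thing that produces the extra runs.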

David Pinto also shares his thoughts:
The point of Mr. Smith's article is one Bill James made 20 years ago. Given two teams with the same OBA, the team with the higher batting average will have the better offense. Hits are simply more valuable than walks in advancing baserunners.

I will agree that Productive Outs are not necessarily conducive to scoring a lot of runs (which would hinder the ability to win a lot of ball games), but well-placed Productive Outs in close ballgames CAN lead to wins. It's the difference between a statistically large sample (a whole 162-game season, where 20-50 Productive Outs get drowned out in other stats) and a single game, where 1 run can matter, and your team's ability to run the bases and move runners over can lead to that run when you might otherwise not be able to get it. The problem is that only one type of Productive Out leads directly to runs (those with a guy on 3rd), while any hit can directly score anyone on base (although scoring from 1st on a single is very difficult).
One significant problem with proving or disproving the proposition that POP is a useful stat is that we have no historical data. The statistic comes from the Elias Sports Bureau, and they don't seem to be offering any data from before this year (last time I looked at ESPN, anyway). But like I said, I'm not here to talk about the POP stat; the Mahnken article at Hardball Times pretty much assails that one adequately.
Actually, and I forgot to mention this, Mahnken says that teams with higher Productive Outs scores tend to do worse in close games. Something to think about.
