Wednesday, June 16, 2010

The Merits Of A Pitcher's Stats Against One Team

Mike from The Yankeeist has some strong words for the way the media built up the return of Roy Halladay last night:
The problem wasn't the hype itself, but how it was justified. There should be pregame excitement anytime the Yankees face a pitcher who is having the kind of season Doc is having. But the game was sold to fans on the basis of Halladay's career numbers against the Yankees. This kind of sloppy journalism is prevalent in the baseball media, and should be criticized.
I think not citing Halladay's career numbers would have been a far more egregious error. It's a valid storyline. He'd pitched more than a full season's worth against the Yankees in his career and had great results. How do you not bring that up?

Of course, as Mike goes on to explain, even though Doc had thrown a ton of innings against the Yanks, the numbers he compiled don't really mean anything:
The baseball media frequently cites a player's career performance against a given team to provide insight into how that player should do right now against that same team. This makes no sense. Sticking with the current example, Roy Halladay has been logging time in the AL East since 1998. How, exactly, do his numbers against Scott Brosius or Jason Giambi help understand what he can be expected to do when he faces the Yankees in 2010? The answer, of course, is that they can't, but baseball struggles to grasp this.
In addition to the Yankee lineup fluctuating, the defense behind Halladay has constantly shifted, the turf underneath him in Toronto has changed, he's thrown to tons of different catchers and, most importantly, he's evolved as a pitcher. Essentially, what you are looking at when you try to analyze Halladay's career stats against the Yankees (or any pitcher's line against a certain team) is a bunch of very small samples, recorded over a very long time, and smushed together to look like one big one. And, by the way, looking at his career numbers against the Yankees tells you the same thing that looking at his overall numbers tells you - Roy Halladay is, like, really awesome at pitching.

But there are two parts to this - the validity of the stats and how they are being used. Mike calls it "sloppy journalism", but that strikes me as a sort of hollow, straw man argument. It's not like it's factually incorrect. Halladay has owned the Yankees and you can bet that the players in the clubhouse are keenly aware of it. Why then shouldn't the media talk about it? It doesn't have predictive value, but no one I care to read was trying to predict what the outcome of the game would be anyway. It was one of those things that was ubiquitously noted because, well, it was worth noting.

Did anyone guarantee that Halladay was going to turn in a great game? I'm not aware of such a proclamation. I think most Yankee fans sort of braced for impact (like Mike's co-author at the Yankeeist, Larry, did) simply because after seeing it happen so many times, it's a natural reaction.

So why do we fans have the desire to read about games before they happen? Why do we bother to write a preview for all 162 of them here? It's rare that something you read beforehand will manifest itself in the game in a meaningful way, isn't it?

In general, we are all probably a little anxious for the game to start and reading about it helps pass the time. We love the team and find that harnessing some of that anticipation and reading up on an impending game helps us look forward to it.

More specifically, when I write a preview or read one that someone else has written, it's because I want to have an understanding of any trends and storylines coming in and try to develop some sort of a framework that will make what's unfolding on the field a little more coherent and interesting to me. Some of those things might be statistical, but the personalities and rivalries and the who-owns-who are compelling in their own way, even if they don't pass the statistical smell test.


  1. Yeah, it's no surprise that the "vs. team" stats are as useless as they are ubiquitous. I think that ultimately they fall into the "interesting" category, rather than the "insightful" category. The fact that Roy Oswalt, for example, is 1,000,000-2 against the Reds in his career is interesting. I.e., it's notable that it happened even if it doesn't say a word about how he'll do against Joey Votto.

    If anything I would call it slightly lazy journalism, rather than sloppy.

  2. We are definitely on the same page, Ted, and I think that last part is an important distinction.

    Lazy is reaching for the low-hanging fruit, sloppy is referring to an apple as an orange. I won't pretend to have read everything that was written leading up to the game, but I'm guessing anyone who refers to themselves as a "journalist" (along with plenty of good bloggers who don't) was doing the former as opposed to the latter.

  3. I agree it's definitely something that should fall in the "interesting" category and be left at that. But the Yanks since 1998 have been winning at a .600 clip (1177-763), so I think the interpretation of his record against the Yanks should be taken as "he's a damn good pitcher", not "he is a lock to shut down the Yanks lineup".

  4. Well said, Jay, and appreciate you taking the argument a step further and breaking it down.

    To be completely transparent, I fully expected Halladay to dominate the Yankees last night, even though, as Mike pointed out, they have occasionally had success against him. Unfortunately the sheer number of times he's tossed a one- or two-run gem has, for whatever reason, a deeper bearing in my memory than the Yanks teeing off on him. Probably because the latter doesn't happen particularly often.

    Of course, the reason we all tune in regardless of the numbers is because none of us know what's going to happen any given night. Anyone who tells you they predicted Halladay would give up 6 ER to the Yankees -- for the first time in 10 years -- is lying, and given his historical results, the safe money would've been on Halladay at the very least authoring a quality start. Funny how baseball works sometimes.

  5. First, I want to thank Jay for linking to my post. I believe that the greatest compliment a blogger can pay one of his peers is linking to his work. It sends a chill down my spine every time I see a link.

    Regarding the topic, regardless of whether it's sloppy or lazy to use a player's career numbers versus a single team as part of a preview or piece of analysis, it is misleading, and totally commonplace in baseball. My point with the post was only to show how, with a little work for any of these big games, all members of the baseball media - and I most certainly include myself - could do a better job, but consistently don't put in the effort.

    Not to beat a dead horse, but YES loves pointing out that the Yankees own the Orioles. This is true, but it's misleading. The O's have a bunch of rookie pitchers, and there is little reason for us to own these guys, just because we owned the guys throwing in 2005.

    This is precisely what has happened with the Rays. In 2005, I believe, ARod had 10 RBI in a game against the Rays, but we wouldn't use his career numbers against them (probably awesome) to forecast how he'll do the next time they play the Yankees.

    My general point, and I may have failed to make it as well as I could, was to draw attention to a lack of attention to detail in a lot of baseball analysis that sets up the big storyline to fail. In this case, there was evidence to suggest the Yankees would get to Halladay, but it was ignored, probably to promote the more exciting story. In general it is using stats in ways that are misleading. Referencing a player's career totals against a team is, I believe, the worst offender, but there are numerous examples. I just took an opportunity to post about something that has bugged me for a while.