Friday, November 13, 2009

Another Link and an Interview

Today I'm at home sick, so I have a little time to sit here at the computer. I briefly mentioned a problem I have with the general "Sabermetric Community" when I posted about the argument on replacement players at The Book Blog (an argument that actually stemmed from an unfounded accusation that a bunch of "stupid economists" wrote a flawed paper using Factor Analysis--not really an econometric technique to begin with). However, I think JC Bradbury does a much better job than I do in his interview at Chop n' Change.

Bradbury sums up my thoughts on interactions with this group of people pretty well. The general pattern on many sites is to simply ignore or misrepresent any perceived conflicting view (even if that view isn't actually in conflict with anything). Now, I am not here to claim that these people aren't intelligent, or that this behavior doesn't happen to some extent on both sides of the issue. To the contrary, many of them are very smart people, but with an unfortunate arrogance that I don't understand. In my discussions with top sports economists, any inconvenient truth presented seems to simply be ignored or, as Bradbury puts it, "chastised without heeding the point."

An example is my previous post on discrimination in the NHL. While Phil Birnbaum claims that the book is making "premature accusations", the phenomenon of this discrimination has been documented and studied for more than 20 years in the sports economics literature (if that interests you, see the citations in my previous post). My mention of these papers--supplemented by a sarcastic yet friendly post by sports economist Rodney Fort about making sure to be well read on a subject before heavily criticizing it--went unheard for the rest of the thread. The conversation continued as if this were truly a new problem.

This isn't an isolated incident. I recently read an article over at the Harvard Sports Analysis Collective that essentially looked at a time series of competitive balance. While I think this website is a great learning tool for Harvard students, it amazes me that their resident Harvard statistician allowed this article to be posted, for a couple of reasons. The first is simply that taking the raw standard deviation of wins is problematic when comparing across years: the number of teams and games has changed dramatically over this period, making seasons very difficult to compare directly. In addition, the change in competitive balance has been well documented by Rodney Fort and Young Hoon Lee in a series of papers from 2005 onward (and likely continuing). That DOES NOT mean that further analysis is inappropriate. To the contrary, more inspection is needed. However, presenting work with no reference to, or understanding of, these problems is troublesome. Finally, allowing these students to take others' work on the internet as a given isn't something we would want going on at an institution like Harvard. In fact, the last thing we want is for Harvard graduates and students to participate in what Bradbury calls a "groupthink attitude".
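To make the schedule-length problem concrete: the usual fix in the sports economics literature is to divide the actual standard deviation of winning percentage by the "idealized" standard deviation, 0.5/sqrt(G), that a league of evenly matched coin-flip teams would show over G games apiece. Here is a minimal sketch in Python; the team records are made up for illustration, not taken from any real season:

```python
import math

def noll_scull(win_pcts, games_per_team):
    """Ratio of the actual (population) standard deviation of winning
    percentage to the idealized SD (0.5 / sqrt(G)) of a league where
    every game is a coin flip. Near 1 = close to maximal balance;
    larger values = less balance."""
    n = len(win_pcts)
    mean = sum(win_pcts) / n
    actual_sd = math.sqrt(sum((w - mean) ** 2 for w in win_pcts) / n)
    ideal_sd = 0.5 / math.sqrt(games_per_team)
    return actual_sd / ideal_sd

# Made-up example: the SAME spread of winning percentages,
# evaluated under two different schedule lengths.
win_pcts = [0.650, 0.550, 0.500, 0.450, 0.350]
short_season = noll_scull(win_pcts, games_per_team=82)    # ~1.81
long_season = noll_scull(win_pcts, games_per_team=162)    # ~2.55

print(round(short_season, 2), round(long_season, 2))
```

The identical raw spread of winning percentages signals less balance over a 162-game schedule than over an 82-game one, because a longer season leaves less room for luck--which is exactly why comparing raw standard deviations across eras with different numbers of teams and games is misleading.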

Finally, Bradbury mentions an article at The Book Blog that completely abuses a model developed by John Hakes and Skip Sauer. I had in fact read the article by Tango and was appalled at the misuse of the model myself. I began writing a response explaining the difficulty of extrapolating a regression outside its sample, but decided it would simply fall upon deaf ears. At this point, I just don't bother. It seems that others who do not look upon Tango as some sort of cult leader have given up as well.
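For readers who haven't seen the extrapolation problem in action, here is a deliberately toy illustration (the numbers are invented and have nothing to do with the actual Hakes-Sauer model): a straight line fit by ordinary least squares to a curved relationship looks fine inside the range of the data, then fails badly once you predict far outside it.

```python
def ols_fit(xs, ys):
    """Ordinary least squares for a simple line y = a + b*x."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = ybar - b * xbar
    return a, b

# Invented curved relationship: y = x**2, but observed only for x in 0..5.
xs = list(range(6))
ys = [x ** 2 for x in xs]
a, b = ols_fit(xs, ys)

in_sample = a + b * 4    # predict at x=4, inside the data; true y is 16
out_sample = a + b * 10  # extrapolate to x=10, far outside; true y is 100

print(round(in_sample, 2), round(out_sample, 2))
```

In-sample the line predicts about 16.7 where the truth is 16; at x = 10 it predicts about 46.7 where the truth is 100. A regression can be perfectly serviceable over the range it was estimated on and still be badly wrong for observations far outside that range.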

I am extremely excited about a forthcoming special issue of The Journal of Sports Economics that discusses many of these problems in depth, written by some of the most vocal, and most knowledgeable, economists in the field of sport. Hopefully self-proclaimed "subject matter experts" will take some of the implications in the issue to heart. However, my expectation is that none of them will bother to read it.

The current state of The Book Blog reminds me of sitting in MBA economics classes at the Business School here at Michigan. Without understanding that models are deliberately simplified in order to explain market expectations, arrogant students consistently chastise the professor over a single counterexample they ran into at work. The professor's response is always, "Well, of course there are anomalies, but on average X happens," as if he were waiting for it. This statement just doesn't get through to people for some reason, despite its generality. The thing that blows my mind about this entire problem is that Sabermetricians believe they are critical thinkers. These are people who for years had others ignore their opinions or dismiss their silly 'statistics' as based on small sample sizes. Yet, the arrogance continues to blind minds to the fact that economics and Sabermetric study are so interrelated that ignoring basic economic principles can be counterproductive in progressing the science.

While I continue to post things on this blog, I want to reiterate that what I post here IS NOT something that should be taken as science. Most of what I write is a general brain dump, or interesting tidbits and extensions using projects from my statistics classes. I hope to add a monthly disclaimer to ensure this is understood, and to make clear that fostering discussion and well-read arguments is part of my intention. My goal on this site is not to start an online pissing match, or to out-do anyone else. Please see my Introduction for how I think about the things I write. I try to write with the utmost care, but I can make mistakes. I hope they are pointed out in a manner conducive to discussion.

So let's all take a page from Rodney Fort's book, as he says: "Let's all READ MORE."

ADDENDUM: Here's the Rosenthal article...where he supports the idea of sabermetrics and claims their findings have greatly enhanced our understanding...yet gets heavily criticized elsewhere on the internet.


  1. My background is in political science, not economics, so I consider factor analysis a perfectly normal starting point for any statistical discussion. Regardless, I think that you are broadly right about the difficulties of some parts of the sabermetric community. If you come from an academic background, you have a different set of ingrained assumptions about what to do with criticism. As Bradbury mentioned, if you want to challenge the Hakes and Sauer model, you submit a detailed refutation to either the authors or the journal. At that point, the original authors are expected to respond.

    One enormously important thing missing from much of this work is a basic literature review. Start by really knowing the state of the field, and then move on to make some positive contribution to the sum of human knowledge.

    Like you, I blog for fun. This is not my work, but it is interesting, and it saves my wife from having to listen to my thoughts on sports at length.

  2. Thanks for the response sportsPhD. I think the readership is up to 4 or 5 now, haha.

    I absolutely consider factor analysis a reasonable statistical technique. I actually think FA and PCA have been VERY underutilized in the analytic baseball literature (I've only seen it done by Pizza Cutter at Stat Speak, without much follow-up). The JQAS paper I was speaking of was not a very good implementation of it, however. It was written by, I'm assuming, an undergraduate and his advisor. My problem with its presentation on The Book Blog was that it was held up as "another example of economists not knowing what they're doing."

    In general, I wasn't impressed with the paper. It really missed the mark and, as I said, these techniques used proficiently could really help with things like aging and how player "types" differ. One of the 'Factors' they concluded simply represented "ERA", which is an obvious waste--you could just use ERA itself, and ERA isn't that great of a metric to begin with.

    I really do just blog for fun. I enjoy your site as well (hence the link on the sidebar). Not only does it help me to vent on sports, but I'm able to organize my ideas here and come up with new things to do at work. Thanks again for your comments.

  3. I'd suggest that the people who have problems with Tango go over to his site and post questions or refutations there. Of the serious baseball bloggers I am aware of, he's one of the most (if not the most) responsive to people who question his methods and conclusions.

    FWIW, I've also suggested that Tango should submit a refutation in the "primary lit."

  4. Hi, Millsy,

    Never in my post did I argue that discrimination doesn't exist. What I said was that the evidence presented in the newspaper article does not, by itself, prove discrimination, because there are other, plausible explanations for the observations cited.

    Indeed, I am open to the idea that other evidence might show discrimination exists, which you can see if you reread my last few paragraphs.

    My post was not about discrimination. It was about whether the evidence cited proves discrimination. Which it does not, at least not unless combined with other evidence (evidence that did not appear in the article).

    The fact that other academic studies have shown evidence for discrimination is irrelevant. "A implies B" can still be false even if B is true.

    I hate to sound like I'm rubbing it in, but I don't think it's me who "chastised without heeding the point." :)

  5. Millsy, I'd be curious to hear your thoughts on why the sports economists seem to have such trouble interfacing productively with the sabermetric field while the baseball physicists have been able to foster a very mutually beneficial relationship.

    It doesn't seem like sabermetric groupthink and arrogance is really an adequate explanation for the contrasting responses that the two groups of academics have gotten.

  6. Millsy: I appreciate your desire for a tone that is "conducive to discussion." But I'm not sure you've been faithful to that standard here. For example, you mention your contribution to Phil Birnbaum's post on discrimination, and say you "went unheard for the rest of the thread." Sounds pretty close-minded, maybe even disrespectful. But then I follow the link and find this comment from Phil: "Some commenters are assuming that I don't think there's bias at all. That's not the case: I am agnostic on the question. I just dispute the Sirois arguments for it, and I haven't seen any other evidence for it. There might well be good evidence in the papers Millsy cites." And the rest of the thread is actually a nice illustration of how Phil listens to contrary views and is quick to change his mind when presented with contrary evidence.

    Similarly, you cite Tango's critique of Hakes-Sauer. But if you followed that discussion at all, you must know that Tango has pointed out 4 or 5 times that his objection is not the one you infer, that the model fails to correctly value out-of-sample extreme players. Rather, Tango is arguing -- based on his own extensive research comparing salaries to precise measures of player value -- that the H-S salary projections for a range of real players are inconsistent with actual salaries. Now, he may be wrong about that. But he is absolutely not making the argument you attribute to him and various Michigan MBA students.

    So I have to wonder, are you in fact trying to start a "pissing match" in this particular case? Or do you not understand Phil's and Tango's plainly-stated arguments? Or is it something else?

  7. "I am extremely excited about a forthcoming special issue of The Journal of Sports Economics that discusses many of these problems in depth by some of the most vocal, and most knowledgable, economists in the field of sport. Hopefully self-proclaimed 'subject matter experts' will take some of the implications in the issue to heart. However, my expectation is that none of them will bother to read it."

    Well now, that depends. Is there going to be a price point for it less than $23.00, print only (or $25 per article in PDF form)?

  8. Guy,

    If you read my original post about that here, you can see that I do in fact acknowledge Phil's agnostic view. However, the title was 'premature accusations', which is misleading.

    Apparently I have started a sort of pissing match, which was in no way my intention. You may make a very valid point here. I do think, however, that everyone else has done quite a nice job of responding. So thank you for that.

    As for the Hakes and Sauer criticism, the only post I had read before writing my post here was the original one, which I really do think misrepresented the model. Maybe I should have waited. But I also think Tango made viable arguments about the possible problems. Though, I do not think they completely refute the findings, which seems to be the implication made.

    Hey Colin,

    Actually, I think that's a fair point. I do know that the working version of the Introduction to the issue is available, if you are interested. However, I think the really interesting one for everyone to read will be a joint paper by Bradbury and Berri (and likely a contentious one!). I have no idea what to expect there, and I do hope that it's levelheaded.

    I really think the reaction to the Rosenthal article was way too defensive, which is my point here at this site. Perhaps overstated in a way that has rubbed people the wrong way, but that seems to be happening on both sides of an argument that, in my opinion, shouldn't be so contentious to begin with.

  9. I presume that this is what you're referring to:

    I've skimmed it, will read it again before commenting. (And the Bradbury/Berri paper, which as you note is probably the one with the most relevance to this conversation, is referenced but its key points aren't disclosed.)

  10. Yes, that's the one. I think the issue, in general, is one that many sabermetric people could contribute knowledge to, especially when it comes to reliability of data. If there's one thing I wish I had skills for, it's data mining and working with tools like Perl and SQL.

    As for the Bradbury/Berri paper, you're right that it leaves out much of what they plan on talking about. I am extremely curious/excited as to how the paper will be viewed, given that the 2 most opinionated sports economists I know of are teaming up to write it.

  11. Millsy:
    Thanks for the reply. You are obviously trying to find common ground here, which I appreciate. My one additional suggestion is that you try to step back and take a fresh look at The Book Blog and Phil's blog, on one side, and compare them to Berri's and Bradbury's blogs on the other (just as examples of the "camps"). The former are full of open disagreements between frequent posters and the bloggers (and in the case of The Book Blog, even between MGL and Tango). Are there a few fans out there who believe something simply because Tango says it's so? I'm sure such people exist, but I find it hard to believe they are especially numerous. And I know for a fact that they rarely comment on his blog. The community there subjects each member's analysis to scrutiny just as tough as anything the economists receive. And you wouldn't have to search long to find examples of the bloggers acknowledging they were wrong about something, and changing their minds accordingly.

    In comparison, the commentary at Berri's blog is now almost all from fans who have drunk the WOW kool aid. And that's because he generally ignores or bans all contrary opinions. JC's approach is generally the same, though he has engaged his critics more lately. Neither ever concedes error (though both have made plenty). And even if you believe their combative stance regarding us "amateurs" resulted from our arrogance or other sins, I don't see any serious criticism among the sports economists themselves either. Berri and Bradbury often admire each other's work, and seem to have a mutual defense treaty vis-a-vis non-academics; similarly, Sauer's site rarely features criticism of work by those in the community. Maybe this criticism takes place within the priesthood, invisible to us outsiders; but if so, it still doesn't do nearly as good a job at discerning flaws as the on-line process does.


    "I think the very interesting one for everyone to read will be a joint paper by Bradbury and Berri.....I do hope that it's levelheaded."

    Surely you jest!

  12. I will say I have seen some criticism publicly within the circle. Zimbalist questioned Bradbury's book somewhat. Some of his criticisms were fair, and others made me think he didn't read everything carefully. Again, Fort discussed the model on Phil's site as well. Yes, I do think there are people that drink the proverbial kool aid in many realms.

    Within the circle, I think economists in general criticize each other's work with a vengeance, though I guess you're right that we don't see as much of it publicly. This could also be due to professional courtesy of sorts. I think this is what Bradbury refers to when he defers to the review process. A non-sports example of public criticism would probably be Krugman and some of the articles he writes.

    Thank you for pointing out that my hope really IS to find common ground. Perhaps my methods here were harsh and that idea was lost in translation. I would like everyone to know that I DO think there is a wealth of opportunity for all.