Wednesday, October 28, 2009

Umpire Advancement and Opportunity Cost

I'm still in link mode. Been super busy then got sick on top of it. I've been fiddling with the pitcher data for the Hall of Fame, similar to the hitters from before, but have run into some problems with inconsistency in the predictions. It may not be feasible, given the significant structural change in how pitchers are used over time. Anyway, here's a link over to Sabernomics, where JC Bradbury has a really neat new study about umpires, their pay structure, and how to ensure MLB has the best ones at any given time. It's a lot to think about, and really interesting.

Thursday, October 22, 2009

Pitch Classification Algorithms...NEAT!

Busy day, so I'll send you over to The Baseball Analysts, where Moore describes his algorithm for pitch classification. A really cool technique to get this Pitch F/X data to give us answers about pitchers. This kind of thing is really interesting!

Wednesday, October 21, 2009

Not-So-Great Moments in Not-So-Objective Research

As I'm sure you've noticed, I've been pretty busy lately. Today, I'm going to link you over to The Sports Economist, where Phil Miller links a new paper claiming there is discrimination against women in college sports.

First, I want to say that I think women should have every chance that men have to play sports. If there's a female that can hit home runs consistently off of Tim Lincecum, I want to see her in MLB. If there's a girl that can dunk on LeBron, I want to see her on the Lakers. But discrimination is a tricky thing, and the researchers in this paper seem to completely miss the ball to the point of absurdity.

We can divide discrimination into a few categories from a labor market perspective: consumer discrimination (which I mentioned in my previous post on hockey), co-worker prejudice (similar to consumer but with coworkers not wanting to work with certain groups), employer prejudice (the employer just plain not liking women), and statistical discrimination (sort of like how car insurance companies charge higher premiums--this is NOT prejudice, but gets very convoluted and is often misinterpreted).

So let's check out each one of these.

Statistical Discrimination: This one probably brings us to market forces and the laws of supply and demand. If people are more willing to pay to see men play (is it totally outlandish to think that most fans find the men's games more exciting to watch?) then the prices are going to be higher. The historical statistics tell us this is true.

This also happens in hiring, where employers know the productivity of females is more likely to be hindered because of pregnancy, etc. This can cause problems, however, and create a vicious circle in which employers' beliefs are simply self-sustaining. I think we could debunk this ticket-price 'vicious circle' by simply looking at attendance at men's and women's games and understanding that low attendance at a women's game won't increase with price (unless women's basketball has lots of things in common with potatoes in Ireland).

Employer Prejudice: Well, this doesn't really apply if you ask me. The entire basketball team is a women's team and proving something like this is really difficult. We could look at coach salaries between men's and women's basketball, but this gets us into the same argument about supply and demand. In fact, there's precedent for this at USC and the female coach lost. The idea was that she and the men's coach had the exact same duties and responsibilities, worked just as many hours, but she got paid way less. Her frustration is understandable, but the market in which she coaches is much smaller, resulting in less revenue to pay her with. She also claimed she got fewer benefits than the men's coach (that one is a little sketchy). She lost the case. (Here's another link at USC)

Consumer Discrimination: This could be a very real phenomenon. However, disentangling it from just plain preferences for playing style or quality of play is difficult. Perhaps it's a completely misguided evaluation by fans, but I don't think we can know the answer to that unless the women's and men's teams play one another in the same league.

Co-Worker Discrimination: I'm not sure how this would apply. If you count the men's team as co-workers and they're prejudiced against the women's team (definitely not unlikely), then fine. But there IS a women's team, and the men have no bearing on ticket prices of the women's games. So I think, at least in the discussion of ticket prices, this point is moot.

Most likely, if it's any of these, it's consumer discrimination. There very well could be some of that, but the suggestion to charge higher ticket prices for the women's games will just result in fewer fans. So that's stupid. As for bringing down men's prices, that would supply the department with less money to subsidize the women's team. Charging low ticket prices does not necessarily CAUSE people to de-value the sport (though the authors apparently cite studies that show this effect can happen--I'm not convinced that is going on here). The causation is probably reversed: low interest causes athletic departments to charge less in order to get more butts in the seats.

Stacey Brook has a discussion here

As for whether or not a non-profit should act like one in order to support all groups equally, that could be up for ethical/moral/political debate. I'm not convinced ticket prices should all be the same. That will simply leave athletic departments with much less than they had before. Football and basketball programs at many large NCAA schools support the entire athletic department. The women's teams actually benefit from this. The whole thing seems to be backwards.

Hat Tip: The Sports Economist

Tuesday, October 20, 2009

Discrimination in NHL?

A journalist recently raised some eyebrows reviewing a new book by Bob Sirois that claims he has found significant evidence for discrimination against French-speaking players in the NHL. I have heard this claim in the past, and have read some economic literature on the topic. While this seems to be news to the likes of Phil Birnbaum at Sabermetric Research Blog, people have been claiming this for over 20 years now with some evidence. I'm a little surprised people find this book release so shocking.

Phil doesn't believe the evidence presented in the review of the book, and with good reason. While I know very little about Sirois, the findings presented don't seem to be too satisfying (at least in a statistical sense). BUT, I'm sure he does shed some new light on the topic. Below is my comment at Sabermetric Research Blog, and if this is a topic that interests you, I'd suggest reading the papers listed at the end:

"I've seen a few papers on this topic coming out of the sports economics literature that seem to confirm the idea that French speaking players really may be discriminated against (at least in terms of salary paid when performance is supposedly even).

The Kahane paper below tries to find if there is inefficiency in that discrimination practice. It's pretty interesting. This would, of course, be expected if there is in fact discrimination taking place.

An older paper I read from 1992 attempts to explain differences through body size and defensive style of play, but I'm not sure about that (as I know nearly nothing about hockey). While generally a mixed bag of findings, the 'discriminatory' area seems to be defensemen.

The 2003 Longley paper attributes the discrimination to customers in certain areas (not that this type of discrimination makes it more right for the hockey teams if it exists).

Here are the ones I know of:
Longley (1995, 1997, 2003)
Kahane (2005)
Krashinsky (1997, for another view)
Walsh (1988, 1992)
McLean (1992)"

If you have any sort of university search engine, those papers should be easy to find. They're from Canadian Public Policy, Journal of Economics and Sociology, Review of Industrial Organization, and Industrial and Labor Relations Review.

I guess the main difference to see about the newest Longley paper is that it attributes the discrimination to the customers, rather than the firm. If customers get less enjoyment from seeing French-speaking players on their team, then the firm has incentive not to hire those players. Unfortunately for the team, that's not a good excuse for discriminating. It's the same as only hiring white people at your upscale restaurant because you know your customers are rich racist people.

Anyway, it's an interesting topic and I'm surprised Birnbaum seems so unconvinced of the idea. There's also discussions over at The Book Blog and In the end, I don't know who is actually right. I suspect a book with 'statistics' by a former hockey player has some not-so-great arguments, but might give an interesting in-depth look at any possible direct actions taken against French-speaking players.

As suggested by one author, it could be a style of play excuse. Hockey is much more interconnected than baseball, and communication or playing style could have significant effects on how a team as a whole performs. Then again, the Kahane study seems to find pretty explicit evidence against that.

UPDATE: Phil commented that he's not against the idea of discrimination. Just wanted to make that clear. I tried to present the fact that I agree with some of the critiques of the 'evidence' provided in the book review. But he does mention that the accusation is premature, which I have to disagree with given the long history of papers investigating this topic.

NEW LINK: The Sports Economist

Monday, October 12, 2009

Rotisserie Scoring and Reality

Today I ran across this article at a random site I found on a Google search. The author's claim is that Roto leagues are just not representative of real baseball, and that Head-to-Head leagues are infinitely better. Each league type has its ups and downs, of course. He offers a number of arguments for his claim, but the problems he raises could generally be addressed by simply adding categories to a traditional rotisserie league. He seems to miss the fact that you can penalize batter strikeouts simply by adding a category for it and reversing the roto scoring. While I agree with some points, I think others aren't very well thought out.

But this got me thinking: How well do the traditional 5x5 rotisserie categories represent which teams are actually the best?

Now, I imagine people have done this before, but I figured I'd just check it out for the recently wrapped up 2009 MLB season. I took all 30 MLB teams and ranked them in each of the 10 traditional Roto categories (R, SB, AVG, HR, RBI, ERA, WHIP, W, SV, K), giving each team a Roto-style discrete score out of 30 in every category. Finally, I took a simple correlation between each team's actual 2009 wins and its projected number of fantasy points under traditional rotisserie scoring. For all of MLB, the results seem pretty straightforward, as I expected:

Corr(Roto Points, 2009 Wins): 0.9276
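For anyone who wants to try the ranking step themselves, here is a minimal sketch of the procedure described above. It is not my original spreadsheet: the function names are made up for illustration, the four-team numbers are invented toy data, and giving tied teams the average of their ranks is an assumption about how ties get handled.

```python
import math

# Categories where a lower value is better (assumption: only ERA and WHIP).
LOWER_IS_BETTER = {"ERA", "WHIP"}

def roto_points(values, lower_is_better=False):
    """Give each team 1..N points in one category (N = best).

    Tied teams share the average of the ranks they occupy.
    """
    n = len(values)
    # Order team indices from worst to best in this category.
    order = sorted(range(n), key=lambda i: values[i], reverse=lower_is_better)
    points = [0.0] * n
    pos = 0
    while pos < n:
        end = pos
        while end + 1 < n and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = ((pos + 1) + (end + 1)) / 2  # average of the tied ranks
        for k in range(pos, end + 1):
            points[order[k]] = shared
        pos = end + 1
    return points

def total_roto_points(stats):
    """stats maps category name -> list of team values (same team order)."""
    n = len(next(iter(stats.values())))
    totals = [0.0] * n
    for cat, values in stats.items():
        for i, p in enumerate(roto_points(values, cat in LOWER_IS_BETTER)):
            totals[i] += p
    return totals

def pearson(x, y):
    """Plain Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical toy data for four teams (NOT real 2009 numbers).
toy_stats = {
    "R": [800, 750, 700, 650],
    "HR": [200, 210, 150, 160],
    "ERA": [3.80, 4.10, 4.50, 4.50],
}
toy_wins = [95, 88, 75, 70]

toy_totals = total_roto_points(toy_stats)
print(toy_totals, round(pearson(toy_totals, toy_wins), 4))
```

With the real data you would pass all ten categories for all 30 teams, so each category would hand out 1 to 30 points.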

To ensure there isn't a problem with the discrete-type scoring in the calculation of the correlation here, I calculated a Composite Z for each team just the way I presented in my guest post over at the Fantasy Ball Junkie and did the same correlation:

Corr(Composite Z, 2009 Wins): 0.9449
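A rough sketch of what a Composite Z calculation can look like is below. Treat it as an assumption rather than the exact method from the guest post: it simply sums each team's z-score across categories, flips the sign for ERA and WHIP, and uses the population standard deviation. The names and toy numbers are hypothetical.

```python
import statistics

# Categories where a lower value is better (assumption: only ERA and WHIP).
LOWER_IS_BETTER = {"ERA", "WHIP"}

def z_scores(values):
    """Standardize one category: (value - mean) / standard deviation.

    Uses the population SD; a category where every team is identical
    would divide by zero and needs special handling.
    """
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]

def composite_z(stats):
    """Sum each team's z-scores over all categories.

    stats maps category name -> list of team values (same team order).
    """
    n = len(next(iter(stats.values())))
    totals = [0.0] * n
    for cat, values in stats.items():
        sign = -1.0 if cat in LOWER_IS_BETTER else 1.0
        for i, z in enumerate(z_scores(values)):
            totals[i] += sign * z
    return totals

# Hypothetical toy data for four teams (NOT real 2009 numbers).
toy_stats = {
    "R": [800, 750, 700, 650],
    "ERA": [3.80, 4.10, 4.50, 4.60],
}
toy_cz = composite_z(toy_stats)
print([round(z, 3) for z in toy_cz])
```

Unlike the discrete 1-to-30 points, the z-scores keep the size of the gaps between teams, which is why the two measures can disagree a little even when they correlate at about .98.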

It should be noted that the correlation between the Composite Z and Roto Scoring in these leagues is about .98 pretty consistently. It still has the ability to find some anomalies in a standard roto format, however, so don't count it out as not useful (and it can be adjusted in creative ways to find new information). From this, it seems like the Roto Leagues are a pretty good representation of the outcomes required for a good regular season in MLB. Of course, that's not all there is to the game, and managing weekly matchups is a lot of fun, too. I prefer a combination of the two: rotisserie style head-to-head matchups. We have to keep in mind as well that a correlation doesn't tell us a very complete story, but it gives a good idea of the relationships between outcomes in fantasy baseball vs. "real" baseball. It doesn't really tell us the underlying structure or causes of anything, including 'run scoring'.

One of the other things I was curious about was how well the $$ Values (combined pitching and hitting) from Fangraphs correlated with actual wins for the 2009 MLB season. I don't actually know how the $$ values are calculated, so take them with a grain of salt. Below is what I found:

Corr(Total $$, 2009 Wins): 0.8260

Interesting! Here's the disclaimer, though: while the $$ values are theoretically measuring 'true talent', the roto categories are measuring actual outcomes from the season. The actual outcomes should usually correlate higher with the Wins, as they inherently contain the true outcome within them (especially in the W and R categories included in a Roto system). Well what about Pythagorean Record? Can we out-perform that with our Roto categories? Let's see:

Corr(Pyth Rec., 2009 Wins): 0.9051

Corr(Pyth Rec., Roto): 0.9109

So, we see that the roto points actually correlated more highly than the Pythagorean Expectation that so many people hold dear as a quick estimator of expected wins. Of course, the Roto points are biased due to the W category. But with that, I think it's safe to say that using the Roto points is as good as using a Pythagorean Expectation to predict wins for a team overall in MLB. The difference between these correlations is negligible.
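For reference, the Pythagorean Expectation is the usual runs scored squared over the sum of runs scored squared and runs allowed squared. Here is a minimal sketch, assuming the classic exponent of 2 and a 162-game season; some people prefer an exponent closer to 1.83, so it is left as a parameter. The run totals in the example are hypothetical.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage: RS^x / (RS^x + RA^x)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)

def pythagorean_wins(runs_scored, runs_allowed, games=162, exponent=2.0):
    """Expected wins over a season of `games` games."""
    return games * pythagorean_win_pct(runs_scored, runs_allowed, exponent)

# Hypothetical team: 800 runs scored, 700 runs allowed.
print(round(pythagorean_wins(800, 700), 1))
```

Computing this for each team and correlating it with actual wins is all the Corr(Pyth Rec., 2009 Wins) line amounts to.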

Is fantasy scoring perfect? Absolutely not. But it seems to gauge a fairly realistic level of output that we would want from pretending to be Major League Baseball general managers. The real fun is trying to predict what these players are going to do. We should use true talent measures developed by all of the projection systems out there to do that (in fact, averaging them all together sounds like your best bet). But wait! What about comparing the correlations to point-scoring leagues? Isn't the point here to show that H2H leagues aren't really significantly better at representing true baseball outcomes?

Well, here you go (based on standard CBS Sports H2H Points-Based Leagues):

Corr(H2H, W): 0.9351

So it seems like the point system here does fairly well at representing what would happen on a baseball field. I'd still be willing to argue that the correlation difference is negligible, though. I would also imagine including certain categories in a roto league, as I mentioned earlier, would help to close the extremely small gap in the relationship.
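One rough way to put a number on "negligible" is a Fisher z confidence interval around each correlation with n = 30 teams. This is only a sketch: it assumes roughly bivariate normal data, which rank-based roto points may not satisfy, and it treats each correlation on its own even though they all share the same 30 win totals. The function name is just for illustration.

```python
import math

def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% confidence interval for a Pearson correlation.

    Transforms r with atanh, adds and subtracts z_crit standard errors
    (SE = 1 / sqrt(n - 3)), then transforms back with tanh.
    """
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

# Correlations with 2009 wins reported in this post, 30 MLB teams each.
reported = [("Roto", 0.9276), ("Pythagorean", 0.9051), ("H2H points", 0.9351)]
for label, r in reported:
    low, high = fisher_ci(r, 30)
    print(f"{label}: r = {r:.4f}, approx. 95% CI = ({low:.3f}, {high:.3f})")
```

With only 30 teams the intervals are wide (roughly 0.85 to 0.97 for the roto number), and each one contains the other two estimates, which fits with calling the differences negligible.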

So, in sum, I think it's safe to say that both league types give relatively representative outcome measures of a fantasy team's true relative strength in the long run. However, I'm not sure that the W/L records in H2H leagues are totally representative of the true talent on each team. If standings were based on points only, that would be one thing. However, as I mention in my guest post at Fantasy Ball Junkie, the H2H season consists of a collection of very small sample sizes and asymmetrical scheduling. In those cases, H2H falls far short of most Roto formats, and there are incredibly large point swings in the new CBS scoring format that I really didn't have any fun with this year. All in all, this is simply a matter of preference, and rotisserie isn't necessarily 'just for chicken.'

NOTE: I would actually also be interested in doing this by AL/NL and Division Roto Rankings. Unfortunately, I'm swamped for the next couple months and don't have time to separate it out 2 more times. It may be fun as an exercise for someone's curiosity, but I don't suspect finding anything too striking (maybe some adjustments, given teams only play their own-league counterparts outside of the short 'interleague play' weeks).

Sunday, October 11, 2009


It seems like this week(end) everyone has been talking about clutch hitting. It's mostly in response to JC Bradbury's response to Bill James' article "Underestimating the Fog". JC seems to think that James' article is misleading. I think he does make some good points about hypothesis testing, and as he says, it's something people wrestle with in academia all the time. But, early on in statistics or critical thinking classes, you're generally taught that you can't prove something doesn't exist. It seems fairly intuitive that, given past research on the subject, it may not be a very fruitful topic to look into at this point. My thoughts are the following (as I post at Sabernomics):

"I think the problem is that to find evidence for clutch hitting (even if there is such a thing), there would have to be some sort of inefficiency in managing strategies between the two clubs. If we assume there is a clutch hitting skill, then why wouldn’t there be a clutch pitching skill (I think you allude to something of that sort in your book, JC).

If that’s the case, then both managers should be optimizing their clutch from the defensive, as well as the offensive POV. Under this assumption, the results (or statistical data) would be a wash and there shouldn’t be significant evidence in the data for clutch hitting. If a manager doesn’t put in their so-called ‘clutchy’ guy when the other manager has in their ‘clutchy’ pitcher, then he isn’t optimizing his strategy. Unfortunately, for those trying to discern a ‘clutch’ skill, the only thing that would happen by putting in the ‘clutchy’ hitter is arriving back at the expectation that we originally had for the event.

But that doesn’t necessarily mean it’s not a repeatable skill. If, for some reason, there isn’t clutch pitching, then perhaps the idea of clutch hitting would be more manageable. Even if it were the case that it exists, I can only imagine it is quite tiny and probably not of interest in payroll as Jim states above. I think this ‘fog’ is just too thick for us to really find anything, and what could be found probably isn’t all that worth finding."

The main idea here is that, if managers are acting rationally to optimize their chances of winning in clutch situations, then we aren't likely to see a difference in the way the game plays out, on average. Why shouldn't there be clutch pitching? Someone like Mariano Rivera is referred to as a clutch pitcher. The problem is he's also just plain nasty. When a manager puts in a pinch hitter because of his clutchiness, the manager on the opposite side should be doing the same thing. This should just leave us with what we expected to happen in the first place, and doesn't leave much room for detection of anything. Any edge that a manager gets from putting in a hitter he thinks is 'clutchy' would just be offset by the other manager putting in his 'clutchy' pitcher--assuming the clutch factor is a constant increase in advantage across clutch players (or close enough to that). So while we don't detect anything, it could be an important part of the in-game strategy.

With that said, I don't think there's much value in searching for the clutch factor. I think some players likely don't perform quite as well under pressure, but I'm fairly convinced that they get weeded out early on. There's always immense pressure to perform at the MLB level. If you can't handle pressure, you're not likely to be in the bigs. That's not to say that at a given point, a player in MLB doesn't fail due to pressure constraints. But on average, I don't think it's going to be anything influential over the course of the season. I'd be willing to bet that being 'unclutch' is a possibility from a psychological standpoint at lower levels of play. Unfortunately, I don't think we have the data to truly analyze that, and I don't think it's ever a reason to put a hitter in that, on average, isn't as skilled as the one already at bat.

Given a clutch situation, I'd love to have Derek Jeter at the plate. He's a damn good player and if I'm faced with that situation as a manager, I want my best player out there. If I had a choice between Jeter and Pujols, Bonds or A-Rod, well I wouldn't have to ponder too much. I'll take the guys that make hitting Major League pitching look like child's play.

Addendum: I forgot to add some links about the discussion. Below are some other discussions, as well as a 'quick study' over at Sabernomics that gives a quick look at the 'effect' we might be trying to find.

Quick Study at Sabernomics

Discussion at Sabermetric Research Blog

The Book Blog's Blurb
Note: I'm not sure why Tango thinks this isn't a 'practical problem'. Of course it is. It's trying to find the optimal way to manage, which makes it extremely practical (though probably not all that useful in the end). Just because teams don't currently rely on it, doesn't mean the motivations behind trying to find it aren't 'practical'. Maybe I'm being nitpicky with semantics, but this seems to be a point that Tango is really interested in making for some reason.

Thursday, October 8, 2009

Addicted to Baseball?

Sorry for the very scarce posting lately. I'm actually working on research projects with 3 different professors and taking 2 and a half classes right now. It hasn't made for much time here. I had a post about my fantasy season, but decided that it was simply not interesting to anyone but me, so I had trimmed it down to players that best helped or hurt my teams this year, and who may be in that position next year. Right before I was about to post, I caught this post over at Fantasy Ball Junkie and I didn't want to be a copycat.

Anyway, this post is in reference to a paper I recently read by Young Hoon Lee. Dr. Lee is an econometrician from Korea who researches sports (especially baseball), among other topics. I've been in contact with him about a few things related to GAUSS, but I haven't had extensive discussions. He's wicked smart, and he's quite a resource for econometric questions. A recent paper he released in 2008 with Trenton Smith (Washington State) discusses economic models of addiction and how Americans may actually be addicted to baseball. The addiction is in the same sense as one would be addicted to anything else. In economic terms, addictive goods are characterized by "increasing marginal utility of consumption". In non-nerd terms, that means the more you buy something, the more fun you get out of it per thing bought. So, in baseball terms, for each game you attend, the next one is even more enjoyable for you than the last. The paper goes into not only economic addiction, but anthropological and psychological theories of addiction and sport fandom as well. If you have academic access, I'd recommend reading it. Here's the citation:

Smith, T. & Lee, Y. H. (2008). Why are Americans addicted to baseball? An empirical analysis of fandom in Korea and the U.S. Contemporary Economic Policy, 26, pp. 32-49.

Apparently, Koreans are not addicted to baseball, while Americans are. This is a topic I'd actually like to get my hands dirty with once I start up my dissertation and possibly extend to other sports. It really sounds like a good tie-in between Economics and Management in sport. I wonder if there are papers about this from the perspective of the participant, rather than just the fan. I know as a player, I would feel withdrawal-type symptoms (not physically, but emotionally) when I had to stop playing baseball. I still miss it, and slow-pitch softball just doesn't fill in that empty gap of competitiveness that baseball does.

Do we have any admitted 'baseball addicts' in our midst?

Thursday, October 1, 2009

NHL and Jim Balsillie

Quick link over to The Sports Economist, as they have a link to the story that a judge has ruled in favor of the NHL, rejecting Jim Balsillie's offer to buy the Phoenix Coyotes. Seems like some serious implications there. I'll leave the discussion of those up to the economists.