Monday, October 12, 2009

Rotisserie Scoring and Reality

Today I ran across this article at a random site I found on a Google search. The author's claim is that Roto leagues are just not representative of real baseball, and that Head-to-Head leagues are infinitely better. Each league type has its ups and downs, of course. He offers a number of arguments for his claim, but most of the shortcomings he describes could be addressed by simply adding categories to a traditional rotisserie league. For example, he seems to miss the fact that you can penalize batter strikeouts simply by adding a category for them and reversing the roto scoring. While I agree with some points, I think others aren't very well thought out.

But this got me thinking: how well do the traditional 5x5 rotisserie categories reflect which teams are actually the best?

Now, I imagine people have done this before, but I figured I'd just check it out for the recently wrapped-up 2009 MLB season. I took all 30 MLB teams, ranked them in each of the 10 traditional Roto categories (R, SB, AVG, HR, RBI, ERA, WHIP, W, SV, K), and gave each team a Roto-style discrete score out of 30 in each category. Finally, I took a simple correlation between each team's actual 2009 wins and its total fantasy points under traditional rotisserie scoring. For all of MLB, the results seem pretty straightforward, as I expected:

Corr(Roto Points, 2009 Wins): 0.9276
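
For anyone who wants to try this themselves, here is a rough sketch of the ranking-and-correlation step in Python. It is not the exact spreadsheet I used: the function names, the tie handling (tied teams split the points), and the three-team toy data are all made up for illustration, so plug in the real 30-team numbers to get anything meaningful.

```python
from math import sqrt

# Categories where a smaller value is better (assumption: ERA and WHIP only).
LOWER_IS_BETTER = {"ERA", "WHIP"}

def roto_points(stats):
    """stats: {team: {category: value}} -> {team: total roto points}."""
    teams = list(stats)
    categories = list(next(iter(stats.values())))
    totals = {t: 0.0 for t in teams}
    for cat in categories:
        # Sort worst-to-best so that position 1 goes to the worst team.
        worst_first = sorted(teams, key=lambda t: stats[t][cat],
                             reverse=cat in LOWER_IS_BETTER)
        i = 0
        while i < len(worst_first):
            # Find the run of teams tied with the team at position i.
            j = i
            while (j + 1 < len(worst_first)
                   and stats[worst_first[j + 1]][cat] == stats[worst_first[i]][cat]):
                j += 1
            shared = ((i + 1) + (j + 1)) / 2  # tied teams split the points
            for t in worst_first[i:j + 1]:
                totals[t] += shared
            i = j + 1
    return totals

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Made-up toy data: three teams, three categories, invented win totals.
toy = {
    "A": {"R": 800, "HR": 200, "ERA": 3.50},
    "B": {"R": 700, "HR": 200, "ERA": 4.00},
    "C": {"R": 650, "HR": 150, "ERA": 4.50},
}
toy_wins = {"A": 95, "B": 81, "C": 70}

points = roto_points(toy)
r = pearson([points[t] for t in toy], [toy_wins[t] for t in toy])
```

With all 30 teams and all 10 categories in `stats`, `r` is the number reported above.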

To ensure there isn't a problem with the discrete-type scoring in the calculation of the correlation here, I calculated a Composite Z for each team, just the way I presented it in my guest post over at the Fantasy Ball Junkie, and ran the same correlation:

Corr(Composite Z, 2009 Wins): 0.9449
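
A minimal sketch of one way to build a composite Z score follows. I am assuming here that the composite is simply the sum of each team's per-category z-scores, with the sign flipped for ERA and WHIP so that lower is better; the guest post may differ in the details, and the function name and toy numbers are made up for illustration.

```python
from statistics import mean, pstdev

def composite_z(stats, lower_is_better=frozenset({"ERA", "WHIP"})):
    """stats: {team: {category: value}} -> {team: sum of category z-scores}."""
    teams = list(stats)
    categories = list(next(iter(stats.values())))
    totals = {t: 0.0 for t in teams}
    for cat in categories:
        values = [stats[t][cat] for t in teams]
        mu, sigma = mean(values), pstdev(values)
        if sigma == 0:
            continue  # every team is identical here, so the category adds nothing
        sign = -1.0 if cat in lower_is_better else 1.0
        for t in teams:
            totals[t] += sign * (stats[t][cat] - mu) / sigma
    return totals

# Made-up toy data, not real 2009 numbers.
toy = {
    "A": {"R": 800, "HR": 200, "ERA": 3.50},
    "B": {"R": 700, "HR": 200, "ERA": 4.00},
    "C": {"R": 650, "HR": 150, "ERA": 4.50},
}
z = composite_z(toy)
```

Unlike the 1-to-30 ranks, the z-scores keep the size of the gaps between teams, which is why it is worth checking that the correlation holds up both ways.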

It should be noted that the correlation between the Composite Z and Roto Scoring in these leagues is about .98 pretty consistently. The Composite Z can still find some anomalies in a standard roto format, however, so don't write it off (and it can be adjusted in creative ways to find new information). From this, it seems like the Roto Leagues are a pretty good representation of the outcomes required for a good regular season in MLB. Of course, that's not all there is to the game, and managing weekly matchups is a lot of fun, too. I prefer a combination of the two: rotisserie-style head-to-head matchups. We have to keep in mind as well that a correlation doesn't tell us a very complete story, but it gives a good idea of the relationships between outcomes in fantasy baseball vs. "real" baseball. It doesn't really tell us the underlying structure or causes of anything, including 'run scoring'.

One of the other things I was curious about was how well the $$ Values (combined pitching and hitting) from Fangraphs correlated with actual wins for the 2009 MLB season. I don't actually know how the $$ values are calculated, so take them with a grain of salt. Below is what I found:

Corr(Total $$, 2009 Wins): 0.8260

Interesting! Here's the disclaimer, though: while the $$ values are theoretically measuring 'true talent', the roto categories are measuring actual outcomes from the season. The actual outcomes should usually correlate more highly with Wins, as they inherently contain the true outcome within them (especially in the W and R categories included in a Roto system). Well, what about Pythagorean Record? Can we out-perform that with our Roto categories? Let's see:

Corr(Pyth Rec., 2009 Wins): 0.9051

Corr(Pyth Rec., Roto): 0.9109
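
For reference, a Pythagorean record can be computed from nothing more than runs scored and runs allowed. The sketch below assumes the classic exponent of 2 and a 162-game season; other exponents (1.83 is a common one) are in use, and the function name is just something I made up for the example.

```python
def pythagorean_wins(runs_scored, runs_allowed, games=162, exponent=2.0):
    """Expected wins from runs scored and allowed (Pythagorean expectation)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return games * rs / (rs + ra)

# A team that scores exactly as many runs as it allows projects to .500.
even_team = pythagorean_wins(700, 700)
# Outscoring opponents pushes the projection above 81 wins.
good_team = pythagorean_wins(800, 700)
```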

So, we see that the roto points actually correlated more highly with wins than the Pythagorean Expectation that so many people hold dear as a quick estimator of expected wins. Of course, the Roto points are biased due to the W category. But with that, I think it's safe to say that using the Roto points is as good as using a Pythagorean Expectation to predict wins for a team overall in MLB. The difference between these correlations is negligible.
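
One rough way to back up the word "negligible" is to put a confidence interval around a correlation using the Fisher z-transformation. This is only a sketch: the function name is made up, the 1.96 multiplier assumes an approximate 95% interval, and because both correlations share the same wins column a proper comparison would need a test for dependent correlations. Still, with only 30 teams the interval is wide enough to make the point.

```python
from math import atanh, sqrt, tanh

def fisher_ci(r, n, z_crit=1.96):
    """Approximate confidence interval for a Pearson r from n paired points."""
    z = atanh(r)               # Fisher z-transform of r
    se = 1.0 / sqrt(n - 3)     # standard error on the z scale
    return tanh(z - z_crit * se), tanh(z + z_crit * se)

# 30 MLB teams; 0.9276 is the Roto-points correlation reported above.
roto_lo, roto_hi = fisher_ci(0.9276, 30)
# The Pythagorean correlation (0.9051) falls inside that interval.
pyth_inside = roto_lo < 0.9051 < roto_hi
```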

Is fantasy scoring perfect? Absolutely not. But it seems to gauge a fairly realistic level of output that we would want from pretending to be Major League Baseball general managers. The real fun is trying to predict what these players are going to do. We should use true talent measures developed by all of the projection systems out there to do that (in fact, averaging them all together sounds like your best bet). But wait! What about comparing the correlations to point-scoring leagues? Isn't the point here to show that H2H leagues aren't really significantly better at representing true baseball outcomes?

Well, here you go (based on standard CBS Sports H2H Points-Based Leagues):

Corr(H2H, W): 0.9351

So it seems like the point system here does fairly well at representing what would happen on a baseball field. I'd still be willing to argue that the correlation difference is negligible, though. I would also imagine including certain categories in a roto league, as I mentioned earlier, would help to close the extremely small gap in the relationship.

So, in sum, I think it's safe to say that both league types give relatively representative outcome measures of a fantasy team's true relative strength in the long run. However, I'm not sure that the W/L records in H2H leagues are totally representative of the true talent on each team. If standings were based on points only, that would be one thing. However, as I mention in my guest post at Fantasy Ball Junkie, the H2H season consists of a collection of very small sample sizes and asymmetrical scheduling. In those cases, H2H falls far short of most Roto formats, and there are incredibly large point swings in the new CBS scoring format that I really didn't have any fun with this year. All in all, this is simply a matter of preference, and rotisserie isn't necessarily 'just for chicken.'

NOTE: I would actually also be interested in doing this by AL/NL and Division Roto Rankings. Unfortunately, I'm swamped for the next couple of months and don't have time to separate it out 2 more times. It may be fun as an exercise for someone's curiosity, but I don't expect to find anything too striking (maybe some adjustments, given teams only play their own-league counterparts outside of the short 'interleague play' weeks).

