Monday, June 6, 2011

A Sabermetrics Prediction Competition at Kaggle?

I ran across this post today at Big Computing (now on the sidebar). I've toyed around with the Kaggle competitions in the past, but haven't really been able to put together a serious entry using only the basic data mining tools available in R. Those tools work great, but there are some serious programmers who develop their own classifiers and prediction tools that outclass anything I can do (especially in my free time).

Anyway, the post mentions a possible sabermetric prediction competition. I know there are plenty of people around here who would have a lot of fun with something like this. If you haven't been to Kaggle before, I highly suggest checking it out. They give out prize money for the top predictive techniques, and they provide the training data along with a holdout test sample that is used for the leaderboard. Most recently, there is the Heritage Health Prize, with the winner getting a multimillion-dollar prize!

They're asking for suggestions, and here is mine:

"I really think using FX data would be a good road for this. For the most fun, it may be interesting to predict whether a single pitch is made contact with or not, given the game state, type of pitch, count, the opposing batter abilities, pitcher ability, velocity, location, etc."

Any other thoughts?

