Wednesday, September 7, 2011

Link to StatDNA Guest Post

The post is officially up on the StatDNA blog. Go check it out.

As I said in my previous post, this is a very rough and preliminary model. This is why my work was not any sort of formal entry, just some fun with some great data.

I used a Vector Generalized Additive Proportional Odds Model to evaluate the change in win probability for each event listed in the StatDNA data, given the spatial location, the time left in the game, and the score. Things turned out pretty well for this rough version, and the Win Probability Added (WPA) rankings are pretty close to what the EA Sports Index reports at the EPL website. Because I haven't finished the model, I won't release all of the players' WPA from last year. However, I do mention in the post that the players you would expect to be near the top of the list are there.
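To make the idea concrete, here is a minimal Python sketch of how a proportional odds model turns a game state into loss/draw/win probabilities, and how the WPA of an event falls out as the change in win probability. This is not the fitted model from the guest post: the cutpoints, the coefficients, and every function and field name below are made-up assumptions, and a simple linear formula stands in for the additive smooths over location and time.

```python
import math
from dataclasses import dataclass


@dataclass
class GameState:
    score_diff: int      # goals for minus goals against
    minutes_left: float  # minutes remaining in a 90-minute match
    x: float             # ball position, 0 = own goal line, 1 = opponent's


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def linear_predictor(state):
    # Hypothetical stand-in for the fitted smooth terms: a lead matters
    # more as time runs out, and advanced field position helps a little.
    time_weight = 1.0 + 2.0 * (1.0 - state.minutes_left / 90.0)
    return state.score_diff * time_weight + 0.4 * (state.x - 0.5)


def outcome_probs(state, cut_loss_draw=-0.5, cut_draw_win=0.5):
    # Proportional odds: P(Y <= j) = sigmoid(cut_j - eta), with the
    # ordered outcomes loss < draw < win. Cutpoints are illustrative.
    eta = linear_predictor(state)
    p_loss = sigmoid(cut_loss_draw - eta)
    p_loss_or_draw = sigmoid(cut_draw_win - eta)
    return p_loss, p_loss_or_draw - p_loss, 1.0 - p_loss_or_draw


def wpa(before, after):
    # Win Probability Added: win probability after the event minus before.
    return outcome_probs(after)[2] - outcome_probs(before)[2]


# Example: a goal scored with 30 minutes left in a level game.
before = GameState(score_diff=0, minutes_left=30, x=0.8)
after = GameState(score_diff=1, minutes_left=30, x=0.5)
print(round(wpa(before, after), 3))
```

Summing these per-event differences over every event a player is involved in gives that player's season WPA, which is the quantity the rankings are built on.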

The most interesting players to me were Wayne Rooney--who finished lower than one might expect--and the up-and-coming goalie Tim Krul. Given that I'm more of a baseball guy, I was pretty happy with the way these things turned out. A lot of people love Krul, and this analysis seems to support that love.

Anyway, go check it out over there. Below are some fun visualizations which you may find similar to my umpire heat maps or Fangraphs Win Expectancy graphs (which you'll find at the link as well). All in all it was a lot of fun, and I'd like to thank StatDNA for letting me get dirty with the data. If you are interested in soccer, I'd definitely suggest checking them out!


  1. Would you mind sharing this dataset? I'd love to have a dig around...

  2. I would love to be able to share, but cannot. If you are very interested, I suggest emailing someone at StatDNA to see if they are willing to share.

  3. Not much of a sports enthusiast, but I just want to know how this data would be helpful...