Tuesday, January 26, 2010

FBJ's Top 100 Keepers

Recently, FBJ had a post about some of the best keepers based on draft position for the 2010 season. Our editor over at Fantasy Ball Junkie then asked the rest of us about putting together a list of the top keepers coming into the 2010 season. I told him I could probably put something together that not only counted the draft slot surplus (places moved up in Median Draft Position from 2009 to 2010, which is an article already posted at FBJ) but also weighted it by the expected gain needed to move up each of the draft slots. For example, moving from #75 overall to #10 overall is most likely better than moving from #165 overall to #100 overall. The methodology is explained in the article here. And the Top 100 list is here. In addition, I thought it would be important to weight by conservative projections for each player so that superb 1-year wonders don't land all the way at the top (for example, J.A. Happ comes in at #53 and Elvis Andrus at #59).

It's important to keep in mind that the Top 100 only includes the Top 200 draft picks for 2010. Any player chosen beyond the 200 mark in the 2010 Mock Draft Central drafts is not included. Assuming a 12-team league, this seemed reasonable to us, since moving from the 300 to the 245 spot isn't much of a talent jump, and there are probably plenty of other guys around there. For this reason, interesting keepers like Randy Wells (315 ADP) are left off in favor of the likes of Kevin Youkilis and Torii Hunter. Any player who moved down in the draft was not included (obviously, since you can get him at a lower price than last year). This fact may make the list somewhat controversial. What's important is to focus on the top 60 or so on the list.

My original inclination was to run a censored/truncated regression on the draft positions. Unfortunately, due to the non-independent (and ultimately zero-sum) nature of draft movement, this wasn't really possible. I'm still looking into more non-parametric ways to do something like this, but they're currently beyond my scope.

I began by creating "10-slot weights" by calculating a matrix of the average differences in talent (based on CHONE and FBJ's player rater) between the Top 10 Picks and the 11-20 Picks, the 11-20 and the 21-30 Picks, and so on for any difference (by group of 10) possible. It should be noted that FBJ's player rater values steals a lot higher than most would, which is one reason we see Michael Bourn so high.

Interestingly, there were some NEGATIVE values when moving up in the draft in terms of expected performance for 2010. Because of this, I just roughly smoothed out the differences for those areas with a local linear regression of those jumps around that point (or, really, a simple average). The higher the difference, the more difficult the move up that far in the draft. Using this, I multiplied the number of draft positions moved up by the given weight. This is a rough score of weighted draft position.

From there, I felt it necessary to also weight this rough score by the player's projected performance for 2010. It's important to also somewhat take into account what you're getting for the price, rather than just what a bunch of other people think. Because the player rater is built to cap around 10 (though, Pujols put up a whopping 13+ last season), I didn't simply multiply by this number, but used "RaterScore^1.5". This way, we can exaggerate the expected performance for better players in order to push small increments up a little bit.
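As a rough illustration of the two steps so far (draft slots gained times the slot weight, then scaled by RaterScore^1.5), here is a minimal Python sketch. The function name and the example inputs are made up for illustration; they are not values from the FBJ spreadsheet, and the real weights come from the smoothed 10-slot matrix described above.

```python
def rough_keeper_score(slots_moved_up, slot_weight, rater_score):
    """Weighted draft-position score, scaled by projected 2010 performance."""
    # Step 1: draft positions gained times the smoothed 10-slot weight.
    weighted_position = slots_moved_up * slot_weight
    # Step 2: raise the rater score to the 1.5 power so better players
    # get more than a proportional boost.
    return weighted_position * rater_score ** 1.5


# Hypothetical inputs: 10 slots gained, a weight of 0.5, a rater score of 4.
print(rough_keeper_score(10, 0.5, 4.0))  # 40.0
```

With the 1.5 exponent, a player projected at 9 on the rater counts for 27/8 = 3.375 times as much as a player projected at 4, rather than the 2.25 times a straight multiplication would give.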

From there, I was torn. Using this simple calculation left me with Joe Mauer down at the 60th-ish spot for keeper value and Justin Upton at only #16, while Michael Cuddyer came in at #12 and Kurt Suzuki came in at #19. I'm not convinced that's accurate. In order to further correct this, I got a little subjective. Since each player in the order 1-300 should be harder and harder to jump, given a certain performance, I used the following to divide the resulting score above:

(CurrentMDP/300)*(MDPJump/PreviousMDP)

In other words, my further scaling weight is the product of 2 ratios. First is the 2010 Median Draft Position divided by 300 (the censored cap for draft position). The other is the increase in MDP from 2009 to 2010, divided by the 2009 MDP. At this point, I wish I had more justification for this (and there may be a little better way, I'm still tweaking things). This gave me the scores shown over at FBJ, and they seem reasonable. However, I think they allowed the likes of Chase Utley (#29, going from MDP of 12 to MDP of 4) and Mark Teixeira (#58, going from MDP of 11 to MDP of 8) to jump up too high. On the other hand, I think it properly weights Mauer and A-Rod (who came at a HUGE discount for a lot of people last year) as some of the better keepers around. Anyway, I'd love to hear any thoughts anyone has on this, or a more methodological way of doing things. This was a lot of just sticking things together, and I think I got lucky that it turned out so well, with the following Top 20:

1. Mark Reynolds
2. Aaron Hill
3. Michael Bourn (remember, FBJ rater overvalues steals)
4. Matt Kemp
5. Justin Upton
6. Adam Lind
7. Joe Mauer
8. Alex Rodriguez
9. Kendry Morales
10. Zack Greinke
11. Troy Tulowitzki
12. Ben Zobrist
13. Chris Carpenter
14. Tommy Hanson
15. Prince Fielder
16. Felix Hernandez
17. Nyjer Morgan
18. Gordon Beckham
19. Andrew McCutchen
20. Jacoby Ellsbury
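
For anyone who wants to play with the scaling weight described before the list, here is a minimal Python sketch of (CurrentMDP/300)*(MDPJump/PreviousMDP). The function names are my own, and I'm treating MDPJump as the 2009 MDP minus the 2010 MDP (spots moved up); the rough score passed in would be whatever the earlier steps produced.

```python
MDP_CAP = 300  # censored cap on draft position


def scaling_weight(previous_mdp, current_mdp):
    """Product of the two ratios: (CurrentMDP/300) * (MDPJump/PreviousMDP)."""
    mdp_jump = previous_mdp - current_mdp  # spots moved up, 2009 to 2010
    return (current_mdp / MDP_CAP) * (mdp_jump / previous_mdp)


def keeper_score(rough_score, previous_mdp, current_mdp):
    """Divide the rough score by the scaling weight."""
    return rough_score / scaling_weight(previous_mdp, current_mdp)


# The two examples mentioned above:
utley = scaling_weight(12, 4)     # (4/300) * (8/12), about 0.0089
teixeira = scaling_weight(11, 8)  # (8/300) * (3/11), about 0.0073
```

Both of those weights are tiny, and since the rough score is divided by the weight, a small divisor pushes the final score up. That may be part of why players already drafted near the very top climb the list even on a jump of only a few spots.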

One last thing to keep in mind: this does not take into account likely breakouts or significant steps forward. Honestly, I think a guy like Yovani Gallardo is in for a large jump in ability. He didn't crack the Top 100 on this list because he only jumped from an MDP of 115 to about 96. Given the talent from 96 to 115, Gallardo's conservative projection, and the fact that roughly 19 spots just isn't that much that late in the draft, he doesn't get a huge score based on these calculations. The calculations also ONLY take into account the price paid in 2009, and DO NOT take into account the future possibility of keeping a player (in which case, I would say younger players are much better). This is ONLY a 2009 to 2010 keeper ranking.

Monday, January 25, 2010

Update on My Blogging

For the 2 people who read this blog on a regular basis (one being myself), I'm sure you've noticed the drop in posting of late. Work has really gotten the best of me here at the office, and it's better for me to devote my time to being a graduate student than to put forth extensive analysis on a blog. I do find this endeavor rewarding and helpful in coming up with ideas for my work, but it ends up taking away from my other responsibilities.

For this reason, I have decided to focus on my research at school, with my online focus being Fantasy Ball Junkie posts for now (I will cross-reference them here) and maybe have a 'Weekly Commentary' on sports and sports business issues here, with much less thoughtful analysis. I'd absolutely recommend keeping up with FBJ, especially given the upcoming baseball season and fantasy drafts. I enjoy working with fantasy a little more and that's what blogging is about for me: fun. Blogging about the same things I work on in the office ends up feeling like more work.

I know, I know: how terrible that I have to keep thinking about sports all day. But really, it can take some of the fun out of what I really enjoy about sports competition itself. And I don't really like posting half-assed analysis when it comes to statistics. Putting forth a full effort takes a lot of time, and I don't want to be in the business of publishing something that isn't a full effort. I will continue to lurk around saber and sports econ websites, but likely won't be as active. I'll provide links to really cool stuff I see going on, and try to comment on news stories with some sound logic to combat the crazy mainstream media writers. This could change once the summer months come along, as I have a little more time then.

Monday, January 18, 2010

Martin Luther King Day

Given that it's Martin Luther King Day, I figured I'd give everyone direct links over to Sports PhD's articles on "Men Who Changed Baseball", highlighting some of the first African-American players in the game. This is a great series for a blogger to be writing, and I honestly think it should be picked up by something mainstream. Anyway, the links are below:

1. Jackie Robinson

2. Larry Doby

3. Hank Thompson

4. Monte Irvin

5. Sam Jethroe

6. Minnie Minoso

7. Bob Trice

I'd also like to link Verdun2's Blog (a regular commenter at Sports PhD). His blog often looks at historical baseball stories, but he also has an interesting post on a little-known catcher, Charles Thomas.

Wednesday, January 13, 2010

Revisiting the 2009 NFL Season: An EXTREMELY Observational Analysis

In a past post, Guy mentioned his discontent with using RSD as a measure of balance at the 9-game point in the season for the NFL. If you think back to the coverage after the Patriots' crushing of the Titans, announcers were claiming that the NFL was out of whack and the balance in the game was decreasing. I suggested that probably wasn't the case to any significant extent, and it seems like the end of the season played out pretty evenly (I can't even count how many teams were in the AFC Wild Card hunt). I decided to put together some random facts about this season vs. past seasons in the NFL in order to see if there really is a balance change going on.

I started with the average points per game (per team) from 2005 to 2009. Why? Just to get an idea of the scoring environment in each season:

2005: 21.48
2006: 20.66
2007: 21.69
2008: 22.03
2009: 21.47

Of course, this doesn't tell us anything about balance. I was actually surprised to see how little difference there was between these seasons. What about point differentials? First, I took the standard deviation of the difference between Points For and Points Against across teams for each season (the mean is, of course, ZERO):

2005: 91.39
2006: 92.53
2007: 116.32
2008: 107.21
2009: 119.31

2005 and 2006 definitely stick out here. As for 2009 compared to the previous 2 seasons, I don't see anything special. To add to this, I calculated the mean and standard deviation of point differentials using the individual game scores, rather than team totals (mean, with SD in parentheses):

2005: 11.6 (9.61)
2006: 11.43 (8.81)
2007: 12.47 (9.50)
2008: 12.22 (9.54)
2009: 12.97 (10.37)

I didn't run any statistical tests on this small sample; however, 2009 definitely does stick out somewhat here, not only in the mean score differential, but also in the variability we saw across the season. I'm not sure how to test this myself, but I'm curious whether this is a product of uneven competition or of style of play in football. Could it be that the pass-first mentality is increasing, while running is used only after a lead? If this is the case, I think we would expect teams with early large leads to maintain those leads through the end of the game, then run out the clock but not score later, leaving us with large gaps in scoring, but not much difference in the average score per team per season. I'm just thinking aloud here, and I'm not really sure how things would play out in the results for the season. Next, I counted the number of games that were decided by two touchdowns or more over the time period:

2005: 75
2006: 76
2007: 86
2008: 88
2009: 87

I followed this up by simply counting the number of games where the winning team scored more than 28 points (or more than 4 touchdowns+XP), and then the number of games in which the losing team scored less than 14 points (or less than 2 touchdowns+XP).
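
These counts are easy to reproduce from a list of final scores. Below is a minimal Python sketch, assuming each game is stored as a (winner points, loser points) pair and that "two touchdowns or more" means a margin of at least 14. The function name and the sample scores are made up for illustration; they are not real NFL results.

```python
def count_lopsided_games(games):
    """Count blowouts, high-scoring winners, and low-scoring losers.

    games: iterable of (winner_points, loser_points) pairs.
    """
    counts = {"margin_14_plus": 0, "winner_over_28": 0, "loser_under_14": 0}
    for winner_pts, loser_pts in games:
        if winner_pts - loser_pts >= 14:  # decided by two touchdowns or more
            counts["margin_14_plus"] += 1
        if winner_pts > 28:               # winner scored more than 28
            counts["winner_over_28"] += 1
        if loser_pts < 14:                # loser scored less than 14
            counts["loser_under_14"] += 1
    return counts


# Hypothetical scores, purely to show the function in use.
sample = [(31, 10), (17, 14), (35, 21), (13, 10)]
print(count_lopsided_games(sample))
# {'margin_14_plus': 2, 'winner_over_28': 2, 'loser_under_14': 2}
```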

Greater Than 28 Points for Winners (losers):

2005: 89 (9)
2006: 92 (12)
2007: 116 (14)
2008: 123 (17)
2009: 112 (13)

Less Than 14 Points for Losers (winners):

2005: 117 (30)
2006: 117 (28)
2007: 109 (29)
2008: 105 (26)
2009: 115 (22)

I see a couple of possible anecdotal takeaways from this. First, it seems to have become increasingly difficult to win if you do not score 14 points or more. At the same time, many more winning teams are scoring more than 4 touchdowns than in 2005/2006. Here's an aggregate of the two, where I counted the number of games that had both a winner scoring more than 28 points and a loser scoring less than 14:

2005: 30
2006: 28
2007: 36
2008: 44
2009: 45

So, as expected, we see the same increasing gap that we saw before. This is all just some observational data, and I'm not trying to make any conclusions from it. But right now, it doesn't seem like there should be significant concern about this season in particular; rather, it looks like some sort of structural change occurred after the 2006 season. Below is the Standard Deviation of Win Percents for the NFL:

2005: 0.193
2006: 0.181
2007: 0.208
2008: 0.207
2009: 0.201

Again, 2007 seems to be the first season in which we see a significant change in the apparent balance of the league. So if announcers wanted to blow their whistles, perhaps that would have been a better time to do so. I'm still not all that worried, though. Perhaps it's simply a change in the tails of our team talent distribution. We could evaluate this using the expected tail probabilities if all teams were of equal strength...or the probability that we would see teams outside 2 Standard Deviations (in this case, below 0.250 and above 0.750) from the mean win percent (0.500, of course). Below I list the percentage of teams that actually fell outside that range in each season:

2005: 12.5%
2006: 15.6%
2007: 18.8%
2008: 12.5%
2009: 18.8%

It's tough to decipher anything here, but it looks like we have more in the tails than we should. For perspective, we would expect about 4.6% of the teams to be in the tails based on random chance alone in a perfectly balanced league. Of course, these numbers are affected by the number of games played head-to-head, but in a relative sense, I think they are interesting to look at.
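
Here is a minimal Python sketch of that tail check, for anyone who wants to extend it to other seasons. It assumes a 16-game schedule, so the standard deviation of win percent for a league of equal-strength teams is 0.5/sqrt(16) = 0.125, which is where the 0.250 and 0.750 cutoffs come from. The expected share uses a normal approximation, and the function name and sample win percents are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def tail_share(win_pcts, games=16, n_sd=2.0):
    """Share of teams more than n_sd idealized SDs away from 0.500."""
    ideal_sd = 0.5 / sqrt(games)  # SD of W% if every game were a coin flip
    low = 0.5 - n_sd * ideal_sd
    high = 0.5 + n_sd * ideal_sd
    outside = sum(1 for w in win_pcts if w < low or w > high)
    return outside / len(win_pcts)


# Share expected outside 2 SD by chance alone (normal approximation).
expected_share = 2 * (1 - NormalDist().cdf(2.0))  # roughly 0.0455

# Hypothetical win percents, just to show usage (not a real season).
print(tail_share([0.125, 0.250, 0.500, 0.8125]))  # 0.5
```

Note that a 4-12 team sits exactly at 0.250 and is not counted as outside the range here, matching the "below 0.250 and above 0.750" wording.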

The way I see it, one of three things could be happening:

1) The overall parity in the league decreased in 2007 and has been similar since. 2007 was the Patriots' big year, so we would probably expect to see something going on that year.

2) The structure of the game has changed and when teams are way ahead, they are much more able to keep that spread (and this may not matter which team is ahead). In addition, given the high variance, teams that are in close games are more likely to stay close. Perhaps if the Titans were up by 28 on the Patriots in the first quarter, they would have continued scoring, while the Patriots would not, even given the same talent. Why would this happen? I don't know. But it is a possibility.

3) Nothing has happened. Given the small sample I show here, the changes could be simply random in the scheme of a larger sample of NFL history.

In the end, this little look is not nearly enough data to really make any conclusions. I'm short on time, but anyone willing to look further back would be welcome to do so (I invite anyone interested to look, and possibly add more comparisons of different measures). However, I think it's important to note that if announcers wanted to raise significant concern over the balance in the NFL, they should have done it when the Patriots went 16-0 in 2007.

For reference, below are the end-season RSDs for the years in question. These really don't raise much concern with me over the general dispersion of winning for 2009. Again, we see a slight change in 2007, but within the last 3 years, the 2009 RSD is lower (more balanced W%) than in the previous 2 seasons. Perhaps some of it is a result of the Colts losing their last 2 games. But it's tough to really make a conclusion about whether or not they definitely would have won them, so I'll leave it as is.

2005: 1.540 (different from what I had before, but this seems to be correct)
2006: 1.447
2007: 1.661
2008: 1.658
2009: 1.612
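
For anyone unfamiliar with RSD, here is a minimal Python sketch, assuming the usual definition: the actual standard deviation of win percent divided by the idealized standard deviation for a league of equal-strength teams, 0.5/sqrt(games). The function name is my own. Plugging in the rounded win percent SDs listed earlier in this post gets within rounding error of the RSDs above.

```python
from math import sqrt


def rsd(sd_win_pct, games=16):
    """Ratio of the actual SD of win percent to the idealized SD."""
    ideal_sd = 0.5 / sqrt(games)  # 0.125 for a 16-game season
    return sd_win_pct / ideal_sd


# Rounded SDs of win percent from earlier in the post.
sd_by_season = {2005: 0.193, 2006: 0.181, 2007: 0.208, 2008: 0.207, 2009: 0.201}
for season, sd in sd_by_season.items():
    print(season, round(rsd(sd), 3))
```

An RSD of 1.0 would mean the spread in records is no wider than what coin-flip games would produce; the further above 1.0, the less balanced the league.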

Tuesday, January 12, 2010

Fantasy Ball Junkie: Fantasy Balance and League Policy

I have a new post over at FBJ discussing the use of competitive balance metrics in fantasy to evaluate changes in league rules. The example talks specifically about dumping, but it can really be applied to any league policy change that's supposed to support parity and competition. I'll repeat here, as I did in the comments at FBJ, be careful of small sample sizes. Before making a decision on changing a rule, make sure you have sufficient evidence that it didn't help things.

The site has begun to pick up, and our editor has put up some interesting visualizations. It's something I had hoped to do here, but don't have the time (and, ultimately, after seeing Beyond the Boxscore's "Diamond View" graphics, I have no clue how to compete). Let the fantasy keeper season begin!

Thursday, January 7, 2010

A Non-Sports (But Internet Discussion-Related) Link

Over at ECONJEFF, Jeff Smith has posted my now favorite comic of all time. I don't have much to comment on this, but wanted to link away. Busy day today, with the first day of some classes, prospective faculty lunches, and presentations to attend.

Tuesday, January 5, 2010

End-Season Effort in the NFL and Other Interesting Things I Missed Commenting on During My Break

I'm still coming back from my break, so I don't have any sort of fully organized post on this topic. I have some links and short comments for now. I hope everyone had a great holiday season. It's football playoff time, which is always exciting. Unfortunately, that means we're closer to the depressing lull of post-football/pre-baseball season where the sports and fantasy front is as bleak as ever (unless you're in a fun keeper league).

1. After the losses by the Indianapolis Colts (royally pissing off their fans when Manning was benched), the NFL announced that they will be looking into ways to incent effort at the end of the NFL season, even from teams that have already clinched a division. This is an extremely interesting problem. However, I don't think there's any long-term reason to worry about this. NFL teams (especially good ones) sell just about all of their tickets before the season. I don't think fans are going to be upset enough by one or two meaningless games that they no longer purchase season tickets IF the team is winning the division (and also winning in the playoffs). So from the home fan perspective, I think it's a pretty minor worry.

I've seen some interesting 'solutions' to this problem, and I am all for trying to see if a team will go 16-0, or continue to be competitive in the season. Obviously, this type of play has effects on other teams in the league and can affect playoff positions disproportionately. THIS is where the real problem lies for the NFL. Some have suggested allowing for an extra home game the following season if there is a Week 17 win this season. While this very likely could incent the teams to win in Week 17, it would have drastic effects on the following season. The NFL prides itself on balancing its schedule from one season to the next. If you're a better team, you have a tougher schedule the following year. Home Field is a huge advantage in the NFL, and given that the most likely winner of Week 17 is the better team, this would seem to have adverse effects on what the NFL wants in terms of competitive balance. On average, the better teams would be getting more home games, making it even tougher for the lower level teams to catch up. I'm curious what the NFL has in mind though. It's tough to 'make' teams risk injury and future wins/significant revenue so that a Week 17 game could be competitive. I think fan interest in the NFL is, at best, marginally affected by these games in the scheme of things.

2. Mike Leach sure got himself into a pickle with his troublemaking, son-of-an-analyst player down at Texas Tech. I think this was a complete screw up on both sides, but I'm pretty convinced Texas Tech administration had it out for Leach to begin with. The concussion is really irrelevant to the story in my opinion (also agreeing is Skip Sauer at The Sports Economist). The mention of it over and over again is just a red herring. The kid was in the 'training and equipment room' and then the 'film room'. Essentially, Leach put him in the corner just like your parents did when you were bad as a kid. James' father handled this like a common pushy parent, and I'm awfully surprised ESPN let him air a statement during a bowl game last week. It's amazing how childish parents can act when it comes to sports (I experienced it when I coached High School Baseball for just one year!). Leach will find a job somewhere, and he'll probably be paid accordingly. He should consider himself lucky he doesn't have to live in Lubbock, Texas anymore. I feel like he is in line for some money through the lawsuit he files against TT as well. His behavior was a bit pompous, but I have no doubt that James was not emotionally damaged by the incident. He was on the sidelines of their bowl game smiling and conversing with some other players.

3. Boise State beat TCU in the "Other National Championship Game". In all seriousness, it was a great game, and just about what I expected. If only Cincinnati and Texas had lost that last week! I think Texas would have some trouble with either of the Fiesta Bowl teams. In the end, the BCS got the National Championship right, which should be a great defensive game that will be a lot closer than most seem to think. Then again, I thought last year's game would total over 85 combined points, so what do I know.