Tuesday, May 25, 2010

Testing Out my Pitch F/X Data

I recently got all the Pitch F/X data downloaded from Gameday, and have been fiddling around. I certainly don't have the physics knowledge to really talk about the movement at this point, and I'm still acquainting myself with the data format and what everything is telling me. But I figured I'd test out some plots of Johnny Cueto's outing on May 11th this year.
Many thanks to Dr. Alan Nathan, Brooks Baseball, and especially Mike Fast for providing so much information on the Gameday database. I'm still working on figuring out how to parse the data with Mike's script using XAMPP. If anyone has any experience with this, please let me know. I have the database set up already in SQL, I just need to get phpMyAdmin to link everything...or whatever it does. MySQL is easy enough (and from reading some of Mike's stuff, works great with R), but I'm not a computer whiz so I don't really know how all these things work together.
The first plot below shows the called balls and strikes for Cueto, each plotted at the pitch's location. Red is a strike, black is a ball. Pretty straightforward. Looks like he got at least one call way inside on a right-hander (from my reading, this is the catcher/umpire POV). The strike zone height is the average strike zone for all the batters faced in the game, while the width is simply the width of the plate.
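For anyone curious how that box gets drawn, here is a minimal Python sketch of the idea, not the code behind my plot. It assumes each pitch record carries a horizontal location `px` and height `pz` in feet, plus per-batter zone limits `sz_top` and `sz_bot`; those names and the sample numbers are assumptions for illustration, and the ball's radius is ignored.

```python
from statistics import mean

PLATE_WIDTH_FT = 17.0 / 12.0  # home plate is 17 inches wide


def average_zone(batter_zones):
    """Average per-batter (sz_top, sz_bot) pairs into one plotting rectangle."""
    top = mean(t for t, _ in batter_zones)
    bottom = mean(b for _, b in batter_zones)
    half = PLATE_WIDTH_FT / 2.0
    return {"left": -half, "right": half, "top": top, "bottom": bottom}


def in_zone(px, pz, zone):
    """True if the pitch's center crossed the plate inside the rectangle."""
    return (zone["left"] <= px <= zone["right"]
            and zone["bottom"] <= pz <= zone["top"])


# Made-up zone limits (feet) for two hypothetical batters, not real game data.
zone = average_zone([(3.4, 1.6), (3.6, 1.4)])

# Made-up called pitches as (px, pz, call).
called = [(-0.2, 2.5, "strike"), (1.1, 2.4, "strike"), (0.3, 4.0, "ball")]

# Called strikes that landed outside the averaged rectangle.
outside_strikes = [p for p in called
                   if p[2] == "strike" and not in_zone(p[0], p[1], zone)]
```

Filtering for called strikes outside the rectangle is a quick way to pick out gift calls like the inside one mentioned above.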

So how about pitches? Let's check out what he's been throwing (based on the Gameday classification, which some have said isn't the best yet). Plus signs are fastballs, X's are two-seamers, circles are changeups, diamonds are sliders. Looks like he got the calls outside the zone on his two-seamer...not surprising, since a right-hander's two-seamer would tail that way after beginning closer to the edge of the plate. Obviously, this is too small a sample to draw that conclusion, though.

Anyway, just playing around here. Hopefully I can put together some fun visualizations for fantasy and perhaps FBJ in the future.

Monday, May 24, 2010

RIP Jose Lima: Forever Living in Fantasy Owners' Hearts

It's been reported that former MLB pitcher Jose Lima died of a heart attack Sunday at only 37 years old. While he didn't have a Hall of Fame career, Lima will always be remembered for a few reasons. First, everyone simply loves to say "Lima Time!". The guy was intense out there, and that's always fun to watch. He had a career losing record, despite winning 21 games in 1999 after putting up 16 in 1998.

Fantasy players will forever remember Jose Lima if for no other reason than the popular strategy named for him: LIMA (Low Investment Mound Aces). It's interesting that this year the LIMA strategy seems to have been a good bet for sitting atop your fantasy league. With pitchers like Phil Hughes, Shaun Marcum, CJ Wilson, Colby Lewis, and others dominating the Major Leagues, the strategic salute to Jose Lima's monster 1999 season should prove to be a long-lived one.

Many players have had nice things to say about Lima as a mentor as well, and that's an even more important reason to remember the colorful pitcher. It's a shame that someone who everyone loved watching out on the mound, and someone who provided support for his teammates and colleagues, had to go so early. I send my condolences to his family.

Thursday, May 20, 2010

The Pontiac Silverdome and Other Links

So, this is a little late, but it looks like the Pontiac Silverdome will host 8 events this year. The ownership is still in question, as the auction process that sold the arena (stadium) for 1% of its 1975 construction price seems to have been handled in quite a sketchy manner. But the Canadian group that owns it is at least doing something with it. Given the needed upgrades and the $1.5 million yearly cost of just keeping the thing standing, I'm not sure how they plan on making a profit through Monster Truck Rallies and some concert by an Indian film composer (huh?). But I guess they have some plans. The place was an albatross for Michigan before its fire sale, so at least it's on these Canadians if the thing goes broke now. Still, the city got peanuts for the land and stadium. Seems like that land could be put to better use if they tore the thing down.

Lance Armstrong has again been accused of using performance enhancing drugs by Floyd Landis. Landis is the epitome of a massive prick, and that's coming from me (someone who really doesn't care about steroid use in sports). To all of this, I say: Yeah...probably true. There's plenty of reason to believe Armstrong has used PEDs during his career, beginning with the fact that pretty much everyone does it in cycling. That doesn't mean he DID use them, and it's difficult to prove a negative, but I have my doubts. I had the chance to meet one of Armstrong's former teammates in person, and he admitted his own PED use (he had already done this publicly). Since a lot of people consider cycling a sort of team sport, shouldn't that nullify Armstrong's wins anyway? I mean really. Why isn't this discussed at all?

Friday, May 14, 2010

Fantasy Teams Update

So, this is mostly for myself (so I know where I stand at this point in the season when I look back on it and the moves I make). I would suggest not reading it, but it's nice to know my team looks very good in the Razzball Commentor League Standings as of today:

Sitting at the 7 Spot isn't too bad, considering I have Brad Hawpe sitting around on my DL, and Prince Fielder and Ben Zobrist haven't even started hitting yet.

Anyway, here you go, me.

FBAL Keeper League (20-team Auction Keeper):
1st in Division at 21-11, up 10-6 with a chance to win big this session.

H2H Points Private Pay League (12-team Snake):
1st in Record (6-0), 1st in Points, and 1st in Breakdown Record. Currently winning this session 300-248.

Razzball Commentor League (12-team Snake):
1st place with 95.5 Roto Points (2nd Place has 74)

12-Team Public ESPN (Auction):
1st Place with 89.5 Roto Points (2nd Place has 76.5)

10-Team Public ESPN Champions League "Winners Circle" (Auction):
3rd Place with 68.5 Roto Points (1st Place has 74.5)

10-Team Public ESPN Champions League "Los Angeles" (Snake):
1st Place with 68 Roto Points (2nd Place has 66.5)

10-Team Public ESPN Champions League "Colorado" (Auction):
1st Place with 90 Roto Points (2nd Place has 64.5)

10-Team Public ESPN H2H Categories League (Snake):
Tried an all-reliever strategy here, but have run into some bad luck with hitting injuries (Nelson Cruz and Jimmy Rollins), as well as bad matchups (losing WHIP and ERA categories to sub-1.0 totals by the other team!). Should pick up soon, and public league waiver wires are pretty easy to crush:
3rd in Division at 23-23-4 (1st in Division is 29-17-4)

10-Team Public CBS Roto League (Snake):
Had a slow start, but now 3rd Place with 66 Roto Points (1st Place has 70)

10-Team Public CBS Roto Plus League (Snake):
This ended up being an autodraft team, as I had plans randomly come up. Started out terribly hitting-wise (as I showed in a previous post). Been steadily improving:
3rd Place with 53 Points (1st Place has 61.5)

So, there we have it. Pretty solid start to the season, and I'd argue that my hitters on almost all teams are underperforming right now...but then again I'm biased toward who I like. Adam Lind and Josh Beckett are killing me in my Keeper League, but I've been able to hold on thus far.

Friday, May 7, 2010

FBJ Trading for Sluggers in H2H Post 2

Just linking any readers over to Fantasy Ball Junkie and my latest post where I simulate the expected increase in W% for teams trading up for a better slugger. We actually think this might be on the verge of a new valuation system for H2H leagues, so stay tuned. First, I'll have to go through the rest of the categories to get an idea of how they vary in H2H leagues (and how it changes with different league sizes, etc.). It was really a lot of fun to put together, and I'm not sure anyone has done this before for fantasy. And who doesn't love posting as many R-plots as possible!?!

Anyway, check it out.
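
If you want a feel for the idea without clicking through, here is a rough Python sketch of that kind of simulation. It is not the code from the FBJ post: it assumes weekly team HR totals are independent Poisson draws, the rates (8 vs. 10 HR per week) are invented, and `poisson_draw` and `h2h_hr_win_pct` are hypothetical helper names.

```python
import math
import random


def poisson_draw(rng, lam):
    """Draw one Poisson(lam) count using Knuth's multiplication method."""
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def h2h_hr_win_pct(my_rate, opp_rate, weeks=20000, seed=1):
    """Simulated W% in the HR category; a tied week counts as half a win."""
    rng = random.Random(seed)
    points = 0.0
    for _ in range(weeks):
        mine = poisson_draw(rng, my_rate)
        theirs = poisson_draw(rng, opp_rate)
        if mine > theirs:
            points += 1.0
        elif mine == theirs:
            points += 0.5
    return points / weeks


# Invented weekly HR rates: an even matchup, then the same matchup after
# trading up for a slugger worth two extra HR per week.
before = h2h_hr_win_pct(8.0, 8.0)
after = h2h_hr_win_pct(10.0, 8.0)
gain = after - before
```

The `gain` value is the simulated bump in category W% from the trade. Swapping in a different distribution for weekly totals only means replacing `poisson_draw`.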

Wednesday, May 5, 2010

New Stuff on the Sidebar

I just wanted to let anyone who frequents this site (that's you, Mom) know that I added a few new websites to my sidebar.

The first is called Rational Pastime (which is now my favorite baseball blog name). It's in a similar vein (vane, vain?) to the one in which I began this blog. Unfortunately, I haven't been able to keep up with analyzing news/economic stuff in sport. Rational Pastime does a good job of it, and I hope to start up again now that it's summertime.

The second link is It's All About the Money, Stupid. Certainly not a new blog, but I enjoy the posts over there and do not want to leave it out.

Finally, there's a blog I had never heard of, but certainly should have. It's called "It's a Swing and a Miss". The blog is run by Kristi Dosh, who is apparently a baseball expert. And a "Miss". She has some refreshing looks at baseball and law, and is certainly holding her own in what has been a man's world. Great stuff over there.

Fun with R: Clustering and MDS

I've seen K-means clustering, PCA, etc. done a bit over at Beyond the Boxscore and Baseball Analysts (and the now defunct Statspeak), but I thought I'd just check out some clustering on the young fantasy season using the traditional 5x5 categories with some visualizations. I use a Multidimensional Scaling approach to visualize the data, with colors representing each group. Nothing groundbreaking here, just feeding my intellectual curiosity. Using a simple k-means clustering with 3 groups (for about 170 players with a reasonably large number of At Bats), I got the following averages for each group (through May 2), with my description of each group below (please excuse the crappy Blogger formatting):


Group: R/HR/RBI/SB/AVG
1: 10.5/2.7/11.9/1.1/.274
2: 13.2/1.0/6.7/4.6/.270
3: 16.8/5.4/18.0/1.8/.300
All: 13.0/3.1/12.5/2.1/.281

Group 1 (Black): Average players in Power/RBI without much Speed or Run Scoring
Group 2 (Red): Players that get value through SB/Run Scoring
Group 3 (Green): Top Tier players in the first month with HR/RBI potential
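
For anyone who wants to see the mechanics, here is a minimal k-means sketch in plain Python. It is not the R code behind the numbers above. The six stat lines are invented toy data in R/HR/RBI/SB/AVG order. Two parts are assumptions of mine, not something described in the post: standardizing each category first, and starting from a deterministic farthest-point initialization.

```python
from statistics import mean, pstdev


def standardize(rows):
    """Z-score each column so counting stats and AVG share one scale."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against a constant column
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_init(points, k):
    """Start from the first point, then keep adding the point that is
    farthest from every center chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm: assign to nearest center, recompute means, repeat."""
    centers = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its old center
                centers[j] = [mean(col) for col in zip(*members)]
    return labels, centers


# Invented toy stat lines (R, HR, RBI, SB, AVG), two per intended group.
stat_lines = [
    (10, 3, 12, 1, 0.270), (11, 2, 12, 1, 0.278),  # modest power, no speed
    (13, 1, 7, 5, 0.268), (14, 1, 6, 4, 0.272),    # speed and runs
    (17, 5, 18, 2, 0.300), (16, 6, 19, 2, 0.305),  # top tier
]
labels, centers = kmeans(standardize(stat_lines), k=3)
```

Players who share a label are in the same cluster. The label numbers themselves are arbitrary, so it is the groupings that matter, not which group is called 0, 1, or 2.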

(click on the image to enlarge it)

So, looking at the plot, it seems that going from the bottom left to the top right orders the overall 5x5 fantasy contribution of each player (left to right, the x-axis is correlated with AVG as well, but not as distinctly as HR and RBI). Like I said, nothing surprising here, but fun to look at. Don't worry about the number scale on each axis, as it really means nothing practical other than what I've described already.

So what anomalies do we find? Well, the Prince Fielder and Geovany Soto positions are interesting. Fielder is off to a slow start, so it's not surprising to find him toward the left of the Power/RBI scale. He scores runs in his lineup at a decent rate, which keeps him from being at the bottom of the Run/SB scale. On the other hand, Soto has started hot, but his position toward the bottom of the Cubs lineup hasn't provided him with much opportunity for RBIs. With his lack of speed, he ends up right around the middle as well. Shane Victorino is in a strange place given his skill set, but remember that he's already hit 5 HR this year!

Not surprisingly, the starts of Jason Kendall, A.J. Pierzynski, and Alexei Ramirez put them at the bottom left of the plot. We also find Carlos Lee in the speed group. This is a product of his lack of power to start the season, with 0 HR and 5 RBI. He has stolen a base, however, which keeps him from the very bottom of the y-axis scale.

Just a quick intro to these techniques, as they're a lot of fun to play with when it comes to baseball statistics. Using PCA and a 'Biplot' would probably be more instructive than MDS, but MDS did a great job of showing the separated clusters. We could always try to cluster more groups, depending on how we think it should be done, but I think 3 does the job pretty well.

On another note, I have the first part of an article up at FBJ that describes the variability in HR hitting for different types of players, and how we can utilize this to understand trade value in H2H leagues that count categories. Gotta love implementing R in fantasy articles :-)