Below is a quick table of umpires who were behind the plate for at least 5,000 plate appearances from 2007 through 2010 (counting only games for which Pitch F/X data is available). From the looks of things, the umpire can have just over a two-run effect on the outcome of the game due to his strike zone. (ADDENDUM: MGL correctly points out in the comments that my language is imprecise, and that the assumption that the noise evens out is too strong. I agree he is correct. I should have said that the difference in the data is a bit over 2 runs, NOT that the EFFECT was a little over 2 runs. His suggestion is that the effect is about 0.6 runs. I'll see what other info I can get out of the data.) Of course, we're assuming that umpires are randomly assigned and that the quality of the pitching and hitting evens out over the 5,000 plate appearances, which is a pretty strong assumption. But even if the range of the effect were only a single run, I think this would be pretty significant.
| Umpire First Name | Umpire Last Name | Games | PA | Strikeout % | OBP | SLG | AVG | Runs Per Game |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Jerry | Crawford | 87 | 6834 | 16.56% | 0.3459 | 0.4268 | 0.2639 | 10.17 |
| Angel | Campos | 84 | 6466 | 18.33% | 0.3361 | 0.4191 | 0.2658 | 9.92 |
| Gerry | Davis | 140 | 10822 | 16.68% | 0.3354 | 0.4250 | 0.2635 | 9.89 |
| Tim | Welke | 127 | 9755 | 18.60% | 0.3324 | 0.4215 | 0.2636 | 9.83 |
| Chad | Fairchild | 131 | 10262 | 18.01% | 0.3343 | 0.4181 | 0.2617 | 9.82 |
| Jim | Reynolds | 130 | 10079 | 18.25% | 0.3372 | 0.4234 | 0.2690 | 9.74 |
| Tim | McClelland | 144 | 11090 | 16.47% | 0.3418 | 0.4168 | 0.2660 | 9.72 |
| Tim | Tschida | 135 | 10528 | 17.31% | 0.3413 | 0.4167 | 0.2678 | 9.69 |
| Larry | Vanover | 133 | 10202 | 17.70% | 0.3312 | 0.4153 | 0.2617 | 9.68 |
| Sam | Holbrook | 139 | 10618 | 17.48% | 0.3345 | 0.4280 | 0.2628 | 9.68 |
| Bill | Welke | 132 | 10248 | 17.94% | 0.3357 | 0.4153 | 0.2690 | 9.64 |
| Mike | Reilly | 138 | 10705 | 17.91% | 0.3410 | 0.4241 | 0.2666 | 9.62 |
| Randy | Marsh | 93 | 7050 | 15.26% | 0.3435 | 0.4173 | 0.2671 | 9.52 |
| Alfonso | Marquez | 103 | 8103 | 16.46% | 0.3380 | 0.4093 | 0.2609 | 9.50 |
| Scott | Barry | 110 | 8366 | 16.91% | 0.3365 | 0.4206 | 0.2608 | 9.48 |
| Tim | Timmons | 134 | 10349 | 17.61% | 0.3314 | 0.4173 | 0.2650 | 9.48 |
| Paul | Schrieber | 110 | 8678 | 16.73% | 0.3450 | 0.4100 | 0.2610 | 9.46 |
| Brian | Knight | 128 | 9760 | 17.01% | 0.3368 | 0.4200 | 0.2646 | 9.46 |
| Jerry | Meals | 138 | 10596 | 17.57% | 0.3322 | 0.4190 | 0.2617 | 9.44 |
| Adrian | Johnson | 120 | 9338 | 17.56% | 0.3376 | 0.4147 | 0.2601 | 9.39 |
| Dana | DeMuth | 141 | 10871 | 17.66% | 0.3330 | 0.4060 | 0.2599 | 9.38 |
| Brian | Gorman | 139 | 10599 | 17.81% | 0.3312 | 0.4233 | 0.2657 | 9.37 |
| CB | Bucknor | 138 | 10771 | 17.44% | 0.3361 | 0.4121 | 0.2669 | 9.34 |
| Chuck | Meriwether | 105 | 8079 | 17.45% | 0.3296 | 0.4058 | 0.2608 | 9.31 |
| Ed | Hickox | 105 | 7955 | 17.88% | 0.3243 | 0.3943 | 0.2513 | 9.31 |
| Eric | Cooper | 133 | 10174 | 17.75% | 0.3293 | 0.4119 | 0.2643 | 9.31 |
| Tony | Randazzo | 102 | 7881 | 17.64% | 0.3283 | 0.4246 | 0.2646 | 9.29 |
| Marvin | Hudson | 136 | 10702 | 17.81% | 0.3336 | 0.4028 | 0.2592 | 9.29 |
| Charlie | Reliford | 75 | 5699 | 17.70% | 0.3226 | 0.3980 | 0.2558 | 9.24 |
| Wally | Bell | 142 | 10937 | 18.20% | 0.3274 | 0.4198 | 0.2593 | 9.24 |
| Lance | Barksdale | 139 | 10545 | 17.52% | 0.3323 | 0.4062 | 0.2552 | 9.24 |
| Greg | Gibson | 135 | 10583 | 17.00% | 0.3311 | 0.4046 | 0.2568 | 9.23 |
| John | Hirschbeck | 81 | 6167 | 17.97% | 0.3256 | 0.4106 | 0.2585 | 9.21 |
| Dan | Iassogna | 138 | 10521 | 18.40% | 0.3345 | 0.4112 | 0.2609 | 9.20 |
| Todd | Tichenor | 85 | 6480 | 17.02% | 0.3375 | 0.4040 | 0.2628 | 9.19 |
| Derryl | Cousins | 139 | 10809 | 17.73% | 0.3262 | 0.3952 | 0.2496 | 9.18 |
| James | Hoye | 147 | 11464 | 17.81% | 0.3295 | 0.4014 | 0.2572 | 9.15 |
| Joe | West | 142 | 11016 | 17.27% | 0.3281 | 0.4067 | 0.2538 | 9.14 |
| Jim | Joyce | 131 | 10070 | 16.74% | 0.3341 | 0.4036 | 0.2599 | 9.14 |
| Dale | Scott | 142 | 10816 | 18.14% | 0.3325 | 0.4143 | 0.2623 | 9.13 |
| Marty | Foster | 121 | 9343 | 18.41% | 0.3285 | 0.4101 | 0.2584 | 9.12 |
| Ted | Barrett | 141 | 10802 | 17.79% | 0.3263 | 0.4078 | 0.2568 | 9.11 |
| Mike | Everitt | 143 | 11021 | 18.10% | 0.3279 | 0.4114 | 0.2569 | 9.09 |
| Kerwin | Danley | 109 | 8248 | 17.34% | 0.3359 | 0.4069 | 0.2633 | 9.08 |
| Fieldin | Culbreth | 142 | 10848 | 17.16% | 0.3311 | 0.4175 | 0.2603 | 9.01 |
| Tom | Hallion | 138 | 10428 | 18.37% | 0.3251 | 0.4121 | 0.2561 | 9.01 |
| Brian | Runge | 120 | 9048 | 18.39% | 0.3238 | 0.4149 | 0.2590 | 8.99 |
| Laz | Diaz | 139 | 10683 | 18.41% | 0.3234 | 0.4069 | 0.2560 | 8.99 |
| Bruce | Dreckman | 123 | 9573 | 17.05% | 0.3290 | 0.4013 | 0.2579 | 8.98 |
| Paul | Nauert | 137 | 10471 | 17.85% | 0.3262 | 0.4146 | 0.2602 | 8.98 |
| Gary | Darling | 131 | 9874 | 18.14% | 0.3289 | 0.4100 | 0.2621 | 8.96 |
| Mike | DiMuro | 109 | 8386 | 18.28% | 0.3219 | 0.3997 | 0.2515 | 8.95 |
| Mark | Wegner | 133 | 10173 | 18.34% | 0.3279 | 0.3991 | 0.2518 | 8.94 |
| Phil | Cuzzi | 138 | 10492 | 18.76% | 0.3252 | 0.4067 | 0.2582 | 8.93 |
| Angel | Hernandez | 141 | 10650 | 17.29% | 0.3279 | 0.3962 | 0.2557 | 8.90 |
| Ed | Rapuano | 140 | 10689 | 17.55% | 0.3293 | 0.4072 | 0.2579 | 8.89 |
| Bob | Davidson | 140 | 10803 | 17.40% | 0.3307 | 0.3924 | 0.2576 | 8.86 |
| Mike | Winters | 133 | 9904 | 18.35% | 0.3302 | 0.4070 | 0.2620 | 8.86 |
| Rob | Drake | 146 | 11091 | 18.86% | 0.3231 | 0.4019 | 0.2515 | 8.85 |
| Jim | Wolf | 133 | 10133 | 18.01% | 0.3313 | 0.4078 | 0.2604 | 8.83 |
| Hunter | Wendelstedt | 140 | 10625 | 17.37% | 0.3258 | 0.4021 | 0.2558 | 8.81 |
| Bill | Miller | 142 | 10852 | 18.69% | 0.3186 | 0.4026 | 0.2534 | 8.77 |
| Brian | O'Nora | 124 | 9305 | 17.69% | 0.3221 | 0.4100 | 0.2571 | 8.77 |
| Ron | Kulpa | 130 | 10016 | 18.24% | 0.3286 | 0.4033 | 0.2578 | 8.76 |
| Jerry | Layne | 118 | 9071 | 17.43% | 0.3313 | 0.4008 | 0.2525 | 8.71 |
| Mark | Carlson | 107 | 7971 | 18.15% | 0.3266 | 0.4053 | 0.2565 | 8.67 |
| Jeff | Kellogg | 143 | 10784 | 17.23% | 0.3291 | 0.4101 | 0.2563 | 8.66 |
| Paul | Emmel | 134 | 10107 | 18.77% | 0.3195 | 0.3924 | 0.2537 | 8.65 |
| Chris | Guccione | 148 | 11205 | 17.72% | 0.3303 | 0.3999 | 0.2578 | 8.64 |
| Jeff | Nelson | 123 | 9399 | 18.24% | 0.3248 | 0.3997 | 0.2523 | 8.63 |
| Gary | Cederstrom | 138 | 10387 | 18.07% | 0.3292 | 0.4031 | 0.2583 | 8.62 |
| Doug | Eddings | 140 | 10530 | 18.64% | 0.3237 | 0.4112 | 0.2596 | 8.56 |
| Andy | Fletcher | 117 | 8930 | 18.91% | 0.3221 | 0.3852 | 0.2491 | 8.20 |
| Mike | Estabrook | 83 | 6265 | 18.13% | 0.3200 | 0.3848 | 0.2559 | 7.95 |
| Bill | Hohn | 91 | 6618 | 16.88% | 0.3234 | 0.3965 | 0.2505 | 7.91 |
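For a rough sense of how much of that top-to-bottom spread could be sampling noise, here is a minimal Python sketch. The runs-per-game and games figures are copied from the first and last rows of the table; the per-game standard deviation of 4.5 combined runs is purely an assumed round number for illustration, not something estimated from this data.

```python
import math

# (runs per game, games) copied from the first and last rows of the table
crawford = (10.17, 87)  # highest runs per game
hohn = (7.91, 91)       # lowest runs per game

# ASSUMPTION: game-to-game standard deviation of combined runs.
# This is an illustrative round number, not a value from the data.
ASSUMED_SD_PER_GAME = 4.5


def standard_error(games, sd=ASSUMED_SD_PER_GAME):
    """Standard error of a mean runs-per-game figure over `games` games."""
    return sd / math.sqrt(games)


observed_gap = crawford[0] - hohn[0]

# Standard error of the difference between two independent means
se_gap = math.sqrt(standard_error(crawford[1]) ** 2 + standard_error(hohn[1]) ** 2)

print(f"Observed gap: {observed_gap:.2f} runs per game")
print(f"Standard error of that gap under the assumed SD: {se_gap:.2f}")
```

Keep in mind that these two umpires are the extremes picked out of 75, so some of the gap between them is expected from chance alone, on top of whatever the standard error suggests.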
Anyway, Jeff's post was more about strike calling percentage than anything else. His tables seem strange, and if they're telling me what I think they're telling me, then I don't think they're correct. For example, of all pitches called strikes by the umpire in 2010, I have about 65% of those falling within the RULEBOOK strike zone (that means the edges of the plate, NOT the 2-foot wide zone commonly used for the zone).
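To make the rulebook-zone check concrete, here is a minimal Python sketch of one way to do the tally. It is not the exact query behind the numbers in this post. It assumes each pitch is a dict with Pitch F/X-style fields `px`, `pz`, `sz_top` and `sz_bot` (all in feet), plus a hypothetical `call` field, and it assumes the center of the ball has to cross within the edges of the plate, ignoring the radius of the ball. The sample pitches are invented.

```python
# Home plate is 17 inches wide; locations are assumed to be in feet.
PLATE_HALF_WIDTH_FT = 17.0 / 12.0 / 2.0


def in_rulebook_zone(px, pz, sz_top, sz_bot):
    """True if the ball's center is within the plate edges and between the
    bottom and top of the batter's zone (ASSUMPTION: ball radius ignored)."""
    return abs(px) <= PLATE_HALF_WIDTH_FT and sz_bot <= pz <= sz_top


def share_of_called_strikes_in_zone(pitches):
    """Fraction of called strikes located inside the rulebook zone."""
    called = [p for p in pitches if p["call"] == "called_strike"]
    if not called:
        return None
    inside = sum(
        in_rulebook_zone(p["px"], p["pz"], p["sz_top"], p["sz_bot"]) for p in called
    )
    return inside / len(called)


# HYPOTHETICAL pitches, invented purely for illustration
pitches = [
    {"call": "called_strike", "px": 0.0, "pz": 2.5, "sz_top": 3.5, "sz_bot": 1.5},
    {"call": "called_strike", "px": 0.9, "pz": 2.5, "sz_top": 3.5, "sz_bot": 1.5},  # off the plate
    {"call": "called_strike", "px": -0.5, "pz": 3.0, "sz_top": 3.5, "sz_bot": 1.5},
    {"call": "ball", "px": 1.2, "pz": 2.0, "sz_top": 3.5, "sz_bot": 1.5},
]

share = share_of_called_strikes_in_zone(pitches)
print(f"{share:.1%} of called strikes were inside the rulebook zone")
```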
PRELIMINARY DATA HAS BEEN REMOVED BECAUSE I'VE SEEN IT ABUSED IN CERTAIN PLACES. PLEASE SEE LATEST VERSION OF DATABASE!
Below, I show a table of a number of things. The first 3 columns show the percentage of pitches within the rulebook strike zone CORRECTLY called a strike. Similarly, the next 3 columns show the percentage that each umpire CORRECTLY calls a ball when it is truly outside the strike zone. I do this for all batters, RHB, and then LHB.
Next, I also tally up the INCORRECT ball and strike calls. So these are the percentages that each umpire calls a Strike on a pitch that is actually OUTSIDE the rulebook zone OR calls a Ball on a pitch that is truly WITHIN the rulebook zone. Again, keep in mind I use the rulebook zone, rather than the standard 2-foot wide zone:
PRELIMINARY DATA HAS BEEN REMOVED BECAUSE I'VE SEEN IT ABUSED IN CERTAIN PLACES. PLEASE SEE LATEST VERSION OF DATABASE!
I was in the process of also recording the total number of pitches called by each umpire to put it in perspective, but did not have time before posting this. I'll add that stuff later on. I think it's pretty obvious that Barrett doesn't have a perfect call percentage with LHB up to bat.
Anyway, I'll have more on this later. For now, look at the zones below from 2010 for all of the umpires in video format (yeah, yeah, I re-posted it but it sure makes sense to have it in this post as well).
NOTE: I fixed the videos. I was made aware that no one could see them because of Facebook privacy settings. Please let me know if there is still a problem. DUH!
Another Update: I added pitch counts for 2010 to the data tables above so as to keep from drawing big conclusions from small sample sizes. When comparing RHB to LHB, remember that it's pretty common to have the LHB zone shifted outside. Because I have used the BOOK ZONE to gauge 'correctness' of the call, these will be skewed a bit. Also, I am working on getting the tables a bit more manageable for Blogger, which continues to disappoint me with its formatting capabilities.
It looks like your Incorrect Strike % is wrong; it's the correct strike %. Other than that, this is very useful.
I particularly like the incorrect strike and incorrect ball table.
Going back to Jeff Zimmerman's post, from the graphs it looks like the high inside pitch is almost never called as a strike. I think they should try to train for that tendency.
Thanks, Kazinski. Looks like I merged the wrong column in my table. I'll fix it up.
Is all the data in all the charts from 07-10?
MGL
"From the looks of things, the umpire can have just over a two-run effect on the outcome of the game due to his strike zone."
That is not even close to being true. Most of what you are seeing in the runs per game column is noise.
The actual difference in rpg between the most hitter-friendly and pitcher-friendly umpires is around .6. I would guess that 1 SD of umpire "rpg" due to their strike zone is .25.
MGL
MGL,
The first table is 2007 through 2010 for those games which have Pitch F/X data (I'm using my database to calculate these).
However, for the Pitch-Level data, it is only 2010. Also keep in mind they aren't projections, they're just cross-tabs for the umpires.
And sorry for the shitty formatting. For a closer inspection, you might want to copy and paste the tables into Excel.
Like I said, the "correct" and "incorrect" designation should be taken with a grain of salt, as it's simply the book zone, which we know isn't called for the most part by the umpires, and it extends a bit off the plate on both sides.
Fair enough, and I agree there is plenty of noise. It's a strong statement, and I was mainly just referring to the data as given. I'll take a look at that stuff, too.
In my defense, I state that the random 'evening out' assumption is extremely strong.
Any chance you can post the pitch-level data for 07 and 09 as well? Thanks.
MGL
I'll try and get the 2007 to 2009 pitch data handy today or tomorrow. And for anyone reading this, if you see anything fishy with the data let me know so I can fix any mistakes with the calculations.
Why don't the percentages add up for correct/incorrect strike percent? For example, Bob Davidson had a correct strike percent of 89.92%, but an incorrect strike percent of 17.40%. These tables are both data from 2010, correct?
They're conditional on the location of the pitch (inside or outside the zone). Sorry, it's a little confusing. So here's what they're showing:
Correct Strike %: Percentage of balls within the zone called a strike.
Incorrect Strike %: Percentage of balls outside the zone called a strike.
Correct Ball %: Percentage of balls outside the zone called a ball.
Incorrect Ball %: Percentage of balls within the zone called a ball.
We should expect that Incorrect Ball % + Correct Strike % = 100%.
Bob Davidson's Correct Strike % is 89.92% while his Incorrect Ball % is 10.08%. So this accounts for 100% of the called pitches within the strike zone.
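As a minimal sketch of how the four columns relate, here is the calculation in Python. The counts are hypothetical round numbers chosen for illustration, not any umpire's actual totals, and the function name is mine.

```python
def call_percentages(in_zone_strikes, in_zone_balls, out_zone_strikes, out_zone_balls):
    """Percentages conditional on pitch location (inside vs. outside the zone)."""
    in_zone = in_zone_strikes + in_zone_balls
    out_zone = out_zone_strikes + out_zone_balls
    return {
        "Correct Strike %": 100 * in_zone_strikes / in_zone,
        "Incorrect Ball %": 100 * in_zone_balls / in_zone,
        "Incorrect Strike %": 100 * out_zone_strikes / out_zone,
        "Correct Ball %": 100 * out_zone_balls / out_zone,
    }


# HYPOTHETICAL counts, invented purely for illustration
pct = call_percentages(
    in_zone_strikes=900, in_zone_balls=100,
    out_zone_strikes=300, out_zone_balls=1700,
)

for name, value in pct.items():
    print(f"{name}: {value:.2f}")

# Columns that share a denominator sum to 100; columns that don't, need not.
print(pct["Correct Strike %"] + pct["Incorrect Ball %"])
print(pct["Correct Strike %"] + pct["Incorrect Strike %"])
```

The last line is the sum that caused the confusion: Correct Strike % and Incorrect Strike % are measured against different sets of pitches, so there is no reason for them to add to 100%.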
Let me try and think of a more logical way to title the columns and/or add in crosstabs of correct strike calls vs. incorrect strike calls.
Alright, thanks. I was just adding the wrong numbers together. That makes more sense.
I'm double checking the calculations on the opposite of what I explain above though: the pitches outside the zone.
For some reason, some of the umpires aren't adding up to exactly 100% (i.e. Incorrect Strike % + Correct Ball %). Almost all the umps seem to be short about 2% of 100% when I add these together. I'll try to figure this issue out and fix it. They should be very close though.
Figured out the problem with the latter categories.
When adding Incorrect Strike % and Correct Ball % keep in mind that this does not account for Intentional Balls or Pitch Outs.
I think leaving things as is makes more sense anyway. Gauging umpire zones and performance on these sorts of pitches would seem to be misleading, as I assume they get them all right.
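Here is a small Python sketch of one way that shortfall can arise, assuming the intentional balls and pitchouts sit in the denominator of out-of-zone pitches but are counted in neither numerator. All counts are made up so that the gap comes out to 2%.

```python
# HYPOTHETICAL counts of pitches outside the rulebook zone
out_zone_called_strikes = 300
out_zone_called_balls = 1660
out_zone_intentional_or_pitchout = 40  # in neither numerator

# Denominator that includes intentional balls and pitchouts
all_out_zone = (
    out_zone_called_strikes + out_zone_called_balls + out_zone_intentional_or_pitchout
)
incorrect_strike_pct = 100 * out_zone_called_strikes / all_out_zone
correct_ball_pct = 100 * out_zone_called_balls / all_out_zone
shortfall = 100 - (incorrect_strike_pct + correct_ball_pct)
print(f"Sum: {incorrect_strike_pct + correct_ball_pct:.2f}%, short by {shortfall:.2f}%")

# Denominator restricted to pitches the umpire actually had to judge
judged_out_zone = out_zone_called_strikes + out_zone_called_balls
judged_sum = 100 * (out_zone_called_strikes + out_zone_called_balls) / judged_out_zone
print(f"Sum over judged pitches only: {judged_sum:.2f}%")
```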
The logical conclusion to your data is to determine some sort of sensitivity and specificity for each umpire, e.g.
Sen = CS/(CS+IB): % of times a pitch is correctly called when it is in the zone.
Spec = CB/(CB+IS): % of times a pitch is correctly called when it is out of the zone.
An ump with a low sensitivity would call a lot of strikes balls, i.e. be batter friendly.
An ump with a low specificity would call a lot of balls strikes, i.e. be pitcher friendly.
An ump with high numbers would not favor either the pitcher or the batter.
EP, that is what the data is saying (though, the columns are not particularly well-labeled, which I'm working on fixing up).
The data is already in the Sensitivity/Specificity form. In the tables, Correct Strike % is what you describe as Sensitivity, while Correct Ball % is the Specificity.
"I'll try and get the 2007 to 2009 pitch data handy today or tomorrow. And for anyone reading this, if you see anything fishy with the data let me know so I can fix any mistakes with the calculations."
Great, looking forward to it!
MGL
Any progress on updating the data, Millsy?
MGL
Already posted on Saturday. In a following blog post as an Excel file (too much to stick up here).