Wednesday, March 7, 2012

Strike Zone Changes?

It's been a while since I have posted here. I have been swamped with some papers I am trying to get out, finishing up the dissertation, and interviews (faculty ones in addition to others). I should have some big news in the next couple of weeks regarding this last activity. But something spurred me to take a break from this process and post today.

The Book Blog recently provided a link to Baseball Analytics claiming a huge increase in strike calling by umpires over the past 4 years. However, this is a somewhat questionable finding. Before reading on, go there and check out the discussion already taking place. Then come back.

I also ran into this "huge discovery" earlier last year when trying to come up with a topic for my Baseball ProGUESTus post. I spoke about the finding with Mike Fast, and we quickly realized (more Mike than I) that, rather than the umpires becoming more accurate within the strike zone, it was the stringers creating the "sz_top" and "sz_bot" measurements who were actually getting better over time. I talked briefly about this problem HERE, and temporarily took down my original umpire strike zone calculations because of that. As an example of the bias, below I have the strike rate of pitches within the rulebook zone using a fixed zone and the stringer-provided top and bottom of the zone numbers:

Fixed Zone:
2008: 86.08%
2009: 85.90%
2010: 86.78%

Stringer-Provided Top & Bottom:
2008: 79.99%
2009: 82.20%
2010: 85.55%
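To make the two definitions concrete, here is a minimal Python sketch of how an in-zone strike rate can be computed under each one. The field names follow the PITCHf/x columns mentioned above (`sz_top`, `sz_bot`), but the fixed top and bottom values and the sample pitches are assumptions made up for illustration; they are not the values or data used for the numbers above.

```python
from dataclasses import dataclass

# ASSUMPTION: illustrative fixed-zone bounds in feet, not the post's actual values.
HALF_PLATE_FT = 17 / 2 / 12   # half of the 17-inch plate width
FIXED_BOT_FT = 1.5
FIXED_TOP_FT = 3.5


@dataclass
class Pitch:
    px: float            # horizontal location at the plate (ft, 0 = center)
    pz: float            # height at the plate (ft)
    sz_top: float        # stringer-provided top of the zone (ft)
    sz_bot: float        # stringer-provided bottom of the zone (ft)
    called_strike: bool  # True if the umpire called a strike


def in_zone(p, top, bot):
    """True if the pitch crosses the plate between bot and top."""
    return abs(p.px) <= HALF_PLATE_FT and bot <= p.pz <= top


def in_zone_strike_rate(pitches, use_stringer):
    """Share of in-zone called pitches that were called strikes."""
    inside = [
        p for p in pitches
        if in_zone(
            p,
            p.sz_top if use_stringer else FIXED_TOP_FT,
            p.sz_bot if use_stringer else FIXED_BOT_FT,
        )
    ]
    if not inside:
        return float("nan")
    return sum(p.called_strike for p in inside) / len(inside)


# Made-up called pitches, only to show that the two definitions can disagree.
sample = [
    Pitch(0.0, 2.50, 3.4, 1.6, True),
    Pitch(0.0, 1.55, 3.4, 1.7, False),  # inside the fixed zone, below the stringer zone
    Pitch(0.0, 3.45, 3.6, 1.6, False),
    Pitch(2.0, 2.50, 3.4, 1.6, False),  # well off the plate under both definitions
]
print("fixed:", in_zone_strike_rate(sample, use_stringer=False))
print("stringer:", in_zone_strike_rate(sample, use_stringer=True))
```

The same calls are being scored in both cases; only the set of pitches counted as "in the zone" changes, which is why a drift in the stringer measurements can look like a change in umpire behavior.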


From the fixed-zone numbers above, we'd expect a decrease in run scoring of about 0.034 to 0.060 runs per game (or 0.017 to 0.030 per team per game) once we account for the number of pitches that we expect to change from a ball to a strike (between 550 and 970 per year in total). Keep in mind this is a very rough estimate, and it does not account for changing behavior of pitchers or batters. But that's about a factor of 10 smaller than the estimate found with the stringer data (the one that Baseball Analytics reports). So we should be wary of these numbers.
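For anyone who wants to check the arithmetic, here is a rough Python sketch. The post does not state the run value attached to each changed call; the 0.15 runs used below is an assumption back-solved from the quoted range, and 2,430 is the number of games in a full 30-team, 162-game season.

```python
GAMES_PER_SEASON = 2430        # 30 teams x 162 games / 2 teams per game
RUNS_PER_CHANGED_CALL = 0.15   # ASSUMPTION: back-solved from the figures quoted above


def runs_per_game(changed_calls, run_value=RUNS_PER_CHANGED_CALL,
                  games=GAMES_PER_SEASON):
    """Size of the run-scoring effect per game (both teams combined)."""
    return changed_calls * run_value / games


for calls in (550, 970):
    total = runs_per_game(calls)
    print(f"{calls} changed calls per year: "
          f"{total:.3f} runs per game, {total / 2:.3f} per team per game")
```

Under those assumptions the two ends of the range come out at the 0.034 and 0.060 runs per game quoted above, which is a quick way to see how few pitches are actually involved.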

So, we can see here that one really needs to use a consistent top and bottom of the strike zone to ensure we don't see weird changes like this due to something other than changes in umpire behavior. That's not to say that there hasn't been any change (and I suspect that younger umpires are better than their older counterparts partly due to the extensive technological monitoring and performance training they must go through) but the 5 percentage point increase in strikes within the zone is well above what that change really is.

Another criticism is that this only counts changes in strike calling on pitches within the zone. If umpires also decrease the number of strikes called on pitches outside the zone, then the net change in run scoring could be zero. Rather than calling more strikes, umpires could simply be getting better at their jobs. If anything, the latter is the more likely explanation, given the data I have. So let's take a look at the fixed zone strike percentage on pitches that are outside of the rulebook zone:

Fixed Zone:
2008: 11.84%
2009: 11.51%
2010: 11.39%

Aha! So any increase we see in in-zone strikes tends to be cancelled out by a decrease in out-of-zone strikes. This change amounts to about 0.076 runs per game total (or about 0.038 per team per game), since umpires call more than twice as many pitches outside the zone as inside it.

That's interesting. But this doesn't explain everything. There could be a decrease in the quality of pitches outside the zone over time. And if pitchers know that they'll get more strikes well outside the zone, they may try to nibble way out there more often. We'll see that this might be going on a bit later. But note that using a discrete measurement also may be an issue. Not every pitch within the rulebook strike zone is created equal, nor is every pitch outside of it. There are varying degrees of strike likelihood depending on how close the pitch is to the edge of the zone.

To be clear, I think there is still something going on. I don't believe it is anywhere near the effect size reported at Baseball Analytics. But there is certainly plenty of good reason to think monitoring and training using these advanced technologies is improving umpire performance. In fact, I believe we see this training taking hold for those younger umpires coming up through the ranks. This is, of course, a VERY interesting effect. But it almost certainly does not account for any huge change in run scoring.

So I figured I would go a little deeper. Below I have mapped out and measured the size of the strike zone across 2008, 2009 and 2010. These are maps I have used before, employing cross-validated smoothing parameters in order not to overfit. The contours tell you the boundary at which--within that contour--pitches are called strikes at least 50% of the time. In the tables, I map out the area of each contour, as well as the area within each contour. This will provide some evidence as to WHERE the umpires are changing their strike calling behavior, if they are at all.


As you can see, there are some slight changes. Right handed batters do seem to be getting more low strikes called against them. However, the net change is nearly zero once we account for the fewer strikes being called on the outside of the plate. We see a smaller change for lefties low and outside; however, it seems that the inside strike is being called less often against lefties.

But above I only plot the 50% contour, and the size is rather ambiguous. Luckily, we can actually measure the size of this zone in R (HOORAY R!). You have seen Josh Weinstock do this in the past, so the measurement is nothing new. We can do this at all the different contours to see if the umpire is changing on pitches down the middle, just those on the edges, or perhaps those on the well-outside edge of the previous zone. Below I have the 10% through 90% contour zone sizes from 2008 to 2010, and the change in each from year to year.
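The idea of measuring a contour's area can be sketched without any special library: evaluate the strike probability on a fine grid and add up the cells at or above the chosen level. The Python below is only an illustration of that idea. The `toy_surface` function is a made-up stand-in for a fitted, smoothed model, and the grid-counting approach is not necessarily how the areas in this post were computed in R.

```python
import math


def contour_area(prob, level, x_range, z_range, step=0.25):
    """Approximate the area where prob(x, z) >= level by counting grid cells.

    The result is in the squared units of x and z (square inches here).
    """
    x_lo, x_hi = x_range
    z_lo, z_hi = z_range
    nx = int(round((x_hi - x_lo) / step))
    nz = int(round((z_hi - z_lo) / step))
    cells = 0
    for i in range(nx):
        x = x_lo + (i + 0.5) * step          # cell midpoint
        for j in range(nz):
            z = z_lo + (j + 0.5) * step
            if prob(x, z) >= level:
                cells += 1
    return cells * step * step


def toy_surface(x, z):
    """HYPOTHETICAL strike-probability surface, in inches.

    Probability is 50% on an ellipse with semi-axes of 10 in. (horizontal)
    and 12 in. (vertical) centered 30 in. off the ground.
    """
    r = math.hypot(x / 10.0, (z - 30.0) / 12.0)
    return 1.0 / (1.0 + math.exp(8.0 * (r - 1.0)))


for level in (0.1, 0.5, 0.9):
    area = contour_area(toy_surface, level, (-20.0, 20.0), (0.0, 60.0))
    print(f"{int(level * 100)}% contour: {area:.1f} sq. in.")
```

With a real model, `prob` would be the fitted prediction function for a given year and batter handedness, and the year-to-year differences in these areas are what the tables report.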

RHB

Zone Size (sq. in.) 10% 20% 30% 40% 50% 60% 70% 80% 90%
2008 672.77 586.56 530.90 485.95 445.19 404.87 361.54 309.86 235.65
2009 670.73 582.44 525.40 479.47 438.05 397.45 354.37 303.80 232.76
2010 681.09 594.13 535.39 487.17 443.46 400.94 356.69 306.37 238.80

Changes (sq. in.) 10% 20% 30% 40% 50% 60% 70% 80% 90%
2008 to 2009 -2.04 -4.12 -5.50 -6.48 -7.14 -7.42 -7.17 -6.06 -2.89
2009 to 2010 10.36 11.69 9.99 7.70 5.41 3.49 2.32 2.57 6.04
2008 to 2010 8.32 7.57 4.49 1.22 -1.73 -3.93 -4.85 -3.49 3.15

Percent Changes 10% 20% 30% 40% 50% 60% 70% 80% 90%
2008 to 2009 -0.30% -0.70% -1.04% -1.33% -1.60% -1.83% -1.98% -1.96% -1.23%
2009 to 2010 1.54% 2.01% 1.90% 1.61% 1.24% 0.88% 0.65% 0.85% 2.59%
2008 to 2010 1.24% 1.29% 0.85% 0.25% -0.39% -0.97% -1.34% -1.13% 1.34%












LHB

Zone Size (sq. in.) 10% 20% 30% 40% 50% 60% 70% 80% 90%
2008 658.93 575.93 521.70 477.83 437.93 398.50 356.20 306.04 235.00
2009 657.55 575.06 521.14 477.31 437.35 397.69 355.00 304.15 232.16
2010 675.05 587.10 527.48 478.58 434.54 392.08 348.32 299.01 233.88

Changes (sq. in.) 10% 20% 30% 40% 50% 60% 70% 80% 90%
2008 to 2009 -1.38 -0.87 -0.56 -0.52 -0.58 -0.81 -1.20 -1.89 -2.84
2009 to 2010 17.50 12.04 6.34 1.27 -2.81 -5.61 -6.68 -5.14 1.72
2008 to 2010 16.12 11.17 5.78 0.75 -3.39 -6.42 -7.88 -7.03 -1.12

Percent Changes 10% 20% 30% 40% 50% 60% 70% 80% 90%
2008 to 2009 -0.21% -0.15% -0.11% -0.11% -0.13% -0.20% -0.34% -0.62% -1.21%
2009 to 2010 2.66% 2.09% 1.22% 0.27% -0.64% -1.41% -1.88% -1.69% 0.74%
2008 to 2010 2.45% 1.94% 1.11% 0.16% -0.77% -1.61% -2.21% -2.30% -0.48%


You can see that, overall, the zone size isn't changing at the huge rate suggested at Baseball Analytics. In fact, within the defined zone the strike rate is somewhat decreasing. What is very interesting, though, is that umpires DO tend to be calling more strikes on pitches well outside the strike zone, with the 10% and 20% contours growing by roughly 1% to 2.5% from 2008 to 2010. This is not a trivial change, and the majority of it DOES seem to be coming from the low strike, as you can see below. The strange thing is that this seems to contradict the in-and-out-of-zone numbers cited early on in this post. That likely means there's something going on in between these contours.


One thing to remember is that these contours do have confidence intervals. And it could be that the CIs get larger as we get further out toward the 50%, 40%, 30%, 20% and 10% contours, since this is where the most variation comes in across umpires. Therefore, the differences we see should be attenuated somewhat to account for this uncertainty. I haven't plotted the CIs here because they'll just make things confusing to look at. But you must remember that these contour lines are not the end-all of the conversation on how the zone is being called. I think my method gives the best estimate that we can really get from this type of data, but that doesn't mean it is anywhere near perfect.

Also, keep in mind I have not done this by count or pitch type, so everything is pooled together. So, if more of a certain type of pitch is being thrown, this could affect our results (and ultimately mean that umpires aren't changing their behavior, it is the pitchers who are throwing more of certain types of pitches that umpires are more likely to call strikes outside the zone). Or, if umpires are changing behavior differently in different counts, we won't pick this up and it means they are changing their behavior in some way that creates only a very small net effect when averaging across all of them. Certainly that would be the next step in the analysis, but I just don't have the time right now. And once we start cutting up the data into smaller sample sizes, there are issues in the reliability and comparability of zone measurements across these very different sample sizes.

But we can see if pitchers tend to throw more to this area by taking the percentage of pitches thrown between each contour (i.e. between the 10% and 20% contour, between the 20% and 30% contour, etc.). Doing this, we'll want to use the 2008 strike zone to predict the probability of our 2010 pitches. That way, the contours are comparable. Additionally, we want to apply this model to ALL pitches thrown, not just those called by the umpire. This is because there could be changes in contact and swing rates, and we're just interested in the pitchers' behavior.
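As a rough illustration of that binning step, here is a Python sketch. The `prob_2008` argument stands in for the fitted 2008 strike-probability model, and the toy model and locations in the usage lines are made up for the example; none of this is the code actually used for the tables. Pitches below the 10% contour are dropped and shares are taken over the remaining pitches, which matches the way the rates in the tables sum to 100%.

```python
from bisect import bisect_right
from collections import Counter

EDGES = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
LABELS = ["10-20%", "20-30%", "30-40%", "40-50%", "50-60%",
          "60-70%", "70-80%", "80-90%", ">90%"]


def band_shares(locations, prob_2008):
    """Share of pitches falling between each pair of 2008 contours.

    locations: iterable of (x, z) pitch locations for ALL pitches thrown.
    prob_2008: callable giving the 2008 model's called-strike probability
               at a location (HYPOTHETICAL stand-in for the fitted model).
    """
    counts = Counter()
    for x, z in locations:
        k = bisect_right(EDGES, prob_2008(x, z))  # number of contour levels at or below p
        if k > 0:                                 # k == 0 means outside the 10% contour
            counts[LABELS[k - 1]] += 1
    total = sum(counts.values())
    if total == 0:
        return {label: 0.0 for label in LABELS}
    return {label: counts[label] / total for label in LABELS}


# Toy usage: probability falls off linearly with horizontal distance.
def toy_model(x, z):
    return max(0.0, 1.0 - abs(x))


pitches_2010 = [(0.0, 2.5), (0.05, 2.5), (0.85, 2.5), (0.95, 2.5), (0.5, 2.5)]
print(band_shares(pitches_2010, toy_model))
```

Running the same function on each season's pitches, always with the 2008 model, gives shares that can be compared directly from year to year.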

But, if pitchers in 2010 are throwing more to the 2008 10% and 20% contours, then we may be able to say that pitchers are trending toward throwing to these areas where the umpire seems to be expanding the zone a bit. Below, I have the changes across time for the areas that pitchers are throwing to:

RHB

Pitches Seen in Area 10-20% 20-30% 30-40% 40-50% 50-60% 60-70% 70-80% 80-90% >90%
2008 18,674 13,458 11,880 11,408 12,179 13,972 18,092 28,819 115,688
2009 18,673 13,379 11,746 11,482 11,745 13,728 17,926 28,864 116,928
2010 19,473 13,800 12,139 11,841 12,677 14,326 18,689 30,137 120,643

Rate Thrown To 10-20% 20-30% 30-40% 40-50% 50-60% 60-70% 70-80% 80-90% >90%
2008 7.65% 5.51% 4.87% 4.67% 4.99% 5.72% 7.41% 11.80% 47.38%
2009 7.64% 5.47% 4.80% 4.70% 4.80% 5.62% 7.33% 11.81% 47.83%
2010 7.67% 5.44% 4.78% 4.67% 5.00% 5.65% 7.37% 11.88% 47.55%

Change 10-20% 20-30% 30-40% 40-50% 50-60% 60-70% 70-80% 80-90% >90%
2008 to 2009 -0.0001 -0.0004 -0.0006 0.0002 -0.0018 -0.0011 -0.0008 0.0000 0.0045
2009 to 2010 0.0004 -0.0003 -0.0002 -0.0003 0.0019 0.0003 0.0003 0.0007 -0.0028
2008 to 2010 0.0003 -0.0007 -0.0008 -0.0001 0.0001 -0.0008 -0.0004 0.0007 0.0017

Percent Change 10-20% 20-30% 30-40% 40-50% 50-60% 60-70% 70-80% 80-90% >90%
2008 to 2009 0.13% 0.71% 1.25% -0.52% 3.68% 1.87% 1.04% -0.03% -0.95%
2009 to 2010 -0.48% 0.62% 0.42% 0.63% -4.00% -0.55% -0.45% -0.60% 0.59%
2008 to 2010 -0.35% 1.32% 1.67% 0.11% -0.17% 1.33% 0.59% -0.64% -0.36%










LHB

Pitches Seen in Area 10-20% 20-30% 30-40% 40-50% 50-60% 60-70% 70-80% 80-90% >90%
2008 14,073 10,126 8,883 8,645 9,254 10,250 13,191 20,763 86,135
2009 14,909 10,700 9,459 9,348 9,645 11,092 14,090 21,974 91,871
2010 14,291 10,218 8,970 8,612 9,174 10,486 13,464 20,813 86,686

Rate Thrown To 10-20% 20-30% 30-40% 40-50% 50-60% 60-70% 70-80% 80-90% >90%
2008 7.76% 5.58% 4.90% 4.77% 5.10% 5.65% 7.27% 11.45% 47.50%
2009 7.72% 5.54% 4.90% 4.84% 5.00% 5.74% 7.30% 11.38% 47.58%
2010 7.82% 5.59% 4.91% 4.71% 5.02% 5.74% 7.37% 11.39% 47.44%

Change 10-20% 20-30% 30-40% 40-50% 50-60% 60-70% 70-80% 80-90% >90%
2008 to 2009 -0.0004 -0.0004 0.0000 0.0007 -0.0011 0.0009 0.0002 -0.0007 0.0008
2009 to 2010 0.0010 0.0005 0.0001 -0.0013 0.0003 -0.0001 0.0007 0.0001 -0.0014
2008 to 2010 0.0006 0.0001 0.0001 -0.0005 -0.0008 0.0009 0.0009 -0.0006 -0.0006

Percent Change 10-20% 20-30% 30-40% 40-50% 50-60% 60-70% 70-80% 80-90% >90%
2008 to 2009 0.52% 0.77% 0.01% -1.54% 2.13% -1.62% -0.31% 0.62% -0.16%
2009 to 2010 -1.30% -0.92% -0.21% 2.64% -0.52% 0.10% -0.98% -0.09% 0.29%
2008 to 2010 -0.77% -0.14% -0.21% 1.14% 1.62% -1.52% -1.29% 0.52% 0.13%


All in all, the table above seems rather ambiguous, but I'd be interested in hearing any patterns that others see here. Going by the Rate Thrown To rows, RHB saw slightly fewer pitches in the 20% to 40% bands in 2010 than in 2008, and LHB slightly fewer in the 40% to 60% bands (the Percent Change rows carry the opposite sign from the Change rows, so a positive value there is a decline). Of course, we would also have to understand whether batters are changing their behavior in a way that affects run scoring more than we would expect from the lower strikes outside the zone or any of the changes in the table above. But given that we're talking about so few pitches in terms of overall ball-to-strike changes (or vice versa), I am going to be cautious about making any large statements about the effects of this on the run-scoring environment.

Lastly, I think the younger umpires coming into the league could also be part of this. Umpires are increasingly trained using pitch f/x type technology, which they use to learn about missed calls, problems in their own strike zone, and so on. They get reports on their games from the umpire's association as they come up through the minor leagues. I think this should have a real effect on strike calling behavior, likely meaning that younger umps call strikes closer to the rulebook zone. As a rough comparison, below I have Mike Estabrook (a younger umpire) and his zone compared to, say, Jerry Crawford. There could be something to the idea that a younger umpire like Estabrook is willing to form his zone to the rulebook, as opposed to Crawford, who essentially has tenure as an MLB umpire. Keep in mind that this is only two umpires and a small sample, and we'll probably need a few more years of data to detect any changes for younger umpires vs. older ones.





So that's all I have for today. I have spent way too much time on this, but please provide comments if you have thoughts, criticisms, or flat out think I'm an idiot. There is certainly more to say than what I have here.

NOTE: At the suggestion of Tango, I provided some additional information below regarding the size of the strike zones which I have used in a recent academic paper I have under review. In this case--rather than use square inches--I report the approximate number of baseballs that could fit side-by-side through the given area if it were, say, an actual square rather than strips of changes around the zone (a baseball is about 8.4-8.5 square inches, or a 2.9-by-2.9 inch area, the diameter of the ball). So, for example, 81 baseballs means a square 9 baseballs by 9 baseballs. Hopefully, this will help to visualize the size changes in the zone in a context relevant to the discussion, as I think Tango makes a very good point.
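The conversion itself is a one-liner. The short Python sketch below uses the 2.9-inch diameter given above; the function names are just for illustration.

```python
import math

BALL_DIAMETER_IN = 2.9
BALL_FOOTPRINT_SQ_IN = BALL_DIAMETER_IN ** 2   # about 8.41 sq. in. per baseball


def to_baseballs(area_sq_in):
    """Convert a zone area in square inches to a count of baseballs."""
    return area_sq_in / BALL_FOOTPRINT_SQ_IN


def square_side_in_baseballs(area_sq_in):
    """Side length, in baseballs, of a square with the same area."""
    return math.sqrt(to_baseballs(area_sq_in))


area_2008_rhb_10 = 672.77   # RHB 10% contour, 2008, from the square-inch table
print(f"{to_baseballs(area_2008_rhb_10):.2f} baseballs, "
      f"a square about {square_side_in_baseballs(area_2008_rhb_10):.1f} baseballs on a side")
```

Dividing the square-inch tables by this footprint gives the values in the tables below, up to rounding.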

RHB

Zone Size (in baseballs) 10% 20% 30% 40% 50% 60% 70% 80% 90%
2008 80.00 69.75 63.13 57.78 52.94 48.14 42.99 36.84 28.02
2009 79.75 69.26 62.47 57.01 52.09 47.26 42.14 36.12 27.68
2010 80.99 70.65 63.66 57.93 52.73 47.67 42.41 36.43 28.39

Changes (in baseballs) 10% 20% 30% 40% 50% 60% 70% 80% 90%
2008 to 2009 -0.24 -0.49 -0.65 -0.77 -0.85 -0.88 -0.85 -0.72 -0.34
2009 to 2010 1.23 1.39 1.19 0.92 0.64 0.41 0.28 0.31 0.72
2008 to 2010 0.99 0.90 0.53 0.15 -0.21 -0.47 -0.58 -0.41 0.37


LHB

Zone Size (in baseballs) 10% 20% 30% 40% 50% 60% 70% 80% 90%
2008 78.35 68.48 62.03 56.82 52.07 47.38 42.35 36.39 27.94
2009 78.19 68.38 61.97 56.76 52.00 47.29 42.21 36.17 27.61
2010 80.27 69.81 62.72 56.91 51.67 46.62 41.42 35.55 27.81

Changes (in baseballs) 10% 20% 30% 40% 50% 60% 70% 80% 90%
2008 to 2009 -0.16 -0.10 -0.07 -0.06 -0.07 -0.10 -0.14 -0.22 -0.34
2009 to 2010 2.08 1.43 0.75 0.15 -0.33 -0.67 -0.79 -0.61 0.20
2008 to 2010 1.92 1.33 0.69 0.09 -0.40 -0.76 -0.94 -0.84 -0.13

7 comments:

  1. Millsy, it is not a big deal, but you just happened to pick the umpire with the smallest strike zone in major league baseball over the years, regardless of age or experience, in Jerry Crawford.

    Otherwise, great work, although I must confess that I don't understand much of it, but that's just me!

    MGL

  2. True, I did. I kind of whipped that in there toward the end to show the largest discrepancy I could find because I knew the tenure of those two guys much better than I do the rest of the umpires.

    So there is bias in that presentation on my end. But my thoughts come from the database I had posted a while back (I did not do any serious analysis on ump tenure and correct percentage, it was more of a glance over of the data).

  3. Cool article Millsy,

    If you get a chance, it'd be awesome if you did an R-post on how to calculate contour areas.

  4. Anonymous,

    While I do it slightly differently, Josh Weinstock has a tutorial on that here:

    http://pitchrx.blogspot.com/2012/01/calculate-umpire-strikezone-sizes.html

  5. how far do these strike zone records go back? I would expect a significant difference if you compare current zone to the zone in the mid-90's (at least american league anyway) when I remember they were only calling strikes from the tops of the knees to the bottom of the belt. i remember succinctly because it really annoyed me as a kid watching the games...of course I may be wrong, but is such an analysis possible?

  6. Hi astrobassist,

    Unfortunately, the locational records only go back to 2007 (and only part of the season at that). There is proprietary Questec data dating back to 2004, I think. However, that is not released.

    I suspect you'd be right that there have been significant changes in called strike zones over longer periods.
