Recently, I've been working a lot with umpire data. Much of the appeal is the nice big sample sizes available for most umpires, which make it easier to infer interesting things from the data. Today I thought I would check out something I came across when looking for B-Pro topics over the past couple of weeks.
It is relatively well-known that umpires and catchers do their best to keep a good rapport with one another behind the plate. Catchers need to be diplomatic when asking about a call on a given pitch, as umpires may not take well to being called out for an incorrect call. These interactions may well matter for the catcher’s battery mate standing 60 feet 6 inches away, and the hope is that, if the catcher has any effect on borderline calls, it will be a positive one for his pitcher. This is a difficult thing to measure, so I’ll have to leave it for someone else who has access to players and can survey them.
No, today I’ll be asking a related question, though from a different angle: Do umpires give catchers the benefit of the doubt when they’re at the plate? If catchers are being cordial with the umpire behind the plate, then this could be a result of both team-level and individual-level incentives. If the catcher knows that being nice will improve his experience at bat, then he may have a strong incentive to be nice.
But there’s a cognitive aspect to this from the umpire perspective as well. If an umpire screws up a call for a catcher, then he has to face the guy in a couple of outs right there behind the plate. The rest of the team heads off at least 60 feet away, where the umpire won’t be close enough to hold their hand. Think about how you might talk to someone you can’t stand on the internet. Then think about whether you would say those things to their face. Are the two interactions different? Could this cognitive aspect come into play when making ball-strike calls when catchers are up to bat?
To answer this question, I’ll use Pitch F/X data from 2008 through 2010. I include a dummy variable for whether or not the batter is a regular catcher, as well as variables giving us information about the distance of the pitch from the center of the strike zone, whether or not the umpire made the ‘correct call’ (based on a strike zone covering the width of the plate and heights between 1.75 ft. and 3.45 ft.), and a few other controls. Using a few different types of analysis, I’ll do my best to tackle the data.
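For readers who want to see the fixed zone spelled out, here is a minimal sketch of how the ‘correct call’ flag and the distance variable could be built. The function names are hypothetical, the coordinates are assumed to be the Pitch F/X px and pz plate-crossing values in feet (px = 0 at the middle of the plate), and the ball's radius is ignored; the article's actual construction may differ.

```python
import math

# Fixed zone from the article: the width of the plate, 1.75 ft. to 3.45 ft. high.
# Assumption: the ball's radius is ignored, so the edges are the plate edges.
PLATE_HALF_WIDTH_FT = 17.0 / 12.0 / 2.0   # home plate is 17 inches wide
ZONE_BOTTOM_FT = 1.75
ZONE_TOP_FT = 3.45
ZONE_CENTER_Z_FT = (ZONE_BOTTOM_FT + ZONE_TOP_FT) / 2.0


def in_fixed_zone(px, pz):
    """True if the pitch crossed the plate inside the fixed zone."""
    return abs(px) <= PLATE_HALF_WIDTH_FT and ZONE_BOTTOM_FT <= pz <= ZONE_TOP_FT


def correct_call(px, pz, called_strike):
    """True if the umpire's call matches the fixed zone."""
    return in_fixed_zone(px, pz) == called_strike


def distance_from_center(px, pz):
    """Straight-line distance (ft.) from the center of the fixed zone.
    Assumption: this is how a 'linear distance from centerpoint' is measured."""
    return math.hypot(px, pz - ZONE_CENTER_Z_FT)
```

A pitch at px = 1.0 and pz = 2.6, for example, is outside this zone, so a ball call there counts as correct.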
My first step was to run a simple logistic regression on the calls made by the umpire against all batters during this time period. In other words, I’ll be predicting the effect of my variables on the probability that a certain pitch is called a strike, holding constant the location, count, batter/pitcher handedness, inning, and so on. The dependent variable here is whether or not the pitch was called a strike (1=called strike, 0=not called strike), and the data include only calls made by the umpire.
One thing to consider, however, is that catchers could be a bit shorter than the rest of the batters in the sample. Because I used a fixed top and bottom of the strike zone, pitches at the top and bottom edges could be biasing the catcher effect. The average MLB player is listed at about 6’ 1” or 6’ 2” (http://espn.go.com/mlb/stats/rosters/_/sort/null/order/false), while the average catcher height came in at just under 6’ 1” last season (http://www.answerbag.com/q_view/2021290). If we assume the difference is about 1 inch between catchers and the rest of the league, we should probably account for this. Within the data set, for a given pitch, the difference in the strike zone of catchers vs. non-catchers is less than roughly half a vertical inch. For the purposes of this preliminary look, I proxy batter height using the listed top and bottom of the zone within the Pitch F/X data. While the provided numbers are extremely noisy and problematic for deciding whether or not a pitch was “within the zone”, they should work well as a rough proxy for the height of the batter on average. This likely won't control enough for height, but running the regression with and without the top and bottom of the zone variables barely changes the catcher coefficient at all. This could mean one of two things: 1) height is not an issue, or 2) the sz_top and sz_bot variables aren't just noisy, but completely worthless (a very real possibility).
One other thing to check is whether or not catchers just see more pitches within the strike zone defined earlier in this article. It turns out that there is not a statistically significant difference in the number of pitches seen within the fixed zone for catchers compared with other players. This also gives us some slight evidence that the height of catchers isn’t too much of a problem in the model. If catchers were significantly shorter than the rest of the population, we would expect pitchers to adjust and throw pitches within this smaller zone. However, the spray of pitch locations is pretty much the same for catchers and non-catchers. For brevity, I do not include the results of this regression (though they can be had upon request).
Below is the output from the logistic regression on the probability of a strike call:
Variable: | Estimate | Std. Error | z-value | Sig. |
(Intercept) | 6.115965 | 0.061387 | 99.629 | *** |
count.0.0 | 0.501851 | 0.025818 | 19.438 | *** |
count.0.1 | -0.085477 | 0.027324 | -3.128 | ** |
count.0.2 | -0.466645 | 0.034037 | -13.71 | *** |
count.1.0 | 0.685788 | 0.026874 | 25.519 | *** |
count.1.1 | 0.146811 | 0.027805 | 5.28 | *** |
count.1.2 | -0.192775 | 0.030828 | -6.253 | *** |
count.2.0 | 0.877375 | 0.02952 | 29.722 | *** |
count.2.1 | 0.351544 | 0.029845 | 11.779 | *** |
count.2.2 | -0.010799 | 0.031409 | -0.344 | 0.731 |
count.3.0 | 1.068622 | 0.033826 | 31.592 | *** |
count.3.1 | 0.557583 | 0.03356 | 16.614 | *** |
count.3.2 | Base-level | | | |
factor(end_outs)=1 | 0.516444 | 0.011563 | 44.664 | *** |
factor(end_outs)=2 | 0.475019 | 0.011598 | 40.956 | *** |
factor(end_outs)=3 | 0.815832 | 0.012411 | 65.735 | *** |
factor(pitcher_throws)=R | 0.150937 | 0.01312 | 11.505 | *** |
factor(batter_stand)=R | -0.259115 | 0.01416 | -18.299 | *** |
linear_distance_from_centerpoint | -7.177881 | 0.01522 | -471.622 | *** |
catcher | -0.122086 | 0.011084 | -11.015 | *** |
pitcher_throws=R & batter_stand=R | -0.121216 | 0.016309 | -7.432 | *** |
The effect in the regression for the ‘catcher’ dummy variable is statistically significant and larger than I would have expected (some of this could be coming from differences in height, despite my attempts at controlling for this variable). The coefficient of -0.122 means the odds of a called strike are about 11.5% lower when the batter is a catcher; for a pitch at the edge of the zone (normally a 50-50 chance of being called a strike), that works out to a strike probability of roughly 47%. For those unfamiliar with logistic regression, I won’t go into detail on how this changes as the baseline probability of a strike call increases or decreases, but the effect of a coefficient can't just be read off the regression table, and it shrinks as pitches get closer to a 1 or 0 probability. So with catchers, it's likely that a pitch down the middle is still called a strike at very near the same rate, while a pitch 10 feet high is still a ball at very near the same rate as for other batters.
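To make the conversion from coefficient to probability concrete, here is a small sketch that applies the ‘catcher’ estimate from the table to a few baseline strike probabilities. The helper names are mine, and the baselines are chosen only for illustration; everything else in the model is assumed to be held fixed.

```python
import math

CATCHER_COEF = -0.122086  # 'catcher' estimate from the regression table


def logit(p):
    """Log-odds of a probability."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Probability corresponding to a log-odds value."""
    return 1.0 / (1.0 + math.exp(-x))


def strike_prob_for_catcher(baseline_prob):
    """Strike probability for a catcher on a pitch that would be called a
    strike with probability `baseline_prob` against a non-catcher."""
    return inv_logit(logit(baseline_prob) + CATCHER_COEF)


# The coefficient acts multiplicatively on the odds, not on the probability.
odds_ratio = math.exp(CATCHER_COEF)

for p in (0.05, 0.50, 0.95):
    print(f"baseline {p:.2f} -> catcher {strike_prob_for_catcher(p):.3f}")
```

The gap is widest for the 50-50 pitch (about three percentage points) and shrinks to well under one point at the 5% and 95% baselines, which is why the obvious balls and strikes are called at nearly the same rate for everyone.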
This is a pretty interesting contrast, but there could be one other thing confounding the result: I have not controlled for the talent of the batter. We know that catchers are generally not as adept at hitting the ball as those at other positions, if for no other reason than that the top hitting catchers are often moved to another position early on. If umpires are ‘compassionate’ toward players who just aren’t very good hitters, then we could be picking up this effect here. I don’t currently have an answer to this issue, as I do not have individual player performance in my Pitch F/X data at this point. If this is an effect of the umpire responding to the batter’s skill, then it is also an interesting issue to look into later on, and one that likely needs to be controlled for in the data. I am in the process of greatly improving the information in my Pitch F/X database, so hopefully I can take a look at this stuff as well.
I took the first model a bit further and followed a technique that J-Doug has used in his fantastic ‘Compassionate Umpire’ articles. For this second model, I used an indicator variable of whether the call was in the batter’s favor (pitch within the zone, called a ball), the pitcher’s favor (pitch outside the zone, called a strike), or neutral (“correct call”). This is the dependent variable in an ordered logistic model. As the indicator increases (from -1 to 0 to 1), the calls are coming more “in favor” of the batter. This sheds some further light on any increase in the probability that the call will be in the batter’s favor, given that he is a catcher. I again don’t present the full regression output, as it simply confirms the earlier finding: this model also indicated a significant increase in favorable calls for catchers.
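The three-level indicator is simple enough to sketch. The function names below are hypothetical, and the sketch assumes each pitch already carries a flag for whether it was inside the fixed zone and a flag for whether it was called a strike.

```python
from collections import Counter


def call_favor(in_zone, called_strike):
    """Ordered outcome for a called pitch:
    1 = batter's favor (in the zone, called a ball),
    -1 = pitcher's favor (outside the zone, called a strike),
    0 = correct call."""
    if in_zone and not called_strike:
        return 1
    if called_strike and not in_zone:
        return -1
    return 0


def tally_favor(pitches):
    """Count each outcome over an iterable of (in_zone, called_strike) pairs."""
    return Counter(call_favor(in_zone, called) for in_zone, called in pitches)
```

Tallying these outcomes separately for catchers and non-catchers gives a quick raw comparison before any ordered model is fit, though it controls for nothing.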
So, where is this difference in strike-calling coming from? Well, we can look at the contours for the 50% call rate for left-handed and right-handed batters below. In the panel on the left, I plotted the 50% contours for RHB that are catchers and non-catchers, while the right panel shows left-handed batters (plots are from the umpire’s view). In both panels, you can see that umpires are a bit more lenient with inside pitches for both right- and left-handed batting catchers. Right-handed catchers seem to get calls in their favor up in the zone, but this very well could be a result of these catchers being a little shorter than their non-catching counterparts. For left-handed catchers, on the other hand, the zone is shifted upward a bit. So the ‘height’ factor seems to be relatively ambiguous compared to the inside corner difference, especially considering that the lower limit of the zone for RHB is almost identical for both groups of batters. A word of caution: comparing differences this small on plots like this is not a replacement for more rigorous analysis, but they are interesting to look at once we understand some of what is going on in the data.

This is of course not certain evidence that there is something going on with the catchers at bat, but it seems to point to something interesting. I’d like to look into this phenomenon (if that is what it is) and be a bit more confident about the height of the batters and the possibility of umpires being ‘compassionate’ toward less skilled hitters, rather than catchers themselves. If anyone has batter height available by player_id (the ones included in the Pitch F/X database format from Mike Fast's tutorial), I'd love to be able to include this in my data. That way, I could try and provide a bit more accurate umpire performance estimations as well.
In the end, it very well could be that the closeness of the catcher and the umpire makes the umpire less willing to take the bat out of a catcher’s hands. But replication and improvement are always key in this sort of analysis, and I think they are needed here. I’d love to hear some reactions to the analysis, and am always willing to hear about shortcomings of the approach. I have a hard time coming out and proclaiming a definite bias without better controlling for the height of the batters, but the effect seems to be large enough that at least some of it is real.