Sunday, January 18, 2009
Projecting the 2009 Yankee Defense Using Fangraphs’ UZR
The more statistically-inclined of our blog readers are probably aware of this, but Fangraphs has added Ultimate Zone Rating, b/k/a UZR, to their statistical reports. What’s UZR? Here’s an article describing it, by its creator, Mitchel Lichtman, b/k/a MGL in the online baseball statdork world.
Although the engine behind UZR is the same, Fangraphs uses different input data than MGL, which causes some variance in the numbers in their database compared to numbers you may have seen posted elsewhere. This could be as simple as a difference in how scorers judge balls in play, which points to one of the limitations of our current defensive metrics. This has been discussed on this thread on The Book blog so I won’t get into that here.
Still, I feel that UZR is a sound system, and while I will still use standard zone rating as at least part of any defensive analysis I do, I would like to incorporate UZR as well, as long as it passes the sniff test.
So what better way to run a sniff test than look at the last few years of data for the 2009 Yankees and see what it says?
UZR doesn’t include catchers, so I’ll skip them for now. I’m going to use a weighted average of the last four years of data, with some regression towards the mean included.
First, here’s the list of all the Yankees’ 2005-2008 UZRs and their 2009 projections (using a 4/3/2/1 weight, most recent season weighted heaviest, and adding in 150 league-average games to regress towards the mean). Obviously, sample size is a concern for some players, like Brett Gardner, so take that into account.
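For anyone who wants to play along at home, here’s a minimal sketch of how a projection like this could be put together. It is not the actual spreadsheet: the function name, the input layout, and the choice to average over only the seasons a player actually has are all my assumptions, and the sample numbers are made up for illustration.

```python
def project_uzr_per_150(seasons, weights=(4, 3, 2, 1), regression_games=150):
    """Project UZR per 150 games from up to four past seasons.

    seasons: list of (defensive_games, uzr) tuples, most recent first.
             Use None for a season the player did not play at the position.
    weights: season weights, most recent first (4/3/2/1 as in the post).
    regression_games: league-average (0 UZR) games added to regress
                      towards the mean.

    Assumption: weights are renormalized over the seasons that exist,
    so a missing season does not count as a season of zeros.
    """
    weighted_games = 0.0
    weighted_uzr = 0.0
    used_weight = 0.0
    for weight, season in zip(weights, seasons):
        if season is None:
            continue  # skip missing seasons entirely
        games, uzr = season
        weighted_games += weight * games
        weighted_uzr += weight * uzr
        used_weight += weight

    if used_weight == 0:
        return 0.0  # no data at all: project league average

    avg_games = weighted_games / used_weight
    avg_uzr = weighted_uzr / used_weight

    # Adding league-average games only grows the denominator,
    # which pulls the rate towards zero.
    return 150 * avg_uzr / (avg_games + regression_games)


# Made-up example: +10 UZR in 150 games, four years running.
four_years = [(150, 10.0), (150, 10.0), (150, 10.0), (150, 10.0)]
# Made-up example: one 100-game season at -5 UZR, nothing before it.
one_year = [(100, -5.0), None, None, None]
```

With these made-up inputs, the steady +10 fielder projects to +5 per 150 games, because the 150 league-average games cut his rate in half. The smaller a player’s sample, the harder he gets pulled towards zero.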
DG: Defensive games.
exO: Expected outs. The number of outs plus reached base errors that would be made by an average fielder given the distribution of balls in play while that fielder was on the field.
RngR: Range runs. The number of runs above or below average a fielder is, determined by how the fielder is able to get to balls hit in his vicinity.
ErrR: Error runs. The number of runs above or below average a fielder is, determined by the number of errors he makes as compared to an average fielder at that position given the same distribution of balls in play.
UZR: Ultimate zone rating. The number of runs above or below average a fielder is in both range runs and error runs combined.
UZR/150: UZR pro-rated to 150 games.
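The last two definitions are simple arithmetic, so here’s a small sketch of them based only on the definitions above. The function names are mine, not Fangraphs’, and the sample numbers are made up.

```python
def uzr_total(range_runs, error_runs):
    """UZR as defined above: range runs plus error runs."""
    return range_runs + error_runs


def uzr_per_150(uzr, defensive_games):
    """Pro-rate a UZR total to 150 defensive games."""
    if defensive_games <= 0:
        raise ValueError("defensive_games must be positive")
    return uzr * 150 / defensive_games


# Made-up example: +4 range runs and -1 error runs over 75 defensive games.
example_uzr = uzr_total(4.0, -1.0)
example_rate = uzr_per_150(example_uzr, 75)
```

Here the made-up fielder is +3 UZR in half a season, which pro-rates to +6 per 150 games. Keep in mind that pro-rating a small sample just scales up the noise along with the signal.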
And here’s a rough stab at how the starters project over a full season.
The outfield looks like it’ll be mix and match, so I tried to account for that somewhat, but obviously, there will be bench time in here that I haven’t accounted for, both in the infield and the outfield. If the Yankees go for an offensive middle infielder, they’ll probably be a few runs worse than this. If they go for a glove guy, they shouldn’t change much.
Still, -14 isn’t bad. Last year’s team was -39 according to these same statistics, although more than half of that was Bobby Abreu’s -25.
And there’s a pretty major math issue with the numbers above. The projection for anyone with fewer than four seasons at a position is wrong. Update after the jump.
So yeah, like I said, I messed up the numbers for players with fewer than four seasons at a position. Let this be a warning: stay away from spreadsheets after midnight. Here’s what the numbers should look like.
And what that means…
The infield doesn’t really change, but the outfield gets much better. Still, we have sample size issues with just about everyone out there, so take that into account. But it is entirely possible the Yankee defense will be average or (gasp) above average this year.