Wednesday, January 6, 2010
2009 vs. 2008 Yankees by Position (Defense Edition)
I'm using the same idea as in yesterday's post, just looking at the total runs saved compared to average by position instead of by individual. I don't have my catcher defense spreadsheet handy, so for now that's not included.

| YEAR | TM | POS | zRS | uRS | aRS | Diff |
|---|---|---|---|---|---|---|
| 2008 | NYY | 1B | -13 | -9 | -11 | |
| 2009 | NYY | 1B | 9 | -3 | 3 | 14 |
| 2008 | NYY | 2B | -5 | -7 | -6 | |
| 2009 | NYY | 2B | -4 | -5 | -4 | 2 |
| 2008 | NYY | 3B | -7 | -9 | -8 | |
| 2009 | NYY | 3B | -13 | -15 | -14 | -6 |
| 2008 | NYY | CF | 10 | 2 | 6 | |
| 2009 | NYY | CF | 3 | 9 | 6 | 0 |
| 2008 | NYY | LF | -11 | 7 | -2 | |
| 2009 | NYY | LF | -3 | -12 | -7 | -5 |
| 2008 | NYY | RF | -22 | -24 | -23 | |
| 2009 | NYY | RF | -5 | -3 | -4 | 19 |
| 2008 | NYY | SS | 2 | 0 | 1 | |
| 2009 | NYY | SS | -1 | 7 | 3 | 2 |
| 2008 | NYY | Total | -46 | -40 | -43 | |
| 2009 | NYY | Total | -15 | -21 | -18 | 25 |
zRS: Defensive runs saved above average using zone rating
uRS: Defensive runs saved above average using UZR
aRS: Average of zRS and uRS
Diff: 2009 aRS minus 2008 aRS. A positive number indicates an improvement at the position, a negative number indicates a decline.
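As a rough sketch of how the aRS and Diff columns follow from the definitions above, here is the arithmetic for two positions in Python. The dictionary layout and function names are my own for illustration; the table shows whole numbers, so half-run averages at other positions may differ from it by rounding.

```python
# zRS and uRS values copied from the table above: (zRS, uRS) per year.
ratings = {
    "1B": {2008: (-13, -9), 2009: (9, -3)},
    "RF": {2008: (-22, -24), 2009: (-5, -3)},
}

def a_rs(z_rs, u_rs):
    """aRS: plain average of the zone-rating and UZR run estimates."""
    return (z_rs + u_rs) / 2

def diff(position):
    """Diff: 2009 aRS minus 2008 aRS (positive means improvement)."""
    return a_rs(*ratings[position][2009]) - a_rs(*ratings[position][2008])

for pos in ratings:
    print(pos, a_rs(*ratings[pos][2008]), a_rs(*ratings[pos][2009]), diff(pos))
```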
The 3B decline is in large part due to Messrs. Ransom and Berroa (combined -6 in zone rating and -7 in UZR). Otherwise, every position but Yankee LF rated at least as well defensively in 2009 as in 2008.
Comments
Well considering we were talking about defense anyway maybe it makes more sense to move that other conversation over here…
78. Posted at 10:43:27 am on Wednesday, January 6, 2010 by MC in VA
My general impression is that defensive projections come with bigger error bars, so yeah I’d say that building a defense first team entails more risk. OTOH, even the best hitters can have an off-year now and then, so who knows?
It’s not just error bars in projections that I’m talking about. The numbers used to calculate the value of offense have been tracked for close to 100 years, so there is a great deal of history behind them. Even new statistics can be run through that long history to see how well they correlate with winning.
It’s my understanding that the numbers used to calculate UZR have been around for less than 10 years, depend more on value judgments from individuals, and don’t take into account things like defensive positioning, which can have a real effect on the numbers. Furthermore, positioning can be more of a product of a certain team’s scouting reports and preparation. I mean if there is that short of a history and they are still working out kinks on things like that (and adjusting for the odd left field in Fenway) wouldn’t the numbers themselves be less reliable?
It’s not just error bars in projections that I’m talking about.
wouldn’t the numbers themselves be less reliable?
Isn’t this the same thing? It’s not just the small mathematically determined confidence interval, but it’s a further decrease in the confidence interval because of limitations in the metric.
Isn’t this the same thing?
No. For instance, projecting certain stats like BA with RISP is hard because there is a lot of variance year to year. It’s not seen as a repeatable, consistent skill. The error bars in projecting a stat like that would be quite large. That’s why people don’t do it.
However given the measured offensive stats you can come up with a tighter confidence interval as to how much value those numbers would typically have in that general situation.
From the other thread.
UZR is a tool. It should be used in conjunction with other tools. For example, if scouts say a player is an elite defender, fans (e.g. Tom Tango’s Fans Scouting) say he is an elite defender, and UZR says he is +45 over a 3 year period, you’re pretty sure he is elite.
Yes you can get an idea he is elite but can you say with confidence that he is a +15 runs player defensively? Can you assign that run value to his defensive contribution and be as confident as you are in similar run values assigned to his offensive contribution?
However given the measured offensive stats you can come up with a tighter confidence interval as to how much value those numbers would typically have in that general situation.
You’re misunderstanding what I’m saying.
We get a confidence interval from the actual data. And then we change it when we factor it in to how we feel about a player because we know the metric has its limitations.
We agree on the limitation of the defensive data compared to the offensive data, which results in error bars arrived at deterministically.
You said: “I mean if there is that short of a history and they are still working out kinks on things like that (and adjusting for the odd left field in Fenway) wouldn’t the numbers themselves be less reliable?”
To which I replied: “It’s not just the small mathematically determined confidence interval, but it’s a further decrease in the confidence interval because of limitations in the metric. ”
So the error bar changes again, because we know the metric is in its relative infancy.
In addition to having a crappy confidence interval on your data based just on the numbers themselves, it gets even worse because you question the data collection process in the first place.
Can you assign that run value to his defensive contribution and be as confident as you are in similar run values assigned to his offensive contribution?
Sort of. It’s easy to measure the value of an extra out, but it is hard to figure out what other fielders would have done (ie, find the league average). Which is why hopefully hit-fx will replace all of the icky parts in UZR.
Sort of. It’s easy to measure the value of an extra out, but it is hard to figure out what other fielders would have done (ie, find the league average). Which is why hopefully hit-fx will replace all of the icky parts in UZR.
Eventually. Depending on how it works out we may eventually get enough data to say that a ball hit this hard, with this trajectory, in this direction, that landed in this zone, in this amount of time after making contact with the bat is on average caught. Then you can take those numbers and compare them to the actual run prevention numbers and see if there is a correlation. Also see how much variation there is from year to year for players.
But with the limited amount of data (I think HIT f/x just became available in 2009 right) how long is it going to take to get enough data to close the gap between the confidence intervals on those numbers so that they are reasonably comparable to the offensive numbers? And more importantly, that gap is not comparable right now so there is risk in doing a Bay for Cameron swap?
Yes you can get an idea he is elite but can you say with confidence that he is a +15 runs player defensively?
I had originally written up a “player A|player B” comparison but abandoned it. I would say that over 3 years of data, you are fairly confident that - over the past 3 years - he was +15, +/-5 runs.
Can you assign that run value to his defensive contribution and be as confident as you are in similar run values assigned to his offensive contribution?
No, not as confident. But that’s more a part of risk-management than anything, right? You take your *best guess*, which is backed up by data. At this point the best data is probably UZR. You then have to let the person using the data know the risk. E.g. “Cameron looks to be about 20 runs better than Bay on defense, but we could be off by as much as 10 runs”. Then the GM (probably the person using the data) needs to determine based on that level of risk, how much offense he is willing to give up, how much he’s willing to pay, etc.
Now, for us, I have no problem using UZR to discuss players. It’s nice to open a conversation, and can lead to someone having real data. E.g. Fabian may have 3 years of scouting reports on Gardner in the minors to help confirm or refute the UZR, or maybe show a path of improvement. It can be nice to start with questions like, “what would it look like if you flipped Gardner and Granderson?”, or, “how much does Cameron improve Boston over Bay?”. I think it is a great starting point for a conversation. We just have to realize that just because Cashman may not want to shift Granderson to left, doesn’t mean he’s ignorant of good fielding, but that other data may indicate Granderson is as good or better than Gardner.
I probably come off sometimes as being 100% confident in UZR; I’m 100% confident it is the best system we currently have. I’m also 100% confident it can be improved, and 100% confident it is not the final say. I’m often defending it, because those who aren’t fans come off as (my impression), “this stat isn’t very useful, don’t use it”. Which I think is a huge mistake.
The people who knock Swish’s defense probably have forgotten how bad Abreu was.
Also see how much variation there is from year to year for players.
yeah, hit f/x will hopefully let us tell the difference between fluctuations in UZR caused by talent change and fluctuations caused by an easy/hard ball-in-play distribution.
Not that I did any analysis, but the BIP distribution for a fielder is probably more important than BABIP for a hitter, since there is no fielding equivalent to working a walk or hitting a home run. So in terms of expected value, fielding could always be less predictable than hitting.
I probably come off sometimes as being 100% confident in UZR; I’m 100% confident it is the best system we currently have. I’m also 100% confident it can be improved, and 100% confident it is not the final say. I’m often defending it, because those who aren’t fans come off as (my impression), “this stat isn’t very useful, don’t use it”. Which I think is a huge mistake.
I’d agree. I’m not trying to say it has no value at all. I’m just trying to call into question the exact run value it assigns.
For instance, I’d have no problem giving the Gold Glove to Franklin Gutierrez but I wouldn’t be handing him a $20M AAV free agent contract or giving him top 10 MVP votes or treating him as equal to a player like Matt Kemp.
Well, Gutierrez just got a four year deal for $20M total, so you needn’t worry about that one at least.
yeah, hit f/x will hopefully let us tell the difference between fluctuations in UZR caused by talent change and fluctuations caused by an easy/hard ball-in-play distribution.
It seems like one way to do this now would be to take data on how far off the players were from their projections and plot it as a function of specific pitchers and managers (i.e., a manager/pitcher is consistently being given players with projected performance X and they consistently end up with Y*X (Y<1, Y>1)). Remove chaff by leaving out players with suspicion of an underlying injury or those that had a change in scenery (ballpark) and see what falls out. Maybe nothing, but maybe something.
Andre Dawson the only one elected. WTF?
[14] Really? I can’t find a link. Got one?
The HoF voters are a joke.
Here’s the link.
Absent any sensible argument why the estimates of x made by smart people are systematically low, or scaled wrong, or whatever, the procedure recommended by science is to use x at face value and weight it appropriately by the associated estimated uncertainty when adding to other measurements, esp. if the effect is larger than the uncertainty. And in the case of individual data points one can of course say, this seems exceptional compared to historical results or whatever. But one can’t reasonably a priori say of a metric made by someone careful, “the uncertainty is large therefore I’m going to assume the central value is too high”.
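One hedged way to picture that procedure: add the offensive and defensive estimates at face value and carry both uncertainties along, combining them in quadrature. This assumes the two errors are independent, and the run values and error bars below are made up for illustration, not real player data.

```python
import math

def combine(estimates):
    """Sum (value, sigma) pairs; sigmas add in quadrature if errors are independent."""
    total = sum(value for value, _ in estimates)
    sigma = math.sqrt(sum(s ** 2 for _, s in estimates))
    return total, sigma

# Hypothetical player: +30 runs on offense (+/-5), +15 on defense (+/-10).
total, sigma = combine([(30.0, 5.0), (15.0, 10.0)])
print(f"{total:+.1f} runs, +/- {sigma:.1f}")
```

The larger defensive error bar dominates the combined uncertainty, but it does not move the central value up or down.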
It’s not even that they’re just a joke, they’re also kinda MEAN.
“Oh, we’ll vote for you, Blyleven, but only when it’s your last shot. We’ll vote for you, Alomar, but you’re going to have to wait for it.”
Some of these guys bitch about the “stat-geeks” out there, but I think wielding your “power” like this is extremely “geeky.”
I have a couple of defense-related questions/observations:
1. Can anyone explain why UZR/150 ratings seem so inconsistent from season to season? If you look at UZR/150 for virtually any player, the player always seems to have both above- and below-average (or “replacement”—whatever the standard is) defensive seasons within a span of just a few years. Why is that? In terms of hitting, you obviously see season-to-season variation in a player’s performance, but above-average hitters tend to remain above average throughout their prime years. They don’t often go from being significantly above average one year to significantly below average the next and then back up to above average again. The only players you would expect to toggle back and forth this way would be truly average hitters, but even they would be expected to stay within a fairly predictable range.
2. It seems to me, intuitively, that you can only improve your team so much by adding to the defense. There’s a point of diminishing marginal returns. Take Adrian Beltre. According to one article I read, he’s so good at going to his left, he routinely fields balls hit to SS just because he can. If that’s true, he must be fun to watch; but how valuable is his ability to field balls hit to SS if his team already has a perfectly good glove at short? Similarly, how valuable is it to have an accurate infield arm if you have a 1B whose major defensive skill is the ability to scoop throws from all over the place?
Hitting stats are different in this respect, it seems to me. If you stock your team with elite hitters, you will realize all or nearly all the offensive production each individual adds. That’s because there is no theoretical limit to the number of runs you can score in a game. It doesn’t work that way with defense. No matter how good your defense is, they still can’t record more than 3 outs per inning. And if you have a strong pitching staff that Ks a lot of batters, it’s more like 2 outs/inning.
I don’t know if or how this point is addressed sabermetrically, but it seems to me that a team that is already above average defensively would be much better off adding offense than trying to make its stingy defense marginally more stingy.
[10]
Unless hitting production is inherently less consistent than fielding production by enough to overcome the greater unreliability of fielding stats at present. I think most of us would guess that fielding, like speed, is less subject (distinguish from ‘not subject’) to ‘streakiness’ or bad seasons than is hitting.
I hate the function that automatically converts punctuation to stupid faces. (frown)
[20]
1. I think it likely has something to do with how the average moves. For example, ARod may put up an .800 ZR over 1300 innings at 3B in 4 straight years, but if the average goes up and down based on the other players in the league, his numbers will look either very good, very average, or very bad.
2. I agree that defense is subject to getting to a point of diminishing returns, but do we see any team at that point yet? The Mariners had the fewest RA last year (692). The next ‘group’ covers 732 to 771 RA and excludes only the Royals, Indians and Orioles (and Mariners.) I’d say that ‘middle of the pack’ is allowing enough runs that improving defensively isn’t very close to being overpriced for marginal gain.
[21] But, the data (taken at face value) seems to indicate fielding is less consistent than hitting.
This actually wouldn’t surprise me, if true, b/c minor leg injuries that would have almost no impact on hitting probably wreak havoc with fielding.
But, the data (taken at face value) seems to indicate fielding is less consistent than hitting.
I would argue that as of right now, the data indicate that fielding a ball is a less consistently maintained skill, but that all of the blame is given to the fielder. Obviously some of the blame should be given to the pitcher.
Snapper, I don’t think this accounts for it. There were years when, according to the measure (I actually don’t remember whether it was ZR or UZR and can’t look right now), TTToorrriiiiii Hunter kept shifting from elite to criminal and back. That would have to be some real havoc-enwreakment. Especially since the presumptive injury seems to have gone unnoticed by the world.
[20] Re fluctuations in UZR, I think one is supposed to average several seasons to get a good statistical sample, and then of course there are fluctuations in player talent/health/coaching/environment that aren’t always easy to account for.
“2. It seems to me, intuitively, that you can only improve your team so much by adding to the defense.”
I think here you’re double-counting. If A takes away a play from B, I believe the standard metrics will award A and punish B. In the limit where Beltre covers the entire field from 3b, it’s true that it’s a waste to spend money to put a good glove in RF, but my sense is that this isn’t a likely scenario or a big effect compared to the large systematics in current defensive metrics.
And it should be noted that for good teams, preventing marginal runs is more important than scoring them - I calculated a factor of iirc 1.3 for the Yankees in a thread last year.
[21], and sort of [20]:
I agree that fielding ability is probably more consistent, but fielding production is less consistent. The difference is that production is a function of opportunity and ability, while ability stands alone. Opportunity is what I’m referring to when I mention the ball-in-play distribution (which is my word of the day, apparently.)
Oh, also USS Mariner commented on defensive diminishing returns the other day. The argument was basically that fielders don’t overlap that much.
It would seem that, over time, fielding should vary more or less as SB varies (at least when the SB number is sufficiently large that one or two SB isn’t a huge % change).
Over the course of a season, catcher-pitcher combinations surely vary for basestealing as much as pitchers and ball-in-play distribution vary for fielders, no?
And I would doubt that basestealers’ basestealing stats vary the way UZR & ZR tend to vary; when there’s a significant drop-off, it’s almost always connected with an identifiable (and identified) injury; absent such an injury, it just doesn’t seem to vary as UZR & ZR do.
But one can’t reasonably a priori say of a metric made by someone careful, “the uncertainty is large therefore I’m going to assume the central value is too high”.
Sure one can, especially if one can identify systemic effects as you allude to. The argument that UZR and its ilk have more uncertainty than just that which is measurable statistically is really just an argument that we are not yet accurately measuring everything that contributes to defensive effectiveness. And there’s nothing unreasonable about that kind of argument.
The HoF voters are a joke.
Ehh. Blyleven missed by five votes. Alomar by eight. When you require 75% for election, it only takes a relative handful of jokes to screw things up. They’ll both get in next year. We can only hope that Morris’ support begins to wane.
Please forgive my total ignorance of how defensive metrics are calculated.
Does UZR (or any similar metric) take into consideration how non-out balls are played? I wonder if there are outfielders, for example, who consistently turn doubles into singles or vice-versa based on how they play balls after they drop (ignoring throwing).
To be even more explicit: a ball is hit five times each to two different center fielders in a certain spot where we wouldn’t expect an average center fielder to make an out. One center fielder always plays it on a short hop and gives up a single. Another always lays out and catches it one time out of five and gives up a double the other four times. Who looks better in UZR, and does that jibe with the expected run values for those events?
Blyleven may not demonstrate that they’re a joke, MC, but Raines surely does.
But one can’t reasonably a priori say of a metric made by someone careful, “the uncertainty is large therefore I’m going to assume the central value is too high”.
“Sure one can, especially if one can identify systemic effects as you allude to.”
Hence “a priori” - I meant, looking at the metric as a black box which turns outs or non-outs into runs prevented relative to the league there’s no reason to prefer overestimate to underestimate.
“The argument that UZR and its ilk have more uncertainty than just that which is measurable statistically is really just an argument that we are not yet accurately measuring everything that contributes to defensive effectiveness. And there’s nothing unreasonable about that kind of argument.”
I still don’t see why one gets to pick a sign without particulars of the systematics to point to, unless one consistently uses something like a Bayesian prior that all metrics have no information about performance or one has some outside accounting for the information in the system and all the correlations.
Raines demonstrates the ignorance of a large chunk of the electorate, which I do not consider a joke. At least he picked up votes over last year. He’s way ahead of where Jack Morris was in his second year on the ballot.
Alomar illustrates the “he’s not a first ballot HOFer” crap, which is a joke, but as I tried to point out, can’t be held against the 397 voters who did support him. Blyleven is a case of the pettiness and plain meanness of a shrinking fraction of the voters, so again, can’t be a valid sweeping criticism of the entire electorate.
I’m amused at the fact that many players have to “percolate” for a while before being voted in. Nothing changes in their resumes, but voters are swayed by the fact that x% of other voters had given the okay.
[34] I’m not sure what you mean by “pick a sign.” I don’t view defensive metrics as systematically over- or under-estimating actual defensive ability or performance. I view them as being inherently inaccurate, primarily because AFAICT, all chances are treated as equal with respect to all parameters other than location. So one player’s UZR may err on the high side in one season and on the low side the next.
[32]Does UZR (or any similar metric) take into consideration how non-out balls are played?
Not in the sense that you’re asking about. It’s all about whether or not a BIP was turned into an out. Plays not made are converted to runs by using an average run value for balls in a given zone (i.e.— most plays missed by SS and 2B are singles while many plays missed by OFers are XBH).
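To put rough numbers on the two center fielders from [32], here is a sketch using generic linear-weight run values. The values for a single, a double, and an out are assumptions of mine (ballpark figures, not UZR's actual zone values), so only the direction of the comparison means anything.

```python
# Assumed average run values per event (illustrative, not taken from UZR).
RUN_VALUE = {"single": 0.47, "double": 0.77, "out": -0.27}

def runs_allowed(events):
    """Total expected runs for a dict of {event: count}."""
    return sum(RUN_VALUE[event] * count for event, count in events.items())

safe_cf = runs_allowed({"single": 5})              # always plays it on a hop
diving_cf = runs_allowed({"out": 1, "double": 4})  # one catch, four doubles

print(f"safe: {safe_cf:.2f}  diving: {diving_cf:.2f}")
```

Under these assumed values the diving fielder gives up more expected runs overall, even though a measure that only counts outs would credit him with the one extra play.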
[36] Well, if you’re amused then maybe it is a joke.
I’m amused at the fact that many players have to “percolate” for a while before being voted in. Nothing changes in their resumes, but voters are swayed by the fact that x% of other voters had given the okay.
Except for the first year, my guess is that percolation happens because most writers vote based on how well they remember the name of the guy. After 5 years with no news stories, you need to have 3-7 years of “didn’t make it” stories to reenter the BBWAA collective consciousness.
Aside from the empirical data related issues such as how wide or narrow a confidence interval is, one should be able to assess how valid a measure is based on its metrological properties. What is important about the a priori situation is what the methodology says about the metric, its theoretical basis and its development.
For example, one could have a much better sense of how valid UZR derived inferences are if the interrater reliability of the metric was well defined. I may just not be familiar with that literature, but I haven’t seen much data defining the metrological basis for UZR.
[34] “I’m not sure what you mean by “pick a sign.””
Maybe I’m wrong, but I wasn’t hearing “His UZR is +15 but I think that means he’s really +5 to +25” above, e.g. here:
“For instance, I’d have no problem giving the Gold Glove to Franklin Gutierrez but I wouldn’t be handing him a $20M AAV free agent contract or giving him top 10 MVP votes or treating him as equal to a player like Matt Kemp.”
This sounds like an “overestimate” stance to me, or perhaps one that is tacitly mixing risk projection and value projection together. No one paying attention claims offensive and defensive metrics have equivalent relative uncertainty.
[41] Eh, maybe you’re not wrong. I wasn’t speaking for him, but FWIW I’d guess that his position is more “mixing risk projection and value projection together” rather than taking an “overestimate” stance on UZR.
I’m amused at the fact that many players have to “percolate” for a while before being voted in. Nothing changes in their resumes, but voters are swayed by the fact that x% of other voters had given the okay.
I’ve seen this theory presented in a few different places now, including on MLB Network (by, I think, Costas) right after the vote announcement. It’s all just speculation, so take it for what you will, but I found it interesting.
The theory is that voters treat it like different levels of the Hall of Fame. Not just for the first ballot but for all the ballots.
First-ballot guys are your clear best players, your “small hall” players if you will, whereas especially borderline guys like Rice have to wait until their very last ballot.
I know Rice tried to pretend that there is no difference between a first ballot HOFer and a last ballot HOFer in his speech, that they were all just HOFers but that doesn’t seem to be entirely accurate. Some people seem to think that votes happen like that for a reason.
Blyleven may not demonstrate that they’re a joke, MC, but Raines surely does.
I’d go with Segui myself.
I’d be interested in knowing how much turnover there was in the electorate between Rice’s first and last years on the ballot.
Although much was written about him being an “old-school” candidate and some have speculated that he benefited from an anti-stats backlash, I have to think that part of the slow increase in support for a guy like him is just a result of writers who were young and impressionable when he was in his prime finally getting to vote.
I wonder how big a % that might be, and I wonder what the real “old-school” guys who didn’t support Rice in his first few tries (and died before he got elected) would have thought about him getting in.
“I know Rice tried to pretend that there is no difference between a first ballot HOFer and a last ballot HOFer in his speech, that they were all just HOFers”
I didn’t read the speech, but having to make such an argument pretty much means the argument is wrong.
“guys who didn’t support Rice in his first few tries (and died before he got elected)” [em. added]
Maybe “most feared hitter” is more accurate than I thought.
[47] If I was a writer I wouldn’t be able to scribble down Gary Sheffield’s name fast enough. No way I’m messing with him.
This sounds like an “overestimate” stance to me, or perhaps one that is tacitly mixing risk projection and value projection together. No one paying attention claims offensive and defensive metrics have equivalent relative uncertainty.
Initially that was all I was getting at. Going from Bay to Cameron basically shifts around 40 offensive runs to defensive runs. Since you agree they don’t have equivalent relative uncertainty then you are affirming my original point which was
“The Red Sox are being smart about it and going for short term deals but still, that is pretty big risk no?”
However then I did continue to argue that in addition to the risk concerns that there were also value/accuracy concerns. To compensate for those concerns and risk I would be more likely to drift towards the under on UZR values.
LoHud has some notes from Cash.
[49] The first part sounds ok to me, as long as we’re all clear we’re talking about risk aversion, though I wouldn’t sign on to “big” necessarily [even if LF in Fenway is a special situation - I assume their FO has done its work]. And note that for a team as good as the Red Sox, 40 extra runs prevented is about equal to 52 extra runs scored by my toy calculation. And of course one needs to compare the contracts involved - maybe the downside risk is entirely covered by definite improved resource allocation elsewhere.
As far as I can tell “value/accuracy concerns” need some at least handwaving support to assign a sign.
Per [50] - “I don’t need a left-handed bat for the outfield,” Cashman said.
Damon, change of address forms are available at every USPS location, although I can’t guarantee those locations are merman accessible.
note that for a team as good as the Red Sox, 40 extra runs prevented is about equal to 52 extra runs scored by my toy calculation
So then trading 40 runs of offense for 40 runs of defense makes them one win better. I can certainly see how the greater inherent uncertainty in whether you’re really going to get those 40 runs on defense could make one lean the other way.
As far as I can tell “value/accuracy concerns” need some at least handwaving support to assign a sign.
Well in a perfect world there would be some support provided to validate the stat before it was accepted. Maybe there is and I don’t know it, but were there any attempts by the creators to validate the stat’s accuracy? Has anything been released that shows a correlation coefficient for team UZR to lower/higher than expected team ERA? Have there been any attempts to define the standard deviation or a 95% confidence interval given the number of chances?
[54] Pretty sure all that sort of stuff is available if one looks.
Incidentally, from tangotiger in comments to this post:
“That is, whatever you think of someone with 450 PA, that’s how you think of someone with 250 UZR chances. They are equally reliable.
[...]
When I’ve done the correlation, I get r=.50 for UZR at a mean of around 100 games. (And, as I said about 50 games for wOBA.)”
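A common reading of an r = .50 figure is that, at that sample size, you regress the observed number halfway to the league mean. Here is a minimal sketch of that idea, assuming the simple form r = n / (n + k), with k set to the sample size where r = .50 (about 100 games for UZR per the quote). The function name and the example numbers are mine, not tangotiger's.

```python
def regress_to_mean(observed, n, k, league_mean=0.0):
    """Shrink an observed value toward the league mean using r = n / (n + k)."""
    r = n / (n + k)
    return league_mean + r * (observed - league_mean)

# Hypothetical fielder: +15 UZR in 150 games, with k = 100 games (assumed).
print(regress_to_mean(15.0, n=150, k=100))
```

With n equal to k the estimate is cut exactly in half, which is what r = .50 means under this assumed form.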
But, Ted - didn’t we determine that, if Cashman says that he DOESN’T want Damon, that he’s actually on the verge of signing him?
Also, post 38 is such a joke…
[55] This is only meaningful if you believe that UZR describes what a fielder does defensively and the effect it has on runs allowed as well as wOBA describes what a hitter does offensively and the effect it has on runs scored. No?
One point I have not seen made here:
We already know that defensive metrics come with large uncertainty. Therefore, if we look at how MLB players have performed as a group, one would guess that the outliers of the sample are the least reliable measures, no? Even more so for players with a small track record. Gardner is probably an above average defender, but his numbers were so off the charts that they should be corrected down, not only by just the regression weights used for hitting, but also by adjusting for “winner’s curse”. And winner’s curse is strongly dependent on how much info is lacking. Disclaimer: I don’t know much stats, I may be completely wrong here, just using my math intuition.
My point being that if you build a team around excellent defense as measured by UZR, then you must take care to discount your projections for this effect.
Also, even though for the Yankees a run saved is worth more than a run scored now, even if it is by a factor of 1.3, one must remember that a player neutral offensive production ability is now enhanced by being put into the highest scoring offense of the majors, and that his defensive prowess is going to be weighted down from being in one of the better run prevention environments.
What I mean is: Let’s say the Yankees are as of now expected to score 6 runs per game, and to give up 5 (indulge me).
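The winner's curse point can be illustrated with a toy simulation: give every player a true talent plus measurement noise, pick the best observed numbers, and compare them with the true talent behind them. The talent and noise spreads here are invented for illustration and are not estimates of real UZR noise.

```python
import random

def winners_curse_demo(n_players=2000, talent_sd=5.0, noise_sd=10.0,
                       top_k=50, seed=0):
    """Return (mean observed, mean true talent) for the top_k observed players."""
    rng = random.Random(seed)
    players = []
    for _ in range(n_players):
        talent = rng.gauss(0.0, talent_sd)
        observed = talent + rng.gauss(0.0, noise_sd)  # talent plus noise
        players.append((observed, talent))
    players.sort(reverse=True)  # best observed rating first
    top = players[:top_k]
    mean_observed = sum(o for o, _ in top) / top_k
    mean_talent = sum(t for _, t in top) / top_k
    return mean_observed, mean_talent

mean_observed, mean_talent = winners_curse_demo()
print(f"top group observed: {mean_observed:.1f}, true talent: {mean_talent:.1f}")
```

In this setup the players selected for their extreme ratings are, on average, less talented than the ratings suggest, and the gap grows as the noise grows relative to the talent spread.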
Then a double by the Yankee hitter is worth .826 runs and an out is worth -.363, a total difference of 1.189, while taking away a double in the gap saves the team .781 - (-.304) = 1.085 runs, a factor of 1.189/1.085 = about 1.10. There is a nonlinear contribution effect to be accounted for, and it is not trivial.
‘the outliers of the sample are the least reliable measures, no? Even more so for players with a small track record. Gardner is probably an above average defender, but his numbers were so off the charts that they should be corrected down, not only by just the regression weights used for hitting, but also by adjusting for “winner’s curse”.’
I think this is mostly correct, but we in fact have exterior info about Gardner which is useful - he’s definitely very fast, which is of clear value to a CF. I don’t see his rate stats as being especially unusual, and for such a speedy guy they’re probably not that surprising. And we have info from previous years which has him as an exceptional defender. With a prior that he’s very fast and appears to be a good defender, even regressing him towards average may not be the best approach.
“one must remember that a player neutral offensive production ability is now enhanced by being put into the highest scoring offense of the majors, and that his defensive prowess is going to be weighted down from being in one of the better run prevention environments”
I don’t understand this. My toy MC used pythagenpat and compared a marginal run prevented or scored around values near the 2009 Yankees - I don’t know what correction to this model is appropriate.
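To make the comparison concrete, here is a small sketch of the marginal-run idea. It is not the toy MC mentioned above; it is a plain finite difference on the pythagenpat formula, using the hypothetical 6 scored / 5 allowed per game from earlier in the thread and the commonly quoted exponent constant 0.287 (treat both as assumptions).

```python
def pythagenpat_wpct(rs_per_game, ra_per_game, k=0.287):
    """Expected winning percentage; the exponent floats with run environment."""
    x = (rs_per_game + ra_per_game) ** k
    return rs_per_game ** x / (rs_per_game ** x + ra_per_game ** x)

GAMES = 162
RS, RA = 6.0, 5.0          # hypothetical runs scored / allowed per game
ONE_RUN = 1.0 / GAMES      # one extra run over a season, in per-game terms

base = pythagenpat_wpct(RS, RA)
wins_per_run_scored = (pythagenpat_wpct(RS + ONE_RUN, RA) - base) * GAMES
wins_per_run_saved = (pythagenpat_wpct(RS, RA - ONE_RUN) - base) * GAMES

print(f"run scored: {wins_per_run_scored:.4f} wins")
print(f"run saved:  {wins_per_run_saved:.4f} wins")
print(f"saved/scored ratio: {wins_per_run_saved / wins_per_run_scored:.3f}")
```

With these inputs a run saved comes out worth more than a run scored, though the exact ratio depends on the run environment and exponent you plug in.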
I view them as being inherently inaccurate, primarily because AFAICT, all chances are treated as equal with respect to all parameters other than location.
I believe that is true for ZR. UZR takes into account factors like velocity of the ball (I think just “hard”, “soft”, etc), positioning of the fielder (standard adjustments using batter/pitcher handedness, men on base, etc), etc. And then +/- does something completely different, using video and computers (to more precisely determine angles and velocity and such) along with human subjectivity to analyze plays and determine if the player “should” have had it. We generally compare with UZR since it is freely available and updated regularly. It is true to say that all chances *in a particular bucket* are treated as equal. But the buckets are fairly granular, e.g. “soft GB hit to zone y by lefty pitcher with righty batter with no one on base”, I think is a bucket.
Does UZR (or any similar metric) take into consideration how non-out balls are played?
Even though it has little to do with his throwing arm, I believe (am not 100% sure) that will go into his ARM component of UZR. ARM takes into account both kills, and bases *not* advanced. So theoretically your dive-happy CF would get more points for his “range” rating, but lose bases in ARM by allowing extra doubles and triples. The “safe” OF would lose (more precisely not gain) in range, but gain (or not lose) in ARM. I don’t know if they would equal out.
Can anyone explain why UZR/150 ratings seem so inconsistent from season to season?
I linked the other day to a “10 questions” thing Tom Tango did, and one of them was on UZR. Dave Cameron in the comments pointed out several players (he claims he selected them off the top of his head) that had pretty large wRC differences, I think > 30 runs between two seasons. I tried that myself - pick a person off the top of my head - and thought of Adam Dunn, always noted for consistency on offense. Between 2005 and 2007, using FanGraphs wRAA (so against average)...+36.4, +19.8, +36.2. Same team, I believe same park. And we accept that. But if we saw a fielder who was +7, -10, +7, we’d be up in arms about UZR being inconsistent?
Well, 36.4, 19.8, 36.2 has a much, much smaller noise/signal ratio than 7, -10, 7. The noise is still the same though.
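A quick sketch of that comparison, using the three Dunn wRAA figures quoted above and the made-up +7, -10, +7 fielder. Using the sample standard deviation as the "noise" and the mean as the "signal" is my own simplification, and three seasons is obviously a tiny sample.

```python
from statistics import mean, stdev

dunn_wraa = [36.4, 19.8, 36.2]   # figures quoted in the comment above
fielder = [7, -10, 7]            # hypothetical fielder from the same comment

for label, seasons in (("Dunn wRAA", dunn_wraa), ("fielder", fielder)):
    signal = mean(seasons)
    noise = stdev(seasons)       # sample standard deviation across seasons
    print(f"{label}: mean {signal:+.1f}, spread {noise:.1f}, "
          f"noise/signal {noise / abs(signal):.2f}")
```

The season-to-season spread is about the same size in runs for both lines; what differs is how large it is relative to the average, which is why the fielder's sign flips and Dunn's does not.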
[60] IMHO, “standard adjustments using batter/pitcher handedness, men on base, etc” are not particularly good proxies for “positioning of the fielder” but YMMV. I don’t think the raw data is nearly granular enough, and last I heard neither did the folks who have developed and refined the metrics. Although I’ll admit that I haven’t kept up with this stuff as well as you.
As for “accepting” big swings in offense, I’d say that you’ve got the issue exactly backwards (at least for the context in which the issue was raised here)—everyone does accept that a player’s offensive productivity can vary from season to season, but there are some who seem to be claiming to know that a player will save his team x runs per 150 games, period. Not necessarily anyone here, but certainly some of those who are drooling over the brilliance of young master Theo’s defense-first strategy for 2010.
[62] I think everyone would like “better” ways to handle positioning of fielders. The problem is 1) we don’t have the data, at least consistently, of where the fielders were *actually* positioned 2) was the positioning in question (if non-standard) due to the coaching or the fielder? I.e. if the coaching staff allows the fielder to position himself, does the fielder often position himself better or worse than an average fielder would? 3) If we’re measuring what *does* happen, we don’t want to over-compensate. For example, if a player grounds out to a 3rd baseman who was positioned perfectly for it, he’s still out. A lot of it I think was a compromise - based on the data available they (mostly MGL) felt the numbers more closely matched reality with these adjustments.
And for the second part, I’m not sure of the exact question being asked then. Often, there has been the complaint that UZR stats aren’t stable enough since players show up as VG one year and then poor the next. And that they aren’t as reliable as offensive stats b/c of this. I’m pretty sure several people believe that. And I think the main problem is - as Rilke has alluded to - the sign. If a player is +20 runs offense one year, and +30 the next year, well he still had a good year. If a player is +5 defense one year and -5 next year, it *feels* like (the metric is claiming) he had a good year the first one and a bad one the second one.
Now, I KNOW - I mean KNOW - that isn’t the case, that in both cases UZR is claiming he is about average (whether UZR is right is a different story). But sure, when I see player A as +5 and player B as -5 I think of player A as good and player B as bad, even though I know that is not correct.