Wednesday, January 6, 2010
So How Good Might the 2010 Red Sox Defense Be?
As Yankee fans, we generally keep an eye on our divisional rivals in the Northeast. In 2009, the Red Sox fell short of a strong Yankee team before bowing out to the Los Angeles Angels of California in the ALDS. While the Red Sox had a good overall season, winning 95 games and taking the wild card, one area where they were pretty poor was defense. Here's how the Red Sox rated defensively at each position in 2009 using standard zone rating and UZR.
| TM | POS | zRS | uRS | aRS |
| Bos | 1B | 3 | 5 | 4 |
| Bos | 2B | 7 | 9 | 8 |
| Bos | 3B | -7 | -12 | -9 |
| Bos | CF | -11 | -20 | -15 |
| Bos | LF | -21 | -12 | -17 |
| Bos | RF | -1 | 9 | 4 |
| Bos | SS | -7 | 8 | 0 |
| Bos | Total | -36 | -14 | -25 |
zRS: Defensive runs saved above average using zone rating
uRS: Defensive runs saved above average using UZR
aRS: Average of zRS and uRS
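As a quick check on the aRS column, here is a minimal Python sketch that averages the zRS and uRS figures from the table above. The dictionary and function names are my own. The published aRS values appear to have been rounded from unrounded inputs, so the half-run cases can differ from the table by a run.

```python
# zRS and uRS by position, copied from the table above.
ratings = {
    "1B": (3, 5),
    "2B": (7, 9),
    "3B": (-7, -12),
    "CF": (-11, -20),
    "LF": (-21, -12),
    "RF": (-1, 9),
    "SS": (-7, 8),
}


def average_runs_saved(zrs, urs):
    """Simple mean of the zone-rating and UZR run estimates."""
    return (zrs + urs) / 2


ars = {pos: average_runs_saved(z, u) for pos, (z, u) in ratings.items()}

for pos, value in ars.items():
    print(f"{pos}: {value:+.1f}")
print(f"Sum of position averages: {sum(ars.values()):+.1f}")
```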
The biggest disparity between ZR and UZR is in left field, and the cause is the Green Monster. Standard zone rating counts chances off the wall as playable for some bizarre reason, whereas UZR does not. The two systems also point in opposite directions at shortstop. When UZR is better than zone rating, it usually means the position saw a higher than normal share of difficult chances. Zone rating treats all chances the same, whereas UZR adjusts for batted ball velocity, the handedness of the batter and pitcher, and GB/FB tendencies, which should help account for the difficulty of chances.
Because of the issues with LF and SS, the Red Sox were probably closer to a -14 team than a -36 team. Either way, they weren't very good.
Jason Bay has generally not been a good defender, and as a Met he's no longer their problem. Even though Jacoby Ellsbury is really fast and looks like a good defender, the metrics were less than impressed. Mike Lowell's hip issue severely impacted his lateral range, and it looks like he's not long for Boston at this point.
In signing Mike Cameron, Adrian Beltre and Marco Scutaro as the replacements for Bay, Lowell and Nick Green, the Red Sox added three good defenders. Putting them alongside Kevin Youkilis, Dustin Pedroia and J.D. Drew, who are all good defenders, appears to turn a Red Sox weakness into a strength.
So how much of a strength is it? Here's how CAIRO has what looks like their primary roster projected offensively and defensively.
| Player | Pos | PA | AVG/OBP/SLG | BR | Outs | BRAR | RS | WAR |
| Jacoby Ellsbury | lf | 640 | .291/.348/.403 | 85 | 418 | 18 | 2 | 2.1 |
| Dustin Pedroia | 2b | 650 | .304/.373/.458 | 96 | 408 | 34 | 8 | 4.2 |
| Victor Martinez | C | 575 | .299/.376/.474 | 85 | 359 | 35 | 0 | 3.5 |
| Kevin Youkilis | 1b | 615 | .290/.392/.499 | 99 | 374 | 29 | 4 | 3.3 |
| David Ortiz | dh | 600 | .264/.371/.507 | 94 | 377 | 21 | 0 | 2.1 |
| J.D. Drew | rf | 525 | .269/.383/.472 | 79 | 324 | 23 | 2 | 2.5 |
| Mike Cameron | cf | 600 | .244/.332/.431 | 77 | 401 | 19 | 5 | 2.4 |
| Adrian Beltre | 3b | 575 | .264/.315/.439 | 66 | 397 | 8 | 8 | 1.6 |
| Marco Scutaro | ss | 560 | .273/.354/.388 | 67 | 362 | 18 | 3 | 2.0 |
| Starters Total | | 5340 | .277/.360/.449 | 748 | 3418 | 204 | 33 | 23.8 |
| Player | Pos | PA | AVG/OBP/SLG | BR | Outs | BRAR | RS | WAR |
| Jeremy Hermida | lf | 355 | .259/.340/.427 | 45 | 234 | 8 | -3 | 0.5 |
| Bill Hall | 3b | 234 | .218/.287/.390 | 23 | 167 | 0 | 2 | 0.2 |
| Jason Varitek | c | 215 | .221/.325/.368 | 23 | 145 | 4 | 0 | 0.4 |
| Jed Lowrie | ss | 120 | .251/.332/.414 | 15 | 80 | 4 | 0 | 0.4 |
| Tug Hulett | 2b | 80 | .235/.308/.359 | 8 | 55 | 0 | 0 | 0.0 |
| Bench Total | | 1004 | .238/.321/.399 | 114 | 682 | 16 | -1 | 1.5 |
| Player | Pos | PA | AVG/OBP/SLG | BR | Outs | BRAR | RS | WAR |
| Team Total | | 6344 | .271/.354/.441 | 861 | 4100 | 220 | 32 | 25.5 |
BR: Absolute linear weights batting runs based on estimated playing time, not adjusted for position.
Outs: Outs made while batting. Team outs should add up to around 4100 over a full season.
BRAR: Batting runs above replacement level at position.
RS: Runs saved compared to average, using an average of zone rating and UZR pro-rated to projected playing time.
WAR: Wins above replacement (BRAR + RS, converted from runs to wins).
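The WAR column in the tables above is consistent with dividing BRAR + RS by roughly ten runs per win. The sketch below assumes that divisor; it is inferred from the table rather than stated in the post, and the function name is hypothetical. Because the published figures are built from unrounded inputs, some rows can differ by a tenth of a win.

```python
RUNS_PER_WIN = 10.0  # assumption inferred from the table, not a stated CAIRO constant


def approx_war(brar, rs, runs_per_win=RUNS_PER_WIN):
    """Approximate wins above replacement from batting and fielding runs."""
    return (brar + rs) / runs_per_win


# (BRAR, RS) pairs copied from the starters table.
starters = {
    "Dustin Pedroia": (34, 8),
    "Kevin Youkilis": (29, 4),
    "Adrian Beltre": (8, 8),
}

for name, (brar, rs) in starters.items():
    print(f"{name}: {approx_war(brar, rs):.1f} WAR")
```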
Moving Ellsbury to LF from CF makes him a plus defender according to my projections, so it looks like the Red Sox can run out a defense that is at least average at every position (I'm not including catchers for now, so Victor Martinez may change that). If these numbers are to be believed, the Red Sox have probably made themselves 40 to 50 runs better defensively with the moves they've made this offseason.
Standard caveats about defensive metrics having more uncertainty than offensive or pitching metrics apply here, so don't take this as definitive proof or anything.
I haven't really finalized their pitching depth chart so I'm not going to post it yet, but with the one I have worked up they look like a .598 Pythagenpat team right now, which is 25 points (.025) worse than the Yankees were when I ran their numbers. That's the difference between a 97 win team and a 101 win team in a neutral league, though we probably want to knock off a couple of wins from each team to account for being in the AL East.
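For readers who want to check the arithmetic, here is a small sketch of the Pythagenpat win percentage and the conversion to wins over 162 games. The 0.287 power is a commonly cited value and an assumption on my part; the exact setting used for these projections is not stated above.

```python
def pythagenpat_win_pct(runs_scored, runs_allowed, games, power=0.287):
    """Pythagenpat: the exponent floats with the run environment."""
    runs_per_game = (runs_scored + runs_allowed) / games
    exponent = runs_per_game ** power
    scored = runs_scored ** exponent
    allowed = runs_allowed ** exponent
    return scored / (scored + allowed)


def wins(win_pct, games=162):
    """Expected wins over a season of the given length."""
    return win_pct * games


# The two winning percentages quoted in the post.
print(round(wins(0.598)))  # Red Sox: 97
print(round(wins(0.623)))  # Yankees: 101
```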
So right now, I still think the Yankees are better by a few games, but in a 162 game season that's not much of a difference, and of course a lot can change between now and the end of the 2010 season.
Comments
I’m assuming you’re ignoring catcher defense, correct? V-Mart getting 575 PA as a C and being exactly average behind the dish? I’ll take the under on both PA as a C and RS for him.
Copying it from the previous thread:
One point I have not seen made here:
We already know that defensive metrics come with large uncertainty. Therefore, if we look at how MLB players have performed as a group, one would guess that the outliers of the sample are the least reliable measures, no? Even more so for players with a small track record. Gardner is probably an above average defender, but his numbers were so off the charts that they should be corrected down, not only by just the regression weights used for hitting, but also by adjusting for “winner’s curse”. And winner’s curse is strongly dependent on how much info is lacking. Disclaimer: I don’t know much stats, I may be completely wrong here, just using my math intuition.
My point being that if you build a team around excellent defense as measured by UZR, then you must take care to discount your projections for this effect.
Also, even though for the Yankees a run saved is worth more than a run scored now, even if it is by a factor of 1.3, one must remember that a player’s neutral offensive production ability is now enhanced by being put into the highest scoring offense of the majors, and that his defensive prowess is going to be weighted down from being in one of the better run prevention environments.
What I mean is: Let’s say the Yankees are as of now expected to score 6 runs per game, and to allow 5 (indulge me).
Then a double by the Yankee hitter is worth .826 runs and an out is worth -.363, a total difference of 1.189, while taking away a double in the gap saves the team .781 - (-.304) = 1.085 runs, a factor of 1.189/1.085 = about 1.10. There is a nonlinear contribution effect to be accounted for, and it is not trivial.
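The arithmetic in that comparison can be checked with a few lines of Python. The run values are the ones quoted in the comment; the variable names are mine, and nothing here validates the linear weights themselves.

```python
# Run values quoted above for a team scoring 6 runs per game.
double_for = 0.826
out_for = -0.363
offense_swing = double_for - out_for  # double instead of an out at the plate

# Run values quoted above for the 5 runs per game the team allows.
double_against = 0.781
out_against = -0.304
defense_swing = double_against - out_against  # out instead of a double in the field

ratio = offense_swing / defense_swing
print(f"offense swing: {offense_swing:.3f}")
print(f"defense swing: {defense_swing:.3f}")
print(f"ratio: {ratio:.3f}")
```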
Looking at Youkilis’s WAR over the last few seasons at Fangraphs, 3.3 WAR seems a little low.
SG, do you use the same regression to the mean weight for both offense and defense? Did anyone study how these behave?
This is probably a very hard question, and there are several model assumptions lacking, but let’s say we take the 30 or so CF out there, and assume that their UZR in 2009 is composed of their true talent UZR plus white noise of SD 10 runs. If the best CF had a 40 UZR/150, then probably his true talent level is very much closer to 20-30 than 40. Is there a simple way to estimate what it should be?
My best guess is that if you assume that the true talent distribution is normal with mean zero and SD_t = 15, then the observed distribution is normal with mean zero and SD_o = sqrt(15^2 + 10^2), so you should scale the observed result down by the factor SD_t/SD_o.
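A minimal sketch of that estimate, under the same assumptions (normal true talent with SD 15, independent normal noise with SD 10, league mean zero). One thing worth checking: under this model the textbook posterior mean scales the observation by the variance ratio SD_t^2/SD_o^2 rather than the SD ratio, so it shrinks a bit harder. The function name is hypothetical.

```python
import math


def shrink_to_mean(observed, sd_true, sd_noise, mean=0.0):
    """Posterior mean of true talent for a normal prior and normal noise."""
    var_true = sd_true ** 2
    var_noise = sd_noise ** 2
    weight = var_true / (var_true + var_noise)  # share of observed variance that is talent
    return mean + weight * (observed - mean)


sd_true, sd_noise = 15.0, 10.0
sd_observed = math.sqrt(sd_true ** 2 + sd_noise ** 2)

variance_weight = sd_true ** 2 / sd_observed ** 2  # about 0.69
sd_ratio = sd_true / sd_observed                   # about 0.83, the factor guessed above

print(f"variance-ratio estimate for a +40: {shrink_to_mean(40, sd_true, sd_noise):.1f}")
print(f"SD-ratio estimate for a +40: {40 * sd_ratio:.1f}")
```

Either way, the +40 lands in or near the 20-30 range suggested in the question.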
[3] Agreed. Despite the hate thrown his way by many Yankees fans he’s a very valuable player. Mid to low 4’s best case. (for us). No surprise if it ends up higher.
Y- love the winner’s curse stuff. Third paragraph of [4] had me heading out for a Duvel and turning Fox News on. OK joking about the Fox News part.
I’m assuming you’re ignoring catcher defense, correct?
Yeah, it’s buried in the entry that I’m not including C defense right now, I should have probably spelled it out more explicitly.
SG, do you use the same regression to the mean weight for both offense and defense? Did anyone study how these behave?
I use similar weights as far as weighing the past four seasons, but the regression to the mean for hitters and pitchers is more involved than the one I use for defense. For hitters and pitchers I regress towards a few different means, including position/role and age. The regression component for defense is just adding in some league average performance. The smaller the sample size, the more a player is regressed to the mean. The regression I use keeps the range of player defensive projections in the -20 to +20 range, because I don’t think we can realistically project anyone outside that range.
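To illustrate the "adding in some league average performance" idea, here is a rough sketch. It is not CAIRO's actual code: the amount of league-average ballast (400 chances) and all the sample numbers are made-up assumptions chosen only to show that a smaller sample gets pulled harder toward zero.

```python
def regressed_rate(runs_saved, chances, ballast_chances=400.0):
    """Runs saved per chance after mixing in league-average (zero-run) chances."""
    return runs_saved / (chances + ballast_chances)


def project_runs_saved(runs_saved, chances, projected_chances, ballast_chances=400.0):
    """Pro-rate the regressed rate to the projected playing time."""
    return regressed_rate(runs_saved, chances, ballast_chances) * projected_chances


# Two hypothetical fielders with the same observed rate (0.1 runs per chance).
small_sample = project_runs_saved(runs_saved=10, chances=100, projected_chances=400)
large_sample = project_runs_saved(runs_saved=40, chances=400, projected_chances=400)

print(f"small sample projects to {small_sample:.1f} runs")
print(f"large sample projects to {large_sample:.1f} runs")
```

With the same observed rate, the fielder with a quarter of the chances projects to well under half the runs saved.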
Looking at Youkilis’s WAR over the last few seasons at Fangraphs, 3.3 WAR seems a little low.
Yeah, the regression towards the mean plus the fact that he’s past peak age means CAIRO expects him to be a little worse this year. I know that CHONE has a very similar projection for him.
What does a replacement 1B look like in CAIRO? How many BR?
What does a replacement 1B look like in CAIRO? How many BR?
Something like .249/.310/.410 depending on the park, and around 75 BR in 650 PAs.
[6] - No, I should have read more carefully. I thought I went back to check, but apparently I can’t double check correctly either. Thanks.
[8] A career year by Dougie Mancaveitch!!!
[10] .271/.360/.405.
Youkilis’s offensive projection looks awfully low. He’s slugged well north of .500 in his age 29 and age 30 seasons. 50 points of SLG is quite a regression, no?
Pretty sure all that sort of stuff is available if one looks.
Incidentally, from tangotiger in comments to this post:
That speaks to the wild fluctuations year to year. That would talk about how likely that stat is to be a repeatable skill but that does not speak to the accuracy of the statistic itself. For instance, I can usually find charts like these for a lot of offensive stats that show how well certain stats correlate to runs scored over the history of baseball (or as far back as they can calculate).
http://www.baseballprospectus.com/article.php?articleid=2596
I’ve searched but seen no such pairing of a pitching stat combined with UZR that shows a correlation for how accurately they account for runs allowed. So we can speak to the accuracy of the offensive stats but until we can see the same for the defensive stats it’s hard to say how seriously we should take the stat. I mean is it batting average accurate or is it OBP*1.8+SLG (wOPA) accurate?
You seem pretty confident in the stat. Why? If you know of where I can find these numbers can you point me in the right direction?
Oops. I missed the response in [6] re: [12].
[2] Tit for tat:
“the outliers of the sample are the least reliable measures, no? Even more so for players with a small track record. Gardner is probably an above average defender, but his numbers were so off the charts that they should be corrected down, not only by just the regression weights used for hitting, but also by adjusting for ‘winner’s curse’.”
I think this is mostly correct, but we in fact have exterior info about Gardner which is useful - he’s definitely very fast, which is of clear value to a CF. I don’t see his rate stats as being especially unusual, and for such a speedy guy they’re probably not that surprising. And we have info from previous years which has him as an exceptional defender. With a prior that he’s very fast and appears to be a good defender, even regressing him towards average may not be the best approach.
“one must remember that a player’s neutral offensive production ability is now enhanced by being put into the highest scoring offense of the majors, and that his defensive prowess is going to be weighted down from being in one of the better run prevention environments”
I don’t understand this. My toy MC used pythagenpat and compared a marginal run prevented or scored around values near the 2009 Yankees - I don’t know what correction to this model is appropriate.
Better defense means starting pitchers go more innings, which leads to bottom relievers pitching fewer innings. Has anyone measured this effect? Is it already included in defensive metrics?/Is it significant?
I don’t understand this. My toy MC used pythagenpat and compared a marginal run prevented or scored around values near the 2009 Yankees - I don’t know what correction to this model is appropriate.
I’m not sure either, but offense gets enhanced because, for example, a player not getting out means that somebody else on his team gets an extra PA, and good offensive teams do more with their PAs than bad offensive teams. When the Yankees add a run (as estimated through linear weights created for all teams) they are really getting 1.x runs. I have no idea if that enhancement is as much as the pythagenpat difference between saving and scoring runs on good teams.
[13] Giving the baby a bottle with one hand, but here’s the very first google for “uzr test” for example.
Don’t have anything at my fingertips beyond a suggestion to read The Book, but I’ll keep my eyes open.
“That speaks to the wild fluctuations year to year.”
“Wild” in the sense two years are needed to be as stable as one year of wOBA.
“You seem pretty confident in the stat. Why?”
It’s made by somebody smart and has been used by smart people for years (and the predecessor system has an even longer record), and it has competition from more complex (in part) and much simpler systems. I’m not exactly “pretty confident” in it since I haven’t looked hard or otherwise at the data and no high energy physicists were involved in the development afaik, so I’d be wary of the estimate of the systematic uncertainty, but I see no reason to assume the central value is biased in either direction.
[16, 17] I’ve thought on occasion about both effects, but assuming the pythagenpat people weren’t stupid this should be covered well enough by the empirical formula, though there are certainly higher-order effects which are known to be useful (e.g. linear weight runs instead of real runs).
Incidentally that reminds me that I still want to see an MC study of the value of a marginal pitch/PA and a calculation of the # of P/PA needed to make a 0. wOBA player useful.
[5] Waitaminnit. OTF with the numbers now? Did someone take over his keyboard?
Perhaps the reason Yankees fans hate him so much is BECAUSE he’s so valuable. I don’t know, the guy just seems like a dick. But I have a hazy awareness that fans of other teams may have perceived Paul O’Neill as a dick, so…
Is there any place with historical WAR? I’m curious about O’Neill but didn’t see that # at fangraphs.
Youkilis is about to turn 31 and he was not a regular in the majors until he was 27. It makes all the sense in the world to expect him to start declining. Not fall off a cliff immediately or anything, mind you, just a little slipping each year. It certainly shouldn’t come as a shock if he never has another season as good as his last two.
I’ve searched but seen no such pairing of a pitching stat combined with UZR that shows a correlation for how accurately they account for runs allowed.
It’s virtually impossible to fully separate pitching from defense, and it probably always will be. A great defensive team will still suck at run prevention if the pitchers give up lots of walks and home runs, and never strike anybody out. So the correlation between any defensive metric and run prevention will always be confounded by the quality of the pitching, no matter how sound the metric might be.
[15] I know that for the Yankees -1RA>1RS, if you believe pythagenpat. Remember, I was the one making this case with you.
But what I mean is that, as a decision on what to do, if you look at what 2 players might project in a vacuum:
Player A +10 BR -10 runs saved
Player B 0 BR 0 runs saved
then, since the Yankees lineup is better than average and the Yankees pitchers are also better than average, this should translate to:
Player A instead of player B makes the Yankees score something like 12 runs more in the season, and to allow 8 runs more in the season.
[2 & 15] Hah! I read the other thread and posted my response there, never imagining that discussion would continue here. I’m *not* going to copy the book I wrote, so if anyone cares it is on the other thread.
Here are some links I got from Googling. I don’t know if any of them do the tests some want, but one of them is MGL’s (original?) article describing UZR (I read that way back in 2003 myself), and one is an interview with MGL. I think these can answer a lot of questions with the *how* of UZR, even if not the *why*.
2009 Yankee defense projections w/ SG discussing UZR
2003 article by MGL describing UZR
Also, MGL did an interview HERE several years ago. SG can probably find that faster than I can b/c I think it was on the old site. I’m sure if I reread it I’ll find some embarrassing (for me) questions for him.
I know when I read the original article in 2003 I was very skeptical (I still thought Jeter did enough of the “little things” to be a good defender). However, I was intrigued enough by it to follow discussion here, on THT, and other sites for the past 6+ years, so I’m fairly well convinced. While not perfect, it is very, very good. Quite probably the best we’ve got right now - +/- may be better, but I haven’t read The Fielding Bible 2 to know if they’ve improved it and also the numbers aren’t freely available. Certainly better than thinking a player is a great defender today b/c he was great 5 years ago, or b/c he *looks* great making diving catches on balls other fielders catch easily.
[6] Thanks.
SG, sorry to bother you with these questions, but when you calculate the BR for a player on the Yankees, do you adjust it for the expected run environment, i.e., do you use the linear weights for what you expect the team to produce, or do you use the linear weights for the expected league average run environment? I know the projections change, but is it only park factors and such? I guess it wouldn’t change the batting line of players, just the expected runs scored above replacement.
If you do it for offense, when you do defense and pitching, do you do the same?
Not sure it would matter anyway.
Each team having their own run environment is why pythagenpat does (might) not work for comparing the 10 run offense player to the 10 run defense player.
The 10 run offense player would on average create more real runs on a good team than on a bad team. But that is counterbalanced by a marginal run saved being better than a marginal run scored in pythagenpat.
A simulation should be able to estimate the relative importance of these factors. It might be messy though.
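Short of a full simulation, the Pythagenpat half of this trade-off can be sketched directly. The example below compares adding 10 runs scored with removing 10 runs allowed for a team near the 2009 Yankees. The 753 runs allowed figure appears elsewhere in this thread; the 915 runs scored and the 0.287 power are my assumptions. It does not model the lineup effect described above, where a good offense turns the same events into more real runs.

```python
def pythagenpat_win_pct(runs_scored, runs_allowed, games=162, power=0.287):
    """Pythagenpat win percentage with a run-environment-dependent exponent."""
    exponent = ((runs_scored + runs_allowed) / games) ** power
    scored = runs_scored ** exponent
    allowed = runs_allowed ** exponent
    return scored / (scored + allowed)


RS, RA = 915, 753  # assumed baseline, roughly the 2009 Yankees

base = pythagenpat_win_pct(RS, RA)
plus_offense = pythagenpat_win_pct(RS + 10, RA)  # 10 more runs scored
plus_defense = pythagenpat_win_pct(RS, RA - 10)  # 10 fewer runs allowed

print(f"wins added by +10 runs scored: {(plus_offense - base) * 162:.2f}")
print(f"wins added by -10 runs allowed: {(plus_defense - base) * 162:.2f}")
```

For a team that outscores its opponents, the run saved comes out ahead in this formula, which is the effect the lineup-interaction argument would have to offset.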
Youkilis’s offensive projection looks awfully low. He’s slugged well north of .500 in his age 29 and age 30 seasons. 50 points of SLG is quite a regression, no?
Well, he slugged .429 and .453 in his age 27 and age 28 seasons, which are also part of his projection. A simple weighted average of the last four years puts him at a SLG of .519, then he loses a few singles because of aging and a few more hits in general because of regression. Is it likely he’s established a new level of performance in 2008 and 2009 that means we shouldn’t really factor in 2006 and 2007? Sure, probably, but I wouldn’t change a system designed to work for the general population of MLB players to account for that. I’d just note that there will be projections I don’t agree with. For example, I think Ortiz will be worse than that projection.
SG, sorry to bother you with these questions, but when you calculate the BR for a player on the Yankees, do you adjust it for the expected run environment, i.e., do you use the linear weights for what you expect the team to produce, or do you use the linear weights for the expected league average run environment?
I use league and park factors, but not anything specific with the team. So using someone like Adrian Beltre as an example, he projected to hit .254/.309/.415 with Seattle, a line worth something like 64 BR. Move him to Boston and his line goes to .264/.315/.439 and 66 BR. However, in both cases his BRAR remains at eight, because the replacement level for offense is higher in Boston than in Seattle.
I do something similar when calculating runs saved by a pitcher, but I don’t do a defensive park/league adjustment since defensive metrics are still somewhat imprecise. So say you had Kei Igawa pitching in Colorado, he’d project to have an ERA of around 6.65 and be worth about 20 runs below replacement level. Move him to San Diego, he’d have an ERA of 5.40 but still be worth close to 20 runs below replacement level.
I think this season might serve as an interesting test case for just how valuable defense is, assuming that every Sox player plays all world defense, as ESPN tells me they will. Seriously though, it will be interesting to follow.
It’s virtually impossible to fully separate pitching from defense, and it probably always will be. A great defensive team will still suck at run prevention if the pitchers give up lots of walks and home runs, and never strike anybody out. So the correlation between any defensive metric and run prevention will always be confounded by the quality of the pitching, no matter how sound the metric might be.
Yes and no. If you pair them together you can come up with a good system of estimating runs against. We do have ways of trying to estimate the pitching contribution independent of fielding. That’s what DIPS, FIP, xFIP and the like attempt to do. Now, fresh off declaring that I almost always look at xFIP, I’m going to argue that in this case FIP is the better choice, and for the very reason I usually like xFIP better: FIP’s HR component isn’t normalized, so it doesn’t factor out luck. If we are trying to isolate defense then we shouldn’t remove luck from the pitchers’ component, especially for HR, since fielders have almost zero influence over that.
In that respect the article that rilkefan posted was on the right track but his method was complete crap (no offense to rilkefan, I know you didn’t write it and I haven’t been able to find anything better either, that’s why I asked). What he should have done was convert team FIP to RA, subtract actual RA (not ERA/ER against) from that, and come up with the estimate of runs saved by the defense. Then use that number to generate your r or r^2 with UZR.
Now for the sake of completeness you should do that for all other fielding independent pitching stats and also all other defensive stats (including methods like SG’s average of the two) and do it for the entire history available for those stats. We shouldn’t assume FIP+UZR is the best pair because we think it should be. There might be a better pair but whatever the best pair, you will be able to compare that level of accuracy to the high standard set by the offensive stats. You’ll know how they stack up.
There is an extensive literature in the social sciences on how to create validated metrics for constructs that are analogous to defense in baseball. The validity of a metric is antecedent to how it behaves statistically (e.g. year to year fluctuations).
The basis for constructing a high quality metric is to have a widely validated conceptual model underpinning the measure. Without such a validated conceptual model it is very difficult to address downstream analytic sources of bias e.g. omitted variable bias.
I’ve looked and have not seen this kind of work done for UZR. For example, I haven’t seen formal assessments of UZR’s construct validity and that’s not for lack of trying.
We do have ways of trying to estimate the pitching contribution independent of fielding. That’s what DIPS, FIP, xFIP and the like attempt to do.
No, that’s not at all what they do. Evaluating pitchers in a defense-independent way is not the same thing as trying to correlate pitching-independent defense with total run prevention.
convert team FIP to RA, subtract actual RA (not ERA/ER against) from that, and come up with the estimate of runs saved by the defense. Then use that number to generate your r or r^2 with UZR.
This would be circular. It begs the question in the true sense of that term. Assume that FIP should have near-perfect correlation with RA and then blame all deviation from the FIP RA prediction on the defense.
It just doesn’t make sense to me to expect this kind of construct to stand up to offensive stats, simply because the offensive stats are taken as they are, rather than being judged by what they should have been. We give batters full credit for all of their hits when we total up their batting runs. We don’t say that a particular BIP is worth 79% of a single because it should have been converted to an out 21% of the time.
Isn’t it inherently unfair to think that hypothetical events will correlate as well with run prevention as actual events correlate with run scoring? When we get around to using hit f/x to evaluate both batters and fielders, then I’d expect the metrics developed and refined from that data to work comparably well for both offense and defense.
One major problem with combining defensive stats with fielding independent pitching stats is separating out fieldable BIP from non-fieldable. We shouldn’t assume that every hit that is allowed could have been prevented by the defense, and we also shouldn’t completely ignore the fact that there are varying abilities in pitchers to control the results on balls in play, even if it’s not a large ability.
FWIW, in a thread at Baseball Think Factory discussing WAR, I looked at FIP + UZR versus actual runs allowed for all teams in 2009, and there’s a pretty big range of differences.
Team: FIP - UZR / RA / Diff
Mets: 822 / 757 / 65
Dodgers: 676 / 611 / 65
Cubs: 737 / 672 / 65
Twins: 808 / 765 / 43
Cardinals: 682 / 640 / 42
Reds: 763 / 723 / 40
White Sox: 767 / 732 / 35
Phillies: 738 / 709 / 29
Blue Jays: 796 / 771 / 25
Braves: 665 / 641 / 24
Yankees: 775 / 753 / 22
Orioles: 892 / 876 / 16
Brewers: 829 / 818 / 11
Giants: 621 / 611 / 10
Rangers: 745 / 740 / 5
Tigers: 748 / 745 / 3
Angels: 763 / 761 / 2
Astros: 769 / 770 / -1
Red Sox: 734 / 736 / -2
Padres: 767 / 769 / -2
Mariners: 684 / 692 / -8
Indians: 853 / 865 / -12
Rockies: 702 / 715 / -13
Nationals: 856 / 874 / -18
Marlins: 745 / 766 / -21
Pirates: 737 / 768 / -31
Royals: 801 / 842 / -41
Athletics: 711 / 761 / -50
Diamondbacks: 712 / 782 / -70
Rays: 684 / 754 / -70
The first number is FIP converted to runs allowed. I divided FIP by 0.92 to move it from an ERA to an RA scale, divided it by nine, then multiplied it by the innings pitched to get runs. I then subtracted the team’s total UZR from that number. So if a team was a +20 defensive team and allowed 800 FIP runs, they’d instead show as 780 runs allowed.
The second number is the actual runs allowed, and the third number is the difference between the two. All numbers were pulled from Fangraphs.
In theory, a team with a positive difference allowed fewer runs than you’d expect looking at their FIP and UZR, and a team with a negative difference allowed more.
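The conversion described above can be sketched as follows. The 0.92 factor and the steps come from the description; the function names and the sample team (4.14 FIP, 1450 innings, +20 UZR, 700 actual runs allowed) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
ERA_TO_RA = 0.92  # factor quoted above for moving from an ERA scale to an RA scale


def expected_runs_allowed(fip, innings, team_uzr):
    """FIP-based runs allowed, adjusted for team defense (UZR)."""
    fip_runs = fip / ERA_TO_RA / 9 * innings
    return fip_runs - team_uzr  # a plus defense lowers expected runs


def run_difference(fip, innings, team_uzr, actual_ra):
    """Positive means the team allowed fewer runs than FIP and UZR predict."""
    return expected_runs_allowed(fip, innings, team_uzr) - actual_ra


# Hypothetical team, not taken from the table above.
expected = expected_runs_allowed(fip=4.14, innings=1450, team_uzr=20)
diff = run_difference(fip=4.14, innings=1450, team_uzr=20, actual_ra=700)
print(f"expected runs allowed: {expected:.0f}")
print(f"difference: {diff:+.0f}")
```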
So, in addition to 12 aces, the Red Sox have 13 gold glovers?
[34] Nah, only 12. Papi is only an above average 3rd baseman at this point in his career.
Have there been studies that sought to correlate, for example, UZR for a given set of players with a rating system used by scouts?
[34] Nah, only 12. Papi is only an above average 3rd baseman at this point in his career.
Ironically it is because of the eyedrops, the ball looks blurry on a short hop.
Rich, thanks for the link.
[36]
I have to admit I am really struggling to follow the conversation here, because I am old and feeble, but this seems like a really interesting idea.
So, in addition to 12 aces, the Red Sox have 13 gold glovers?
Boston’s fielding is so good they even have a gold glove DH.
[36] I’ve been doing some searches to see what is out there, and I did find some articles that compared UZR (and other metrics) to the Fans Scouting Report (FSR) used by Tom Tango. I think THT (for whatever timeframe they compared) had like .42 r^2. I’m not an expert on that stuff, but I believe that means they are fairly well correlated. Not strongly, but fairly well. I’ll see if I can go back and find that.
In the meantime, here is part 2 of MGL’s description of UZR, which includes (among other things) park factors. I believe a number under 1 should improve a player’s UZR, and LF in Fenway has .85, by far the lowest PF. So if I’m reading that right, he is adjusting for Fenway, and in fact he’s doing a very LARGE adjustment for Fenway. It’s still possible that adjustment is not sufficient, but…it’s also possible that the adjustment is correct but the players with the most time in LF the past few years were particularly ill-suited to play there.
Now that I think about it, could UZR and FSR correlate because the people doing FSR are the kind of people who might be interested in advanced stats like UZR, and trust them over their eyes?
I think THT (for whatever timeframe they compared) had like .42 r^2.
In order for that kind of analysis to yield valid findings, the individuals participating in the Fans Scouting Report would need to be blinded to the UZR data. Otherwise they could be subject to external event or contamination bias.
So if I’m reading that right, he is adjusting for Fenway, and in fact he’s doing a very LARGE adjustment for Fenway.
Or that the adjustment is correct for visiting fielders but incorrect for fielders who “learn” the wall (learning might mean that the Red Sox organization knows the best way to position them, a way that makes them look bad by UZR).
[44] I’m not sure I get that. UZR doesn’t care how you play the caroms. It’s all about whether you catch the ball. Are you suggesting that the Red Sox intentionally position their LFers so as to concede hits on catchable fly balls (the ones that hit low off the wall) and minimize extra bases (on balls that hit high off the wall)? And further, that this isn’t an obvious enough adjustment for visiting teams to also pick up on?
I was looking forward to seeing what UZR/ZR numbers a purported solid CFer (Cameron) would have put up in LF in Fenway. On top of it being a weird place to play, you had 2 pretty bad LFers out there (Ramirez and Bay) for a while now, so it would have been nice to see what a solid OFer would have looked like playing a season there. Did we lose that chance with Ellsbury moving over there?
Are you suggesting that the Red Sox intentionally position their LFers so as to concede hits on catchable fly balls (the ones that hit low off the wall) and minimize extra bases (on balls that hit high off the wall)? And further, that this isn’t an obvious enough adjustment for visiting teams to also pick up on?
yep.
It would be interesting to compute the UZR of a CFer as if he was a SS, just to see what number you would get. I’m not sure how UZR behaves if you are barely/significantly/massively out of position.
Are you suggesting that the Red Sox intentionally position their LFers so as to concede hits on catchable fly balls (the ones that hit low off the wall) and minimize extra bases (on balls that hit high off the wall)?
I’m not saying that’s what [44] is suggesting, but if it is, I don’t recall it that way. If memory serves, the Red Sox basically have the LF go back to the wall and try to make a play on it, and the CF slides over to get the carom, while the SS comes way out to LF and is really moving around out there to make the cut off a straight shot to 2nd or 3rd base. This happens unless it’s some screaming line drive that is definitely going off the wall, then the LF positions himself for the carom, and you end up seeing a lot of would-be doubles played into singles by a LFer who knows what he’s doing out there.
Here is the link, actually a place called “basement dwellers”. I think this article is pre-FanGraphs, or at least pre it being a “known” entity. Certainly pre UZR being available on it. Also, by this, FSR correlates best with PMR (which I don’t know much about, unfortunately).
[42 & 43] True, there could be a confirmation bias there. But as I noted, this article seems old enough that it predates UZR being readily available. Also, I don’t know a lot about Tom Tango’s FSR (remember, this is *not* the same as the Fans’ Projections on FanGraphs), so I don’t know how large a sample of people there is, and/or if it is a true representation of the “average baseball fan”. It’s possible there is a lot of bias (Adam Everett has a good UZR, therefore Adam Everett is a good fielder), or there could be little/no bias (Everett is a great fielder, whatever UZR may say! What, it agrees with me?). It’s the closest *I* have to comparing what UZR sees vs what “people” see.
[45] Building on J, I suppose Red Sox coaches could teach their LF to “play it safe”. That is, if it looks like a ball is going to go off the wall, play for the carom, hold to a single (or double). Which would reduce their plays-made, but visiting LF go “all out”, resulting in a few more outs, but more extra bases on balls they can’t get to. I don’t believe it is true, especially as J points out the CF and SS can help with the caroms. But sure, it’s possible.
It would be interesting to compute the UZR of a CFer as if he was a SS, just to see what number you would get. I’m not sure how UZR behaves if you are barely/significantly/massively out of position.
Not sure what you mean by this. “Massively out of position” could have two meanings - David Ortiz playing CF in Detroit is, “massively out of position”. ARod with a righty batter (no shift) playing directly behind the pitcher is also “massively out of position”. So…?
[48] You are probably right. If I am right then there is probably a systematic problem with UZR, one that doesn’t really show up except in Fenway. Funnily enough, the only ways I’ve been able to bias UZR (based on the BBTF articles) in my head have favored the Fenway LFer and punished everyone else.
Massively out of position means pretending that the guy standing in CF is a LF, so you compare his outs in each zone to the LFers. He will have lots of outs in zones that LFers hardly ever get to, but very few outs in normal places. (It’s a way to see how the formula behaves abstractly, it doesn’t mean anything directly.)
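To make the thought experiment concrete, here is a toy sketch of a zone-based "plays above average" tally. It is not the real UZR formula (which also adjusts for batted ball speed, handedness and so on), and every zone name, out rate and count below is made up purely for illustration. The idea is just to score the same set of outs against two different positional baselines.

```python
# Toy zone-based "plays above average" tally. NOT the real UZR formula.
# All zone names, out rates, and counts are hypothetical illustrative numbers.

def plays_above_average(chances, outs, baseline_out_rate):
    """Sum over zones of (actual outs - outs expected from the baseline position)."""
    total = 0.0
    for zone, n in chances.items():
        expected = n * baseline_out_rate.get(zone, 0.0)
        total += outs.get(zone, 0) - expected
    return total

# Hypothetical league-average out rates by zone for each position.
LF_RATES = {"left_line": 0.60, "left_gap": 0.30, "center": 0.02}
CF_RATES = {"left_line": 0.01, "left_gap": 0.45, "center": 0.90}

# Hypothetical balls in play per zone, and the outs our center fielder made.
chances = {"left_line": 50, "left_gap": 60, "center": 100}
cf_outs = {"left_line": 0, "left_gap": 28, "center": 91}

as_cf = plays_above_average(chances, cf_outs, CF_RATES)  # scored at his real position
as_lf = plays_above_average(chances, cf_outs, LF_RATES)  # scored as if he were a LF

print(f"scored as CF: {as_cf:+.1f} plays, scored as LF: {as_lf:+.1f} plays")
```

With these invented numbers, the CF scored as a LF loses a lot of plays down the line but piles up a huge surplus in center, so the total says more about the mismatched baseline than about the fielder.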
SG, have I told you how much I admire your work? This is truly exemplary stuff, far superior to those other metrics like the Yankees’ batting order nonsense. I’ll have more, please.
Did we lose that chance with Ellsbury moving over there?
Are you suggesting Ells isn’t solid in left? I think this is an open, fair question, but to argue in his favor, he has 51 starts in left, and I believe his LF stats are better than his CF ones.
[54] Well, all I’ve read is that his defense was suspect in CF by metrics and visibly. I guess what I’m saying is, I’d like to see the UZR of a full time Fenway LF who has established himself as a solid OF (LF or CF I guess) someplace else. Cameron fits that description, but Ellsbury doesn’t.
Although to fan the flame, had Cameron showed up in Boston to play LF, and he sucked, we’d probably be left wondering if it’s because of the age or because of Fenway. Drat.
[36] Thanks, Mike.
SG- I’m curious - what software do you use to do your analyses?
Although to fan the flame, had Cameron showed up in Boston to play LF, and he sucked, we’d probably be left wondering if it’s because of the age or because of Fenway. Drat.
But when he now falls off a cliff in CF will it be because of age or that weird triangle area in Fenway?
There is no offseason for Kevin Long.
[59] Are we talking metaphorical cliff or like Wile E. Coyote/Road Runner actual cliff?
I’m less interested in how Cameron performs in the future defensively and more interested on what a capable OFer looks like when he plays two full seasons in LF in Fenway.
[61] Metaphorical, although I wouldn’t mind Youkilis or Hemp Necklace falling off an actual cliff. In fact, I just ordered an Acme brand fake tunnel to put in front of said cliff just before they run by.
The fake tunnel is to get them to run into a wall. For the cliff, you’ll need to sell them on some kind of hare-brained flying scheme.
[63] Maybe give them a pair of wings and put an umpire making calls against them on the other side of a canyon?
Can’t read about the Wile E. Coyote cartoons without thinking about Seth MacFarlane’s version.
A poem about WEC for fans of Greek literature.
SG- I’m curious - what software do you use to do your analyses?
90% of it is just done in Excel, but I do have some MySQL databases for stuff like play by play data and Pitch FX.
Anyone else catch this gem from Papelbon re: his contract reaching $10M?:
“Heck yeah, as far as what me and my brain are thinking, but I haven’t even sat down with my agents yet.”
Translation: “I’d love to finish talking about this conversation, but I have to go pee.”
That’s the same interview where he said that he learned JUST THEN, through THAT VERY INTERVIEW, (which was, what, two days ago?), that the Red Sox had signed Mike Cameron and Adrian Beltre and traded Casey Kotchman.
Wow.
I happened to notice a link to “Elect Moose” under the “Yankee Blogs” section. I saw the most extraordinary thing; the blogger has managed to communicate back in time! Would have to be, b/c I can’t understand how any person that has followed baseball in any year would have the arguments against Mussina going into the HOF, that this guy has to respond to.
So netting out park factors, the Yanks and Sox are about even on RA.
And the Sox defense projects to save 30 more runs than the Yanks’ defense (which is average).
So this means that the Yanks pitching is projected to be 30 runs better than the Sox, which I find hard to believe. I’d love to see the Sox pitching numbers.












