Wednesday, October 7, 2009
How Good Is the 2009 ALDS Version of the Yankees on a Spreadsheet?
Since this is a Yankee site, it's probably time to run the Yankees through the same numbers that I ran the Tigers and Twins through. The idea here is that using just 2009 data and stats for the entire team doesn't really do a good job of telling how good a team as currently constituted really is. Does it matter that Cody Ransom, Angel Berroa, Chien-Ming Wang and Anthony Claggett combined to be 51 runs worse than replacement level when trying to assess how good the Yankees are right now? Obviously not, just like it doesn't matter how awful John Smoltz and Brad Penny were when assessing how good the Red Sox are. The same holds for every other team in the postseason. This is why using Pythagorean record or actual winning percentage doesn't really give us that much useful information.
So let's look at the Yankees' postseason roster and their projections and try to figure out how good they really are right now. First up, the position players.
| Lineup | Pos | PA | AVG | OBP | SLG | pwOBA | 09wOBA | Diff | BR | Outs | dRS |
| derek jeter | ss | 23 | .314 | .382 | .440 | .364 | .384 | .020 | 3 | 14 | -3 |
| johnny damon | lf | 22 | .282 | .357 | .453 | .354 | .369 | .015 | 3 | 14 | 0 |
| mark teixeira | 1b | 22 | .292 | .386 | .542 | .397 | .402 | .005 | 4 | 14 | 4 |
| alex rodriguez | 3b | 21 | .293 | .399 | .546 | .405 | .402 | -.003 | 4 | 13 | -4 |
| hideki matsui | dh | 21 | .279 | .363 | .474 | .363 | .375 | .012 | 3 | 13 | 0 |
| jorge posada | c | 16 | .288 | .378 | .485 | .376 | .377 | .002 | 2 | 10 | -6 |
| robinson cano | 2b | 19 | .305 | .339 | .485 | .354 | .372 | .018 | 3 | 13 | -1 |
| nick swisher | rf | 19 | .243 | .359 | .461 | .357 | .373 | .016 | 3 | 12 | 2 |
| melky cabrera | cf | 10 | .270 | .328 | .394 | .318 | .328 | .010 | 1 | 7 | 0 |
| brett gardner | cf | 10 | .257 | .338 | .342 | .310 | .318 | .008 | 1 | 7 | 13 |
| eric hinske | rf | 4 | .238 | .330 | .449 | .339 | .347 | .008 | 1 | 3 | -1 |
| jose molina | c | 6 | .226 | .270 | .319 | .261 | .259 | -.001 | 0 | 4 | 1 |
| jerry hairston | ss | 2 | .266 | .317 | .404 | .314 | .302 | -.012 | 0 | 1 | -3 |
| ramiro pena | ss | 1 | .240 | .287 | .304 | .267 | .306 | .039 | 0 | 1 | 5 |
| total | | 196 | .285 | .366 | .476 | | | | 29 | 125 | 0 |
PA: Estimated plate appearances for the series, assuming it goes the distance.
pwOBA: projected weighted on-base average, a rate version of linear weights
09wOBA: 2009 actual wOBA
Diff: 09wOBA minus pwOBA. As a rough rule of thumb, a difference of .010 in wOBA is worth about five runs over 600 PAs.
BR: Estimated batting runs for the series using linear weights for estimated PA
Outs: Estimated outs for the series based on revised projection and estimated PA
dRS: Defensive projection over 150 games using an average of zone rating and UZR for non-catchers, and for 120 games using a system similar to the one described here for catchers.
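The rule of thumb under Diff is easy to sanity-check. Here's a minimal sketch that assumes the commonly cited wOBA scale of roughly 1.15 to turn a wOBA gap into runs; that scale isn't stated anywhere in this post, and the function name is just for illustration.

```python
# Assumed wOBA scale (~1.15). The post does not say which value it uses.
WOBA_SCALE = 1.15

def woba_diff_to_runs(woba_diff, pa, scale=WOBA_SCALE):
    """Convert a wOBA difference into runs over a given number of PAs."""
    return woba_diff / scale * pa

# Rule of thumb from the glossary: a .010 gap over 600 PAs.
print(round(woba_diff_to_runs(0.010, 600), 1))

# Jeter's 2009 wOBA was .020 above his projection; spread over his
# 23 estimated series PAs, that gap is worth well under one run.
print(round(woba_diff_to_runs(0.020, 23), 1))
```

Under that assumption the first call comes out a little over five runs, in line with the rule of thumb, and the second shows why the 2009-versus-projection gap barely moves the needle over five games.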
As you can see from these numbers, just about every Yankee player had better numbers in '09 than they'd be projected to have going forward. That may seem harsh, but it doesn't change the fact that this is a very good team, probably the best team in the postseason right now.
Derek Jeter had an outstanding 2009. Unfortunately, he didn't have an outstanding 2008 and we can't ignore that. Still, he's probably the best leadoff hitter on any of the playoff teams, although you can possibly make a case for Chone Figgins if you squint a little. Jeter even played passable defense this year, and at this point the defensive metrics project him to be just a touch below average. Of course, these metrics were infallible when they showed Jeter at -20, but now that they show him as decent we need to ensure we are aware of their limitations.
Johnny Damon also had a very good 2009, although he hit .215/.319/.278 over his last 92 PAs. A free agent in 2010, Damon's probably auditioning for his next contract, and a good postseason probably earns him a few extra million. His glove and arm in LF have left something to be desired this year, although his projected range is about average. The arm though...
Mark Teixeira had a great year, although he's not really a worthy MVP candidate. He's a switch-hitter who has generally had a higher OBP versus lefties and a little more power against righties (.282/.373/.579 career vs. righties and .305/.400/.511 vs. lefties). Despite what 2009 UZR says, Teixeira's glove projects as a slight positive once you factor in past performance.
Until he actually does well in the postseason, the spotlight is going to be on Alex Rodriguez. We know he has the ability to carry the team if he gets hot, and we also know we're going to keep reading about it if he doesn't do well.
Hideki Matsui may be nearing the end of his Yankee career. If so, he had a fine last season and could really punctuate it with a good postseason. Matsui has hit .302/.372/.506 in five postseasons with the Yanks, although he's had a couple of clunkers in there as well.
I guess this is where I'm supposed to flip out about Jose Molina being penciled in to catch A.J. Burnett since it will take Jorge Posada's bat out of the lineup. The thing is, I can't seem to really get that worked up about it. I wouldn't try to deny that there's something to a pitcher-catcher relationship that we can't quantify, and I'd also imagine that Molina would only bat 2-3 times in a game anyway. If you end up with Molina batting in a high-leverage situation after the fifth inning, is there any doubt that he'll be pinch-hit for? Anyway, this paragraph is supposed to be about Jorge Posada. I've adjusted the playing time in the table to assume Molina starts twice. Posada's generally had poor postseasons in his career, but I'm sure fatigue was an issue. He got plenty of rest this year, so despite his advanced age I think he's primed for a good postseason. Maybe the Molina thing will give him an extra kick in the ass too...
Robinson Cano rebounded from a dismal 2008 to have a nice offensive year, although his performance with runners on base was pretty bad. Whether that's due to a change in approach or just due to the vagaries of a selected set of PAs, we don't know. Cano seemed to have a good defensive year, so I'm having trouble reconciling the fact that zone rating and UZR saw him as a touch below average.
Nick Swisher and Melky Cabrera round out the probable starters in RF and CF respectively, then you have the bench of Brett Gardner, Eric Hinske, the aforementioned Molina, Jerry Hairston and probably Freddy Guzman. I gave Gardner 10 PAs, but he probably won't start if Girardi wants to keep him as a tactical option for baserunning and as a defensive replacement. No matter how you allocate the playing time between Gardner and Melky, the difference is probably negligible over five games.
The Yankee offense is probably the best one in the postseason once you adjust for park and league. The defense looks around average too, which is very unusual for the Yankees.
So, the pitching...
| Pitcher | Role | IP | H | HR | BB | K | pRA | pERA | pFIP | 09ERA | 09FIP | sIP | sR |
| cc sabathia | SP1 | 229 | 213 | 20 | 63 | 198 | 3.84 | 3.44 | 3.42 | 3.37 | 3.38 | 12 | 5.1 |
| a.j. burnett | SP2 | 197 | 186 | 22 | 71 | 185 | 4.42 | 4.06 | 3.84 | 4.04 | 4.29 | 12 | 5.9 |
| andy pettitte | SP3 | 205 | 219 | 21 | 68 | 149 | 4.82 | 4.35 | 4.05 | 4.16 | 4.19 | 6 | 3.2 |
| mariano rivera | CL | 71 | 55 | 5 | 12 | 69 | 2.32 | 2.18 | 2.72 | 1.76 | 2.94 | 3 | 0.8 |
| phil hughes | SU | 99 | 95 | 10 | 33 | 86 | 4.34 | 4.08 | 3.70 | 3.03 | 3.15 | 3 | 1.4 |
| david robertson | SU | 68 | 60 | 5 | 22 | 69 | 3.70 | 3.43 | 3.05 | 3.30 | 3.09 | 3 | 1.2 |
| alfredo aceves | MR | 140 | 120 | 13 | 22 | 98 | 3.91 | 3.63 | 3.46 | 3.54 | 3.68 | 2 | 0.9 |
| joba chamberlain | MR | 131 | 114 | 10 | 54 | 146 | 3.52 | 3.18 | 3.16 | 4.75 | 4.69 | 2 | 0.8 |
| damaso marte | MR | 43 | 39 | 4 | 19 | 41 | 5.02 | 4.68 | 3.97 | 9.47 | 5.53 | 1 | 0.6 |
| phil coke | MR | 61 | 63 | 6 | 15 | 46 | 4.73 | 4.41 | 3.62 | 4.50 | 4.73 | 1 | 0.5 |
| chad gaudin | MR | 102 | 104 | 13 | 42 | 79 | 4.82 | 4.45 | 4.54 | 3.43 | 5.18 | 0 | 0.0 |
| mark melancon | LR | 62 | 62 | 8 | 8 | 37 | 5.11 | 4.70 | 3.95 | 3.87 | 3.81 | 0 | 0.0 |
| Total | | 45 | 42 | 4 | 14 | 40 | 4.08 | 3.73 | 3.57 | | | | 20.4 |
The assumptions here are that C.C. and A.J. get two starts each, and that Joba Chamberlain is in the bullpen.
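The table doesn't define sIP and sR, but the numbers are consistent with sIP being the assumed innings for the series and sR being projected runs allowed in those innings, i.e. pRA × sIP / 9. Here's a quick sketch under that reading, using the pRA and sIP values from the table; the function name is made up for illustration.

```python
# (pitcher, pRA, sIP) copied from the table above.
staff = [
    ("cc sabathia", 3.84, 12),
    ("a.j. burnett", 4.42, 12),
    ("andy pettitte", 4.82, 6),
    ("mariano rivera", 2.32, 3),
    ("phil hughes", 4.34, 3),
    ("david robertson", 3.70, 3),
    ("alfredo aceves", 3.91, 2),
    ("joba chamberlain", 3.52, 2),
    ("damaso marte", 5.02, 1),
    ("phil coke", 4.73, 1),
    ("chad gaudin", 4.82, 0),
    ("mark melancon", 5.11, 0),
]

def series_runs(pra, innings):
    """Projected runs allowed: runs per nine innings scaled to the innings pitched."""
    return pra * innings / 9.0

total_ip = sum(ip for _, _, ip in staff)
total_runs = sum(series_runs(pra, ip) for _, pra, ip in staff)

# Assumed reading of the Total row: series innings and series runs allowed.
print(total_ip, round(total_runs, 1))
```

This reproduces the 45 innings and 20.4 runs in the Total row, and 20.4 runs over 45 innings works out to the 4.08 staff pRA shown there.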
The only starting pitcher in the postseason with a better projection than Sabathia is Chris Carpenter, although you can probably make a case that Jon Lester's projection contains data that isn't very relevant to him anymore from when he was fighting cancer. Critics point to the fact that Sabathia has had a bad postseason track record (in a whopping five starts), but I generally weigh 288 starts of good to great quality more than five starts. That's just me though.
A.J. Burnett is the wild card for the Yankees. He could pitch a gem or he could be torched in any start against any opponent. Let's hope for gems.
Andy Pettitte had a very solid season in 2009 and slots in comfortably at #3. I thought that pitching him in game 3 on turf was a bad idea, but looking at his three-year splits he's actually been better on turf.
As for the pen, you've got Mo, who continues to excel even though he's lost a tick off his velocity. Phil Hughes was the most effective setup reliever in the American League. David Robertson brings the stuff to get a big strikeout when needed, and Alfredo Aceves can come in and give you 3-4 innings if you need it. I have no idea what Joba will do, but we all know about his ability. Note that Hughes and Chamberlain's projections are as relievers. Phil Coke and Damaso Marte give the Yankees a couple of lefties with decent stuff to match up with, which will be important against Mauer, Kubel and Span. Neither is without flaws though. I threw Chad Gaudin's projection up but didn't give him any innings. He could be very useful out of the pen, especially if spotted against righties.
The Yankee staff has the second best projected strikeout rate of any of the teams in the postseason, just a hair behind Boston (7.96 to 8.05). They have the third best projected walk rate and third best projected HR rate (just look at the Twins post for all the rankings).
So what does all this tell us?
| #games | 5 |
| home games | 3 |
| #outs | 125 |
| offense | 28.6 |
| pitching | 20.4 |
| defense | 0.0 |
| wpct | .660 |
| 162 gm equiv | 107-55 |
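The post doesn't spell out how the offense, pitching and defense rows turn into a winning percentage. One common way to do it is a Pythagorean estimate, so here's a sketch assuming the classic exponent of 2 and no home-field adjustment. It lands close to the table but is not necessarily the formula used here.

```python
def pythag_wpct(runs_scored, runs_allowed, exponent=2.0):
    """Pythagorean winning percentage; the exponent of 2 is an assumption."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)

offense = 28.6   # projected runs scored over the series (from the table)
pitching = 20.4  # projected runs allowed over the series (from the table)
defense = 0.0    # projected runs saved; subtracting it is an assumed sign convention

wpct = pythag_wpct(offense, pitching - defense)
wins = round(wpct * 162)
print(f"{wpct:.3f}", f"{wins}-{162 - wins}")
```

With these inputs the estimate comes out around .663, within a few points of the .660 in the table, and the 162-game equivalent rounds to the same 107-55.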
Standard disclaimer about the inherent limitations of projections goes here. Player talent can change in ways that objective projection systems won't pick up on, so nothing here is absolute.
The ALDS version of the Yankees looks like a team that would win close to 2/3 of their games. So running the Twins vs. Yankees 10,000 times on my Monte Carlo playoff simulator, I get these ALDS odds:
Yankees: 79.5%
Twins: 20.5%
The Twins can beat the Yankees, and they might. But the odds are pretty long against it.
Update: Yankees.com has the ALDS roster up.
Comments
I’m hoping for at least a decent ARod postseason, even if they lose. I just can’t read that shit anymore.. it’s too annoying.
[1] If A-Rod does well today…“It’s only Game 1 of the ALDS. Let’s see him do it in Game 7 of the World Series.”
How is the Monte Carlo simulation set up? Is that 10,000 runs of a 5 game series to see which team wins 3 or more?
107 win team. Holy shnikes.
[2] If the Yankees sweep every game in the postseason on the back of outrageous ARod performances: “Let’s see him do it in Game 7 of the World Series.”
[4] Haha, exactly. “They were already cruising through the playoffs so there was no pressure. That’s when A-Rod performs best.”
Eleven pitchers seems strange given that they picked the extra off day. MLB.com also says that Pavano gets the start in game 3, which seems stranger given that Baker would be on regular rest and is probably their best starting pitcher.
imagine that Molina would only bat 2-3 times in a game anyway
I’ll repeat it, but I’m hoping he’ll bat 6 or so times per game.
11 pitchers seems prelude to over managing the bullpen in the 7th inning, or worrying that Burnett will burn down the barn in Game 2.
[6 & 8] I mentioned it on the other thread, but the pre-roster thinking was Twins have several lefties so two are desirable. I think it is possible that Marte, Coke, or both right now are looked on mainly as LOOGY’s. So if you are only counting on them for 1 batter each sure I could see a 6th inning or something of Coke, Gaudin, Marte, followed by Robertson in the 7th. Seems like one too many, but going with the extra lefty or the extra PR…if that is the reason Yanks win or lose the series they have themselves to blame anyway.
[7] - Hoping for the sac-bunt and grand slam in the same inning again?
Cano seemed to have a good defensive year, so I’m having trouble reconciling the fact that zone rating and UZR saw him as a touch below average.
SG, did you track ZR data weekly (or more often) like you did last year? If so, can you see anything that jumps out as a reason? For example, I think last year Cano was basically average for the year, EXCEPT for a two-week period where he lost like 6-8 runs in value because he missed everything. Wondering if this year was similar; he’s generally been average or better like our eyes tell us, but if there were any times where a few bad games cost him a lot of value.
FWIW, FanGraphs the other day someone did a (very basic) analysis of UZR, and it comes out roughly as reliable as wOBA in year-to-year correlation. Most of the arguments on the site were about the stats used (like should have used r instead of r^2) which I don’t completely understand. But Tango posted and basically told everyone math was simple but essentially right. Some guy argued even if the numbers were correct that they were equally reliable, wOBA’s misses were normal fluctuations, while UZR’s misses were a problem with UZR. I found that interesting…
Isn’t UZR also based on league average? Could the defensive skill of the AL 2B as a group dramatically risen this year?
I find I’m so excited, I can barely sit still or hold a thought in my head. I think it’s the excitement only a free man can feel, a free man at the start of a long journey whose conclusion is uncertain. I hope I can make it across the border. I hope to see my friend, and shake his hand. I hope the Pacific is as blue as it has been in my dreams. I hope.
I’m actually not thrilled that Bruney was kept off the roster. I know he’s been terrible for most of the second half, but his last 3 outings I saw definite flashes of the stuff he had at the beginning of the season when he was lights out.
If Marte gets into any high lev situations, he better deliver.
I can see wanting two lefties for the Twins. I think I even mentioned it in an earlier thread. I don’t see taking two lefties and also taking Chamberlain and Gaudin when you’re only going to play on consecutive days once in a week.
I’m not all that happy with the 3 catcher roster. Especially for a short series. Would have liked to see Bruney as well.
[13] with the Pacific part, I’m assuming Shaw-whatever Redemption.
[14] We go with what we got, no looking back now.
How is the Monte Carlo simulation set up? Is that 10,000 runs of a 5 game series to see which team wins 3 or more?
It’s actually set up to run the whole postseason. You plug in the eight teams as seeded and an estimated winning percentage for each team, then add in a random number to that winning percentage to account for the volatility of how a team may play over five games; I’m using -.040 to +.040 for that. So in some simulations the Yankees may look like a .700 team and in others they may look like a .620 team. It runs the playoffs as many times as you tell it and it records the results. It only simulates games 4, 5, 6 and 7 (where applicable) if needed.
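For anyone who wants to play along at home, here is a rough sketch of a single series run that way. It is not the actual simulator: the uniform draw for the ±.040 noise, the log5 formula for turning two winning percentages into a single-game probability, ignoring home field, and all of the function names are assumptions made for illustration.

```python
import random

def log5(p_a, p_b):
    """Chance team A beats team B, given each team's overall winning percentage."""
    return p_a * (1 - p_b) / (p_a * (1 - p_b) + p_b * (1 - p_a))

def simulate_series(p_a, p_b, best_of=5, noise=0.040, rng=random):
    """Return True if team A wins one simulated series."""
    # Nudge each team's winning percentage once per series (assumed uniform noise).
    a = p_a + rng.uniform(-noise, noise)
    b = p_b + rng.uniform(-noise, noise)
    p_game = log5(a, b)
    needed = best_of // 2 + 1
    wins_a = wins_b = 0
    # Stop as soon as one side clinches, so later games are only played if needed.
    while wins_a < needed and wins_b < needed:
        if rng.random() < p_game:
            wins_a += 1
        else:
            wins_b += 1
    return wins_a == needed

def series_odds(p_a, p_b, trials=10_000, seed=None, **kwargs):
    """Fraction of simulated series won by team A."""
    rng = random.Random(seed)
    wins = sum(simulate_series(p_a, p_b, rng=rng, **kwargs) for _ in range(trials))
    return wins / trials

# .660 is the Yankees estimate from the post; .500 is a placeholder opponent,
# not the Twins figure actually used.
print(series_odds(0.660, 0.500, seed=1))
```

Running the whole postseason would just mean chaining these series through the bracket and counting how often each team comes out the other end.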
SG, what does it say the Yankees percentage to win the WS is? I’d guess somewhere around 25%.
Ah, much more elegant than my situation. I was coding in my head how I would do something like that last night trying to fall asleep, but 8 years since my last CS job have left my recursive skills a little rusty.
So when it does the whole playoffs, does it take into account rotations and schedules? That would make it fairly complicated, and I’m not sure how much better of an answer you would get.
Anyways, thanks again as always. Just fantastic stuff. Keeps my head straight through all these chemicals.
In other news, apparently Don is a headline writer for the Associated Press:
Stocks Fall After 2-Day Rally As Earnings Loom
SG: Great stuff. I know it’s a Yankees site and all, but are you planning to run the numbers on the Angels-RS series? I imagine that one is much closer to a 50-50 proposition.
Also, this comment: Interesting how in a 5-game series, a team’s top three starters and top 2 relievers can account for 80% of the innings pitched.
[17] Yeah, definitely Shawshank. Excellent movie, but I haven’t convinced myself to watch it a second time yet…
SG, what does it say the Yankees percentage to win the WS is? I’d guess somewhere around 25%.
The Spreadsheet East River Series you mean?
Of course you don’t have to flip out about the Molina thing, SG. I did it for you.
“The Wildcard” better pitch well. Very well.
The more I think on the Molina business, the more I come around to SG’s “meh” take. You need to do everything possible to maximize your starters’ chances of pitching well and deep into the game.
I don’t at all look forward to: Coke-faces-Mauer/Gaudin-faces-Cuddyer/Marte-faces-Kubel.
Man, my productivity is going to be at an all-time low since I’ve started this job. I wanted to get some money down on the Yanks, but I don’t think I have time to get to the sports book before first pitch. They’re probably going to be like -300 anyway.
The thing about the last few roster spots is…well it hardly matters. The way the Yankees are constructed, plus the extra days off, leads me to believe we’re unlikely to need #24 and #25.
11 pitchers seems excessive, but carrying Guzman also seems excessive when Gardner and Hairston are available to pinch run.
Carrying Cervelli doesn’t offend me. It’s plausible that you could see a pinchrunner for Jorge in a tie game in the 8th. Then in the 10th, Molina’s spot is up with the winning run in scoring position. Having Cervelli allows you to put Hinske at the plate in that situation.
Does Gardner or Melky start?
[29] Word on the street is that Melky is starting.
You guys might find this amusing. I’m in job-hunting mode, and a friend of mine at a large company told me to put an “interests” section in my resume for that company. So in the phone interview I had with them to set up an on-site visit, after talking about my qualifications, the HR person said “Now, I just have one more question about your resume. I actually had to look this up on Wikipedia, but could you tell me about your interest in ‘sabremetrics’?” We then spent 5 minutes discussing the playoffs. She’s a Phillies fan and got to go to “both Game 5’s” last year.
[29 & 30] Yeah, it sounds like Melky is starting. Especially tonight. I’m sure as the playoffs go on Girardi will be receptive to changing that. Probably not as fast as some think he should, but…I doubt if Melky goes 0-12 in the first round and Gardner goes 3-4 with a BB, that Melky would be starting Game 1 of the ALCS.
Lohud quotes Gardner as saying he doesn’t expect to start in the PS.
“I hope the Pacific is as blue as it has been in my dreams.”
A great quote. Personally, I feel like I’m about to start a race or something. And I’m just sitting on my ass watching!
SG, did you track ZR data weekly (or more often) like you did last year?
[11] I did, but I haven’t pulled it all together yet. Going off memory, I think Cano’s defense was fairly consistent statistically all year.
Hope you get the job, DaPuj. And soon - at least before the Yankees beat the Phillies in the WS in a few weeks.
[35] Thanks. I’m sure it will appear in a post-season wrapup on the players.
but could you tell me about your interest in ‘sabremetrics’?
Also can you tell me how you get such good phone reception in your mom’s basement?
=P
Amusing predictions at SI:
http://sportsillustrated.cnn.com/2009/baseball/mlb/10/07/al.experts.picks.divisional/index.html
100% for the Yankees, 7-5 in favor of the Sox. Though at least two of them (including Heyman) cite history as a factor. I for one am very excited about the prospect of history throwing out some Angel basestealers. That really should put us over the top.
I think we’re forgetting the most important thing: it’s Rocktober.
On the Twins ownership mentioned in the previous post, their owner had I believe two sons (don’t know their names, so let’s call them Cain and Abel). The rep on them is they are also cheap (they are looking to turn a profit and deposit checks from the Yankees into their personal bank accounts). However, they supposedly will be selling the team in the near future. This info was conveyed to me by a Twins fan, but I have also had Twins fans tell me Jeter makes more than the entire Twins roster each year.
As to why they (Twins fans) can stand a cheapskate owner, it allows them a sense of smugness that they can win as the underdog who overcomes all odds. Also, they don’t realize how good Terry Ryan was at his job and that all the new guy has done is unload Johan for not much of a return and trade Garza/Bartlett for Delmon.
The announcer feed audio on TBS is kind of muffled. I’m not complaining.
if the winds are as crazy as reported, wouldn’t gardner be a better start in terms of defense?
@43. are you forgetting DAVE ROBERTS!!!111 We need GARDONER OFF DA bench!@!
So, think Jimenez is a little jacked up for this game? He’s hitting 101 according to gameday. I know he throws hard, but damn.
100% for the Yankees, 7-5 in favor of the Sox. Though at least two of them (including Heyman) cite history as a factor. I for one am very excited about the prospect of history throwing out some Angel basestealers. That really should put us over the top.
I’ve got the Sox around 59% so 7-5 makes sense although I still have to finalize the rotations and depth charts. Not using ‘history’, just using the teams as they currently look.
More awful umpiring calls in Philly. How do some of these guys still have a job, let alone postseason games?
Neyer linked to this about how Jeter improved his step to his left. I don’t think anyone else has linked to this here.
http://www.northjersey.com/sports/OConnor_How_Jeter_got_younger_at_age_35.html
that’s a great piece about jeter jay. thanks.
[48] Best quote from the article:
But this year, even the sabermetric scholars forever ranking Jeter among baseball’s weakest shortstops had to admit he was beating them at their own game.
As to why they (Twins fans) can stand a cheapskate owner, it allows them a sense of smugness that they can win as the underdog who overcomes all odds.
The other thing to remember about the Twins late cheapskate owner is that he saved the franchise for the Twin Cities. Sure he was perfectly willing to go along with contraction for a hefty buyout a few years later, but people’s memories are selective as well as short.
Also, let’s hope that Cain Pohlad doesn’t slay Abel Pohlad while they’re negotiating the sale of the team.
I know it’s a Yankees site and all, but are you planning to run the numbers on the Angels-RS series? I imagine that one is much closer to a 50-50 proposition.
BD, I missed this post before, but hopefully you got what you were asking for.