The Curse of Jerry Hairston, Jr./Eric Hinske:
 

Friday, April 10, 2009

Revised CAIRO Playoff Odds through games of April 9

While tooling around the internets, I found a cool Monte Carlo simulator spreadsheet for the baseball season at a site called xlsSports. I've modified it to import the current standings and then run the season going forward, and I've set it up to use a weighted average of YTD results and 2009 projections to figure out the strength of the teams. I've also swapped the basic Pythagorean theorem formula it uses for the more accurate PythagenPat formula. Both of those formulas use a team's runs scored and runs allowed to determine the strength of the team and calculate its winning percentage going forward.
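For anyone curious how the two formulas differ, here is a minimal Python sketch. The constants are assumptions on my part: 2 is the classic Pythagorean exponent and 0.287 is a commonly cited PythagenPat constant, which may not be exactly what the spreadsheet uses. The function names are made up for illustration.

```python
def pythagorean_wpct(runs_for, runs_against, exponent=2.0):
    """Basic Pythagorean expectation with a fixed exponent."""
    rf = runs_for ** exponent
    ra = runs_against ** exponent
    return rf / (rf + ra)


def pythagenpat_wpct(runs_for, runs_against, games, pat=0.287):
    """PythagenPat: the exponent floats with the run environment.

    exponent = (total runs per game) ** pat
    The 0.287 default is an assumed, commonly cited value.
    """
    exponent = ((runs_for + runs_against) / games) ** pat
    return pythagorean_wpct(runs_for, runs_against, exponent)


if __name__ == "__main__":
    # Tampa Bay's RF/RA from the CAIRO table in this post, over 162 games.
    rf, ra, games = 804, 695, 162
    print("Pythagorean :", round(pythagorean_wpct(rf, ra), 3))
    print("PythagenPat :", round(pythagenpat_wpct(rf, ra, games), 3))
```

The two usually land close together; the gap tends to grow in unusually high or low scoring environments, since only PythagenPat adjusts its exponent for runs per game.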

Anyway, what this will let me do is run updated playoff odds for the six projection systems I used in the Diamond Mind Projection Blowout, as well as with the combined projections whenever I feel like it. I'll create a page where I will keep these updated, but for now here's a sneak peek at the CAIRO version, run 10,000 times.

System cairo
Div Team W L RF RA Div% WC% PO% Max Min
ALE TAM 94.2 67.8 804 695 38.9% 29.1% 68.0% 121 67
ALE NYA 93.7 68.3 867 724 35.1% 29.8% 65.0% 122 62
ALE BOS 92.1 69.9 843 739 25.4% 28.6% 54.0% 118 67
ALE TOR 77.4 84.6 690 717 0.5% 1.6% 2.0% 105 50
ALE BAL 73.5 88.5 801 870 0.1% 0.4% 0.5% 100 48
Div Team W L RF RA Div% WC% PO% Max Min
ALC CLE 84.4 77.6 810 808 38.6% 1.3% 39.9% 112 53
ALC DET 84.2 77.8 774 764 37.4% 1.2% 38.6% 110 56
ALC MIN 79.2 82.8 718 748 13.7% 0.6% 14.3% 106 52
ALC KC 76.2 85.8 717 835 6.5% 0.3% 6.8% 103 50
ALC CHA 74.1 87.9 739 782 3.7% 0.2% 3.8% 102 46
Div Team W L RF RA Div% WC% PO% Max Min
ALW LAA 86.2 75.8 768 729 41.9% 2.5% 44.4% 112 57
ALW OAK 85.4 76.6 767 752 35.7% 2.6% 38.3% 110 58
ALW SEA 81.7 80.3 721 728 17.6% 1.6% 19.2% 114 52
ALW TEX 76.2 85.8 820 881 4.7% 0.4% 5.1% 103 47
Div Team W L RF RA Div% WC% PO% Max Min
NLE NYN 91.5 70.5 842 778 42.4% 16.3% 58.7% 118 62
NLE ATL 91.3 70.7 800 730 40.7% 16.3% 57.1% 119 63
NLE PHI 86.0 76.0 834 798 13.9% 10.4% 24.3% 113 54
NLE FLA 79.3 82.7 777 836 2.6% 2.4% 5.0% 105 53
NLE PIT 74.2 87.8 799 903 0.3% 0.5% 0.8% 102 50
Div Team W L RF RA Div% WC% PO% Max Min
NLC CHN 96.3 65.7 845 730 77.2% 7.7% 85.0% 124 69
NLC STL 86.7 75.3 797 745 13.8% 14.6% 28.4% 113 58
NLC MIL 82.9 79.1 780 784 5.6% 7.0% 12.6% 112 55
NLC CIN 80.9 81.1 738 781 3.1% 4.0% 7.2% 111 53
NLC HOU 72.8 89.2 740 829 0.2% 0.3% 0.5% 100 47
NLC WAS 70.8 91.2 763 885 0.1% 0.1% 0.2% 97 42
Div Team W L RF RA Div% WC% PO% Max Min
NLW LAN 90.7 71.3 818 761 56.1% 5.9% 62.0% 118 63
NLW SF 84.7 77.3 764 746 18.0% 5.6% 23.6% 112 56
NLW COL 83.2 78.8 841 822 13.1% 4.4% 17.5% 111 58
NLW ARI 82.5 79.5 739 724 11.0% 3.8% 14.8% 111 57
NLW SD 75.8 86.2 729 820 1.8% 0.6% 2.4% 101 49


RF: Runs for
RA: Runs against
Div%: Percentage of times the team won their division
WC%: Percentage of times the team won the wild card
PO%: Playoff % (Div% + WC%)
Max: High win total
Min: Low win total

One note, this is a blatant ripoff of Baseball Prospectus's various Playoff Odds Reports, except that I know what the input data is so I'm more comfortable with it. If anyone sees anything that doesn't look right, let me know.
--Posted at 9:00 am by SG / 31 Comments

Comments


Color me confused.  I thought the preseason projected order of finish for the AL East was NYY, Bos, TB.  Now, after just three games, it’s TB, NY, Bos?

Now, after just three games, it’s TB, NY, Bos?

I think the teams are so evenly matched that games in hand are going to be pretty significant, although maybe I’m overweighing 2009. I’m using the formula of:

(Total 2009 MLB games played to date divided by 2430) times the team’s actual runs scored and runs allowed to date, pro-rated to 162 games, plus ((2430 minus 2009 MLB games played to date) divided by 2430) times the team’s projected runs scored and runs allowed, which gives their revised runs scored/allowed.
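Spelled out as code, the weighting looks roughly like the sketch below. It assumes the two weights sum to 1 (so it is a true weighted average), and the inputs in the example other than the Yankees' 21 runs in 3 games are hypothetical placeholders, not numbers from the spreadsheet.

```python
TOTAL_MLB_GAMES = 2430  # 30 teams * 162 games / 2


def blended_runs(actual_runs, team_games, projected_runs, mlb_games_played,
                 total_games=TOTAL_MLB_GAMES, season_length=162):
    """Blend year-to-date runs (pro-rated to a full season) with the projection.

    The year-to-date weight is the share of the MLB schedule already played.
    Requires team_games > 0.
    """
    weight_actual = mlb_games_played / total_games
    prorated = actual_runs / team_games * season_length
    return weight_actual * prorated + (1.0 - weight_actual) * projected_runs


if __name__ == "__main__":
    # Hypothetical: 21 runs in 3 games, an 865-run projection,
    # and 45 MLB games played league-wide so far.
    print(round(blended_runs(21, 3, 865, 45), 1))
```

This early in the season the year-to-date weight is tiny, so the blended figure barely moves off the projection.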

Tampa taking 2 out of 3 from Boston is probably non-trivial to both teams’ playoff chances in this type of methodology.  Whether that’s true or not, I’m not really sure.

Looking at B Pro’s various odds, they seem to have swung pretty big as well.  They had the Yankees projected to go 99-63 before the season started and now their PECOTA playoff odds have them at 95 wins.

Then again, it just might be too early to run this type of thing, or at least take it too seriously.

Eh, they were all pretty close to start.  I could see TB winning 2 of 3 against BOS and the Yanks losing 2 of 3 to BAL could swap some stuff around.  TB got three tough games (on the road) out of the way, the Yanks lost to theoretically the weakest team in the division.

I could see TB winning 2 of 3 against BOS and the Yanks losing 2 of 3 to BAL could swap some stuff around.  TB got three tough games (on the road) out of the way, the Yanks lost to theoretically the weakest team in the division.

Right.  At this point, log5 would have said Boston and the Yankees should be 2-1, and Tampa Bay should be 1-2.  So you’ve got a 1 game swing on all three teams already.
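For anyone who hasn't run into log5, it is Bill James's formula for the chance that one team beats another given each team's winning percentage. A quick sketch, ignoring home field advantage; the .590 and .450 inputs are illustrative and are not CAIRO's actual numbers.

```python
def log5(wpct_a, wpct_b):
    """Probability that team A beats team B under the log5 formula."""
    return (wpct_a - wpct_a * wpct_b) / (wpct_a + wpct_b - 2 * wpct_a * wpct_b)


def expected_wins(wpct_a, wpct_b, games):
    """Expected wins for team A over a series of the given length."""
    return log5(wpct_a, wpct_b) * games


if __name__ == "__main__":
    # Illustrative talent levels for a strong team and a weak one.
    strong, weak = 0.590, 0.450
    print("P(strong beats weak):", round(log5(strong, weak), 3))
    print("Expected wins in 3  :", round(expected_wins(strong, weak, 3), 2))
```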

i thought we could just extrapolate.  you mean the Yankees might not go 54-108 this year?

I think what’s troubling is that a pre-season projection of, say, 98 wins incorporates the assumption that the team will, at various points in the season, lose two out of three.  I don’t recall anyone saying two weeks ago, “the Yankees are projected to win 98 games, but if they begin the season 1-2 (outscoring their opponent 21-18 over those 3 games), then they are only a 93-win team.”  That makes no intuitive sense.  I sense that we’re mixing apples and oranges here.

you mean the Yankees might not go 54-108 this year?

Apparently not, although 62-100 is still in play.

I don’t recall anyone saying two weeks ago, “the Yankees are projected to win 98 games, but if they begin the season 1-2 (outscoring their opponent 21-18 over those 3 games), then they are only a 93-win team.” That makes no intuitive sense.  I sense that we’re mixing apples and oranges here.

And if the Yankees were projected to win 98 games, I’d agree.  But they weren’t, they were projected to win 96 by CAIRO.  So their going-forward projection, assuming 2009 tells us nothing about their ability, would be 96/162 times the 159 remaining games, or 94-65.  1-2 + 94-65 = 95-67.  The Monte Carlo spreadsheet includes a higher standard deviation than my Diamond Mind projections btw, to account for the greater volatility in a season, which is probably where 95-67 becomes 93.7-68.3.
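The arithmetic above, as a small sketch. It assumes the games already played say nothing about talent, so the preseason win rate is applied unchanged to the remaining schedule; the function name is just for illustration.

```python
def project_final_record(preseason_wins, wins_so_far, losses_so_far, season_length=162):
    """Current record plus the preseason win rate applied to the remaining games."""
    games_played = wins_so_far + losses_so_far
    remaining = season_length - games_played
    rest_wins = preseason_wins / season_length * remaining
    final_wins = wins_so_far + rest_wins
    return final_wins, season_length - final_wins


if __name__ == "__main__":
    # 96-win projection, 1-2 start.
    wins, losses = project_final_record(96, 1, 2)
    print(round(wins, 1), round(losses, 1))
```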

isn’t this just Bayes’ Theorem (it’s been 10 years, cut me some slack)?

isn’t this just Bayes’ Theorem (it’s been 10 years, cut me some slack)?

Possibly, but if it were just the Bayes theorem, then the volatility (standard deviation) of the posterior distribution would decrease with additional data (assuming the predictions that we started with were the prior distributions). I don’t necessarily see that, but I could be missing something.

Any system that has every team (except the Nationals) winning 100+ games at least once should probably be promptly ignored.

How much playing time was attributed to David Price?

There are 10,000 seasons.  I’m sure with that many any number of fluky things can happen and pretty much any team could win over 100.  That’s almost 2 orders of magnitude more games than have been played in real history.

I didn’t get to comment yesterday on Swisher when the comparisons were made with Giambi, but with his eyeblack on and his squinty eyes, he sort of looks to me like Don Mattingly. Hope he ends up hitting like the pre 1990 version. 

The articles the first few days by Klapisch on how Tex and CC were a waste of money are now followed up by an article that praises Burnett for his guts.
What a ridiculous thing to say. Why would a player give a guy an interview after such BS?

Possibly, but if it were just the Bayes theorem, then the volatility (standard deviation) of the posterior distribution would decrease with additional data (assuming the predictions that we started with were the prior distributions). I don’t necessarily see that, but I could be missing something.

According to this article by Keith Woolner, the standard deviation for team wins even if we have perfect information (i.e., we nail every single player’s projection and playing time) is 6.3.  When I run the projections through Diamond Mind, I get a standard deviation around 6.7, which is not really high enough given the fact that we don’t have perfect information.  Ideally, team wins should have a standard deviation in the 8-9 area.  This Monte Carlo simulator accounts for that by randomly modifying the teams’ talent level slightly during each iteration.  So in some iterations, the Yankees are a 105 win team, in some they are an 80 win team, etc.  That’s why you’re seeing a higher standard deviation than we saw in the Diamond Mind projections.
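Here is a toy illustration of that effect, not the spreadsheet's actual logic. With a fixed talent level, pure game-to-game luck over 162 games gives a standard deviation of about sqrt(162 × .5 × .5) ≈ 6.4 wins for a .500 team; jittering talent each iteration widens the spread. The 0.035 talent noise (in winning percentage) is a value I picked for the example, and each game is treated as an independent coin flip.

```python
import random
import statistics


def simulate_win_totals(true_wpct, talent_sd, seasons=2000, games=162, seed=1):
    """Simulate season win totals, perturbing team talent once per season.

    talent_sd = 0 corresponds to perfect information about the team.
    """
    rng = random.Random(seed)
    totals = []
    for _ in range(seasons):
        # Draw this iteration's talent level, clamped to a valid probability.
        wpct = min(max(rng.gauss(true_wpct, talent_sd), 0.0), 1.0)
        # Play out the season one independent game at a time.
        wins = sum(rng.random() < wpct for _ in range(games))
        totals.append(wins)
    return totals


if __name__ == "__main__":
    fixed = simulate_win_totals(0.580, talent_sd=0.0)
    noisy = simulate_win_totals(0.580, talent_sd=0.035)
    print("SD, fixed talent:", round(statistics.pstdev(fixed), 1))
    print("SD, noisy talent:", round(statistics.pstdev(noisy), 1))
    print("Max/min, noisy  :", max(noisy), min(noisy))
```

The wider spread is also why the Max and Min columns in the table look so extreme: the tails get fatter when talent itself is uncertain.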

Any system that has every team (except the Nationals) winning 100+ games at least once should probably be promptly ignored.

Anyone is welcome to ignore anything I post.  However, it’s a statistical fact that at this point in the season, just about every team has a non-zero chance to win 100 games.  Obviously as we get farther into the season, most teams will regress towards their expected talent level and you’ll see fewer 100 win/100 loss teams.

How much playing time was attributed to David Price?

Around 140 innings.

There are 10,000 seasons.  I’m sure with that many any number of fluky things can happen and pretty much any team could win over 100.  That’s almost 2 orders of magnitude more games than have been played in real history.

Exactly.

Why would a player give a guy an interview after such BS?

Because it may actually be worse for them if they don’t?  Some reporters I think (and I’m not saying Klapisch is one of them) will attack players if they don’t feel they are treated with respect.  So without the interview, the article could instead be, “Burnett was lucky to get through the 4th, never mind get the win”.

However, it’s a statistical fact that at this point in the season, just about every team has a non-zero chance to win 100 games.

Sounds like a fun project.  How many teams that had “no chance” of being that good going into the season finished with a winning pct above .615?  .615 is basically 100 wins in a 162 game schedule, and I think that may be more interesting than setting 100.  That would be 94 wins in a 154 game schedule.

CAIRO *may* be able to tell us that.  I know in the past you (SG) have given some projections for past players.  But I’m not sure how well CAIRO would handle pre-Retrosheet players (though we could limit it to 1955 forward), or if you have a machine with enough horsepower to do it in a reasonable amount of time.  If I were ambitious, I’d do it myself…I can say that about a lot of things.

Some reporters I think (and I’m not saying Klapisch is one of them) will attack players if they don’t feel they are treated with respect.

This couldn’t be more apparent in Abraham’s treatment of A-Rod: he despises A-Rod because A-Rod is indifferent to the press except for when it suits him.  So Pete is on the offensive all the time.  The sports media really is an absolute joke.  You’ll have the occasional Rob Neyer or peripheral publication like BP, but otherwise, these guys are all-red-meat all-the-time, and self-important to boot.  It’s disgusting.

Whoa, Kris Benson is making a start for the Rangers.

This Monte Carlo simulator accounts for that by randomly modifying the teams’ talent level slightly during each iteration.  So in some iterations, the Yankees are a 105 win team, in some they are an 80 win team, etc.  That’s why you’re seeing a higher standard deviation than we saw in the Diamond Mind projections.

Ahh, I see, so the Monte Carlo simulator is adding noise to the prior, which increases the SD of the prior relative to what was there in the predictions. Makes sense now.

For a sports reporter to call Sabathia and Teixeira a waste of money two games into a 162 season is like a U.S. president not knowing where to find Canada or Mexico on a map.  This goes beyond run-of-the-mill dunderheadedness.  It’s double digit IQ, Ian (if you don’t trade for Eric Gagne your season is over) O’Connor type stupidity.

Whoa, Kris Benson is making a start for the Rangers.

Take that junk to the RLMB.

Using the Monte Carlo simulation with production volatility seems to make more sense than straight projections.  This seemingly would account for the win/loss effect of Player_X outperforming while Player_Y underperforms.  Even if their full year projections are exactly the same, their placement in the batting order etc. would cause run differentials leading to win/loss variability.

Obviously as we get farther into the season, most teams will regress towards their expected talent level and you’ll see fewer 100 win/100 loss teams.

I would imagine that if we “Monte Carlo’d” the 2008 season we could compare the day by day projections to the season ending actuals to determine at approx what point in the season the projection becomes reliable.  Basically, when does Monte Carlo say that the mid year sample size of games completed is large enough to accurately project ending standings.

Using the Monte Carlo simulation with production volatility seems to make more sense than straight projections.  This seemingly would account for the win/loss effect of Player_X outperforming while Player_Y underperforms.  Even if their full year projections are exactly the same, their placement in the batting order etc. would cause run differentials leading to win/loss variability.

Yeah, I think so.  I think what I will do going forward is run the 1000 iterations in Diamond Mind like I’ve done in the past to build a base of team expectations as far as runs scored/allowed, etc., and then feed that data into the Monte Carlo simulator and run 10,000 more iterations and then present both datasets.

I would imagine that if we “Monte Carlo’d” the 2008 season we could compare the day by day projections to the season ending actuals to determine at approx what point in the season the projection becomes reliable.  Basically, when does Monte Carlo say that the mid year sample size of games completed is large enough to accurately project ending standings.

I like this idea a lot, but it may take me a bit to figure out how to rig the spreadsheet to do that.  I’ll see what I can do over the next few weeks.

Projections, schmojections.

The Yanks aren’t gonna be in the post-season. I heard it on ESPN, and that’s all there is to it. See ya in 2010.

I’m not sure I understand the RS projection. At the moment the Yanks have averaged 7 runs per game. Now, I don’t think 7 is sustainable but I do think 5.5 is and that projects to 891. Am I being overly optimistic?

[14] What reminds me of Giambi with Swisher is that with Giambi I felt like every at bat went like this:

Called Strike
Swinging Strike
Foul
Ball
Ball
Ball
Foul
Foul
Ball

Swisher didn’t walk yesterday, but he just seems to have the same knack for making pitchers throw a lot of pitches.

It’s not really his appearance that reminds me of Jason, but he does seem to have the same energy, minus the expectations of being the team MVP.

Am I being overly optimistic?

Relative to preseason projections, yeah.  However, there’s reason to think the Yankees have more upside offensively than projected.

1) I assumed Jorge Posada would only get about 400 PA.  If he’s closer to 500 PA, that’s probably another 5-10 runs.
2) I assumed Alex Rodriguez would only get about 500 PA.  If he’s back ahead of schedule he should get more than that.
3) Maybe Gardner outhits his poor projections?
4) Maybe Cano is closer to his 65% projections (.307/.349/.495) than his baseline(.296/.332/.465)?  That’s around 10 more runs.
5) Maybe Matsui plays more than expected (400 PA) and exceeds his projections since they are partially skewed by his post-injury return last year?

Still, I wouldn’t read too much into a few high scoring games against Baltimore.  They’re going to have 38 games against two of the better run prevention teams in baseball (Boston/Tampa Bay) and Toronto may be strong there as well.

6) Jeter may have been more hampered with small injuries in 07-08 than recognized.

so is the Royals broadcast streamed?

Delayed for ceremony?  In KC?  What, morituri te salutant?

Our new URL is: http://www.rlyw.net