Tuesday, January 15, 2008
Pitch F/X and Joba Chamberlain
In 2007, MLB decided to enhance their Gameday application. For a numbers junkie (aka statdork) like me, the best thing to come out of that is the rollout of detailed pitch-by-pitch data. Data for 2007 is incomplete because the system was rolled out slowly across MLB, but it now appears to be available in most of the stadiums and should be everywhere for 2008.
Pitch F/X is the name of the system that MLBAM is using to track the detailed pitch information. They record things like starting velocity, ending velocity, release point, break, and all kinds of other stuff. Having the data available is great, but figuring out how to get it and make use of it looked like a royal pain in the ass.
Thankfully, someone else did all the work for me as far as figuring that out. Over at the blog Fast Balls, Mike Fast detailed how to build a pitch by pitch database. I used Fast's instructions and have been downloading and loading the data for the past week or so. This is great stuff and I thank Fast for making his database building instructions open source.
There's a ton of data in here that I don't fully understand yet, and as I mentioned before it's pretty incomplete because the system has taken a while to roll out across MLB, but I think some data is better than none.
I am hoping to make use of this data in some of my posts in 2008, especially when it comes to Joba Chamberlain, Phil Hughes, and Ian Kennedy. Today I'd like to look at Joba specifically.
We all saw how Joba burst onto the scene with his high 90s fastball and nasty slider, throwing 24 innings of 1192 ERA+. He does have a curve and changeup, but neither was used very much as a reliever. For Joba, Pitch F/X had detailed pitch information for 225 of his 334 pitches.
The first and easiest thing to look at would be the breakdown of his pitches thrown.
Fastballs: 139
Sliders: 79
Changeups: 7
Next, here are the high, low and average velocities for those three pitches.
Fastball
High: 101
Min: 92.9
Avg: 97.3
Slider
High: 90.4
Min: 82.5
Avg: 86.6
Changeup/Curve (I can't tell which is which from the data)
High: 84.5
Min: 79.4
Avg: 77.2
Lastly, here's a look at what the end result has been for the pitches we have data for.
| Pitch Type | Result | # |
|---|---|---|
| Fastball | Ball | 45 |
| Fastball | Ball In Dirt | 1 |
| Fastball | Called Strike | 30 |
| Fastball | Foul | 26 |
| Fastball | Foul (Runner Going) | 0 |
| Fastball | In play, no out | 6 |
| Fastball | In play, out(s) | 19 |
| Fastball | In play, run(s) | 2 |
| Fastball | Swinging Strike | 10 |
| Fastball | Swinging Strike (Blocked) | 0 |
| Slider | Ball | 27 |
| Slider | Ball In Dirt | 2 |
| Slider | Called Strike | 9 |
| Slider | Foul | 7 |
| Slider | Foul (Runner Going) | 1 |
| Slider | In play, no out | 0 |
| Slider | In play, out(s) | 5 |
| Slider | In play, run(s) | 0 |
| Slider | Swinging Strike | 24 |
| Slider | Swinging Strike (Blocked) | 4 |
| Changeup/Curve | Ball | 4 |
| Changeup/Curve | Ball In Dirt | 0 |
| Changeup/Curve | Called Strike | 2 |
| Changeup/Curve | Foul | 0 |
| Changeup/Curve | Foul (Runner Going) | 0 |
| Changeup/Curve | In play, no out | 0 |
| Changeup/Curve | In play, out(s) | 1 |
| Changeup/Curve | In play, run(s) | 0 |
| Changeup/Curve | Swinging Strike | 0 |
| Changeup/Curve | Swinging Strike (Blocked) | 0 |
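To put those counts on a rate basis, here is a rough Python sketch that computes the share of each pitch type that drew a swing and a miss. The counts are copied from the tables above; the `RESULTS` dictionary and the `whiff_rate` helper are names made up for this illustration, not anything from the Pitch F/X data or from Mike Fast's instructions.

```python
# Result counts copied from the tables above (only pitches with Pitch F/X data).
# RESULTS and whiff_rate are hypothetical names used for this illustration.
RESULTS = {
    "Fastball": {
        "Ball": 45, "Ball In Dirt": 1, "Called Strike": 30, "Foul": 26,
        "Foul (Runner Going)": 0, "In play, no out": 6, "In play, out(s)": 19,
        "In play, run(s)": 2, "Swinging Strike": 10,
        "Swinging Strike (Blocked)": 0,
    },
    "Slider": {
        "Ball": 27, "Ball In Dirt": 2, "Called Strike": 9, "Foul": 7,
        "Foul (Runner Going)": 1, "In play, no out": 0, "In play, out(s)": 5,
        "In play, run(s)": 0, "Swinging Strike": 24,
        "Swinging Strike (Blocked)": 4,
    },
    "Changeup/Curve": {
        "Ball": 4, "Ball In Dirt": 0, "Called Strike": 2, "Foul": 0,
        "Foul (Runner Going)": 0, "In play, no out": 0, "In play, out(s)": 1,
        "In play, run(s)": 0, "Swinging Strike": 0,
        "Swinging Strike (Blocked)": 0,
    },
}

# Both result labels that mean the batter swung and missed.
SWINGING = ("Swinging Strike", "Swinging Strike (Blocked)")


def whiff_rate(counts):
    """Swinging strikes as a share of all pitches of this type."""
    total = sum(counts.values())
    misses = sum(counts[label] for label in SWINGING)
    return misses / total if total else 0.0


for pitch, counts in RESULTS.items():
    total = sum(counts.values())
    print(f"{pitch}: {total} pitches, {whiff_rate(counts):.1%} swinging strikes")
```

On these counts the slider drew a swing and a miss on roughly a third of the pitches with data, versus well under a tenth for the fastball. With only seven changeups/curves recorded, that last rate doesn't mean much.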
I'm not sure how accurate this data is yet, or how consistent it is from park to park. I'd imagine that this year's data will be fine-tuned a bit based on last year. I love the potential for new information that this technology will give us though.
Thanks again to Mike Fast for his open-source instructions on parsing this data; it was a huge help.
Update: Thanks to mmiller for pointing out this link to Josh Kalk's work. Thanks to Kalk for his work and data as well.
Comments
You should actually thank Kalk, the guy who created the system. Anyway, those are Joba’s numbers; you can see the difference between the curve and the changeup here.
Thanks, I wasn’t aware of that link. Good stuff.
mmiller, any objection if I edit your link to fix the page width?
Looking at the link in #1, maybe I am reading it incorrectly, but shouldn’t all the Break z numbers be negative?
SG, I’m glad you were able to make good use of my instructions for the pitch database. The PITCHf/x stuff can be addictively fun. I’ve learned a ton about pitching from studying it. You can call me Mike, btw. Mr. Fast is my father.
I wrote a couple articles on Joba, too. You might find them interesting. Hopefully it’s okay to share the links here:
http://fastballs.wordpress.com/2007/11/15/appeasement/
http://fastballs.wordpress.com/2007/11/18/swinging-at-shoe-tops/
I also published a primer on PITCHf/x just yesterday, and it’s a good starting place for further research ideas and guidance:
http://mvn.com/mlb-stats/2008/01/14/a-pitchfx-primer/
mmiller, the system was created by Sportvision and the data was produced by Sportvision and MLBAM. Josh Kalk made a database from MLB’s data and created an algorithm to automatically identify pitch types. He then made player cards for every pitcher with 100+ pitches recorded by the system and shared this data freely with everyone. It’s an incredible resource, but you have to be careful because it’s not always correct.
Randy/#4,
The “break” parameter that’s presented in Josh’s graphs is really the spin-induced movement. The backspin on a fastball creates a force (the Magnus force) that pushes the ball up. This is the “break” that’s shown there. The Magnus force doesn’t completely counteract gravity, so even a fastball still drops on its way to the plate, but Josh’s graphs don’t show the effect of gravity at all.
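For a rough sense of how much gravity alone matters, here is a back-of-the-envelope Python sketch. It assumes a constant 97.3 mph (the average fastball velocity from the post), about 55 feet of travel from release to the plate, and no drag or spin at all. The 55-foot distance, the constant-speed simplification, and the function name are all assumptions made for illustration; none of this comes from the PITCHf/x data or from Josh's graphs.

```python
G_FT_PER_S2 = 32.174  # standard gravitational acceleration in ft/s^2


def gravity_drop_inches(speed_mph, distance_ft=55.0):
    """Gravity-only drop over the flight, ignoring drag and spin (a simplification)."""
    speed_ft_per_s = speed_mph * 5280 / 3600      # mph -> ft/s
    flight_time_s = distance_ft / speed_ft_per_s  # constant-speed assumption
    drop_ft = 0.5 * G_FT_PER_S2 * flight_time_s ** 2
    return drop_ft * 12                           # feet -> inches


print(f"{gravity_drop_inches(97.3):.1f} inches")
```

Under these assumptions the gravity-only drop comes out to a bit under two and a half feet. The spin-induced "break" in the graphs is the movement measured relative to a trajectory like that one.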
Thanks for stopping by Mike. Feel free to plug away.
If it doesn’t show how the ball actually moves, what good is it?
I think the idea is to look at how it moves relative to other pitches, from that particular pitcher as well as other players.
Exactly. It’s good for pitch classification--comparing one pitch to another. For understanding how the ball actually moves, it’s only part of the picture. People have done lots of other things with the data. Josh’s presentation is only one way to look at the data.
Because Josh has something for every pitcher, you’ll see his work everywhere, and some people, like mmiller, have gotten the impression that Josh’s presentation actually IS the data, which is not true.
I’ve moved away from presenting the data in that fashion much at all, but it does have some physics meaning and is useful in that regard.
I’m pretty sure those “changeups” were curves. I saw him throw several curves, but cannot recall a single changeup. And I’m pretty sure I saw every pitch the kid threw.
Cool stuff
By the way, has anyone else seen the “breakout” article on BP? One of the picks is none other than:
“Melky Cabrera. Cabrera went backwards in ’07, but not by enough for concern. Remember that he is just 23 years old and has more than 1100 plate appearances in the majors, with average to average-plus defense (good physical tools, but very raw, takes bad routes) and a very good 129/96 K/BB. He is a mature player offensively, patient at the plate and fair on the bases (25-for-35 stealing in his career). One interesting quirk is his G/F ratio, which is 1.63 for his career and was a whopping 1.81 last season. Cabrera is listed at 5’11” and 200 pounds. He’s not Willy Taveras, but rather a player who should be developing power and learning how to drive the ball, rather than hitting the ball on the ground 60 percent of the time.
I’m reminded of Alex Rios, who doesn’t look a thing like Cabrera. Rios was largely disappointing in 2004 and 2005, hitting just 11 homers in more than 900 at-bats, with an isolated power of 117. The problem: Rios was hitting the ball on the ground too much, a 1.82 G/F in those two seasons. Starting in ’06, Rios put the ball in the air more than half the time, and became a star. When you look at Cabrera’s body, his established control of the strike zone, and his ability to hold his own at a young age, you recognize that all it’s going to take is for him to start elevating the ball. Cabrera may not get there in 2008, but he’s going to pop 80 extra-base hits and slug .500 in a season very, very soon.”
That last bit strikes me as pie in the sky (slg .500? Melky? I’d be really impressed, especially if it happens soon).
Yeah, I think Robbie Cano kidnapped someone in Joe Sheehan’s family and threatened to do horrible things to them unless he talked about Melky flatteringly. Sheehan, of course, panicked, or it is an attempt to call for help.
he’s going to pop 80 extra-base hits and slug .500 in a season very, very soon
Definitely a bit over the top. I think Melky would have to hit .350 to slug .500, and I don’t see that happening any time soon. OTOH, I’d love to see him shove 50 XBH and a SLG of .450 down the throats of the “just a 4th OF” crowd, and I think that’s quite possible.
I don’t see a .500 SLG with Melky either. I’m not so sure it’s so easy to just decide “I’m going to hit more balls in the air.”
Melky only slugged over .500 once in AAA (.566 in limited AAA action before being called up) so I agree it’s far fetched. I could see .450 I suppose, but .500 is a stretch. If Melky is a guy who can go into the low to mid .800s with his OPS and play above average defense with that cannon for an arm, he’s going to be in center field for a long time for the Yankees.
The comparison he uses (Rios) is a stretch too - Rios is a big (tall, anyway) guy. It’s easier to project power from him than it is to look at ‘lil Melky and say he’s gonna slug .500. Did Rios hit for power in the minors? I bet he did.
I’d love it if Sheehan is right. But I doubt it. I’d happily take a ~.800 OPS from Melky, though, without the Sept/Oct vanishing act. I think he’s capable of that.
Sheehan is smoking something that i would like to get my hands on.
i appreciate his efforts, but 80 XBH’s is simply insane.
he is basically saying that in 2009, Melky Cabrera will be one of the 5-10 best players in MLB.
Hall of Famer Harry Heilmann, one of Cabrera’s top comps at B-Ref, popped 73 XBHs in his age-28 season in 1923.
It can be done!
Seriously, as you all probably remember, I was in favor of trading Melky as part of a package for Johan. But I sure would like to see him develop into an even better player while wearing the pinstripes.
Rios’ minor league numbers are pretty damned unimpressive, except for his age 22 season at AA. Even in that .919 OPS year he only hit 11 homers.
BTW, “‘lil Melky” is the same size as Willie Mays was when he broke into the majors. Gary Sheffield is 5’11”. So was Don Mattingly. You don’t have to be 6’3” to hit for power.
But 80 XBH is a freakin’ ton, even in this era of inflated offense. A lot of damned good hitters never get that many. Especially if they walk 10% of the time.
Maybe he won’t produce like Rios, but I think Melky can definitely improve. He has good plate discipline, which in my opinion is harder to obtain than power. Plus, at 5’11”, 200, he’s no small fry. He hits the weights and works on his stroke and he could definitely add some pop.
That’s what I love about baseball - sure bets often flame out and all-stars come out of nowhere. I hope Melky can join the ranks of the latter.
I’d still trade Melky to get Santana.I may be one of the few left but I feel Santana is a must for the Yankees.He is young proven and most of all a lefty.We are opening a new stadium and we need a real ace not someone like Wang who gave us no and i mean no chance of winning two playoff games.So you lose Hughes you still have Joba.All you would need is to get a centerfielder and the last time I checked we had one in Johnny Damon.Granted Melky is an upgrade but Damon is still viable and will be in the lineup anyways.
Mike - wait a year and sign Santana as a FA. You can then keep Melky and Hughes. Best of both worlds and you still get Santana to open the new park.
That may be possible provided the Red Sox don’t get him.I have always hated being afraid of what they will or will not do but as we can all attest the Beckett deal still keeps hurting us and the possibility of a declining lineup and a strong suck sox 1 2 punch makes me eager to try harder to get him now.
wait a year and sign Santana as a FA
I still don’t think he gets to the open market. He’ll either get traded (hopefully to the NL) or the Twins will shock everyone and re-sign him. Contrary to the whiny small-market propaganda, they can easily afford him with their own new ballpark opening in 2010. I’ve seen estimates of close to $150M for Minnesota’s 2007 revenue.
the comma is your friend.
I don’t know about Rios as a comparison. He broke in with Toronto a couple of years older than Melky. If Sheehan’s argument has any merits, it’s that Melky has about 1.5 years of full MLB experience under his belt and he is only entering his age 23 season.
Also, if you look at Melky’s 06 and 07, something weird is going on. He slugged close to .400 against LHP in 06, but was about 40 points worse vs. RHP. In 07, he slugged just over .400 against RHP but was 50 points worse vs. LHP. His BA and OBP from both sides and for each year are comparable.
I don’t know what to make of those splits except this: it looks like he has the ability to slug .400 from both sides of the plate, and he’s only 23. I suppose he did go from having Mattingly as hitting coach to having Kevin Long, so who knows if there was an adjustment made by one of those guys that can be pointed to. I could see him putting it all together and slugging .450. .500 would be awesome.
80 XBH’s is a bit ridiculous - Jeter’s max was I think 70, and he’s been a *fairly* good hitter for his career - but .500 SLG isn’t quite as far out there as you may think For example, look at Jeter’s 2001 season:
H 2B 3B HR BA SLG
191 35 3 21 .311 .480
Good - very good for a SS - but not a super-great season right? Now just make a few of those singles into XBH
H 2B 3B HR BA SLG
191 40 5 21 .311 .495
No guarantee Melky will do that in 2009/2010, but does everyone here really think he couldn’t?
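A quick way to check stat lines like these: slugging is total bases divided by at-bats. The Python sketch below assumes 614 at-bats, which is not given in the comment but is what 191 hits at a .311 average implies. The `slugging` function name is made up for this example.

```python
def slugging(hits, doubles, triples, homers, at_bats):
    """Total bases per at-bat: every hit is worth one base, extra-base hits add more."""
    total_bases = hits + doubles + 2 * triples + 3 * homers
    return total_bases / at_bats


# Assumption: at-bats are not listed above, so back them out of H and BA.
AT_BATS = round(191 / 0.311)  # 614

print(f"{slugging(191, 35, 3, 21, AT_BATS):.3f}")  # 0.480, the 2001 line
print(f"{slugging(191, 40, 5, 21, AT_BATS):.3f}")  # 0.495, five more 2B and two more 3B
```

Under that assumption, turning seven singles into five doubles and two triples adds nine total bases, which moves the line from .480 to about .495.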
may think. For example
Periods can be my friend, too.
I think the problem is that Melky isn’t even slugging .400 now. Jumping 100 points would be a big leap for him.
No guarantee Melky will do that in 2009/2010, but does everyone here really think he couldn’t?
No, i don’t think Melky can do that. and i am a Melky fan.
the comma is your friend
But you can’t trust the semicolon.
The thing that I think is important to remember about Melky is that he can be extremely valuable if he keeps putting up .900+ ZR’s and gunning people out at the plate. If he shows some power, he’s got a future being moved over to a corner infield spot, but as an .800 OPS guy who hits from both sides and dazzles with the glove, he’s a nice part of this team. I got the impression that he was considered a throw-in on the alleged Santana deal and that made me.. not happy.
the comma is your friend.
Thats what people told me then BAM! two months later I’m hooked on smack, stuck in the middle of the desert and comma’s fading into the distance with my car and the last of the water.
"Thats what people told me”
You misunderstood--they meant the apostrophe was your friend.
We are opening a new stadium and we need a real ace
As if people aren’t going to show up to the new Stadium if we don’t have Santana. This, in my mind, is not a legit reason to make the trade. But I would say this, Mike; I wouldn’t let Melky stand in my way of completing a Santana deal. Apparently he’s not the sticking point.
You misunderstood--they meant the apostrophe was your friend.
I feel humbled. Well done.
Sheehan went way overboard there but i think some fans are starting to sell Cabrera short way too quickly. I still think he has a very good chance of being an above-average player. that has a ton of value, especially before FA.
Quick question…
Obviously, it is totally logical to say you think the Red Sox will be better than the Yankees next year.
It is equally logical why you would say the Tigers will be better (it’s definitely arguable, but you’d obviously understand why someone would make the argument) than the Yankees next year.
But what is the reasoning for thinking that the INDIANS will be better than the Yankees next season?
Not saying they won’t be better than the Yankees, but just looking at things pretty objectively, I don’t see their case being that strong.
You’re looking for logical reasoning to explain widespread Yankee-bashing?
Seriously, though, I think the Indians have a better case than the Tigers. The reasoning is simple: Cleveland was better last year, the Yankees haven’t added an ace, and nobody ever regresses (except the Yankees, of course, who must project to crash and burn every year). Detroit made the big off-season splash, but they’re the ones who needed to improve.
Of course, people also conveniently ignore the fact that when Detroit adds wins, some of them will come at Cleveland’s expense.