PECOTA and the 2005 outfield
So the 2005 PECOTAs are out and available to BP subscribers. Here’s the interesting bit: I almost wonder if it ran the same stat lines three times for the outfield.
Ibanez: .270 .324 .420
Reed: .286 .353 .423
Winn: .277 .337 .420
Yeaaaaghhhh. Play Reed! Play Reed!
Mike Cameron, btw: .253 .348 .461 in Shea. Hee hee hee.
Ichiro! .311 .355 .415
Others have noted this, but this is the weakness of a system that relies on similarity projections. Ichiro is so far removed from other players that finding anyone like him has the unfortunate side effect of dragging him back to the pack. Does that make sense? How about this: if you try and locate a position by triangulation, that works best if you can get three sources in different directions from the position. The closer they are to each other, the worse your ability to locate.
Ichiro! is at the north pole. The pack is all in Vermont, say.
One thing I want to note: there is no meaning behind these, or any other projections. They’re spit out by a computer, and should be assigned all the prophetic value of weather forecasts. If it spits out something you agree with, that doesn’t mean it’s right or that it’s made a realistic projection (because you don’t, and can’t, and won’t ever, know what that is), and if it projects something bizarre, that doesn’t mean it’s wrong, in the same sense.
If you’re a real geek about this stuff, like me, and you read the forecast discussions on the NOAA web site, you can spend some time once the cards are published picking apart the “why” of the forecast. For instance, if you look at Beltre’s low projections and see that PECOTA’s picked a bunch of players of modest talent who started fast, burned out, and then had a career year, you might reasonably disagree on the basis of who got picked.
But that a particular forecast came out in some way does not validate or refute any particular view. 1+1 doesn’t = 2 because it’s got better lineup support this year. If you re-run PECOTA forecasts on the same set of data, you would get the same numbers. It’s a complicated formula, an algorithm for making guesses. Its reasoning reflects its design, and not some greater truth about any player, though in doing historical analysis of great numbers of players we do learn useful things.
Comments
43 Responses to “PECOTA and the 2005 outfield”
But doesn’t PECOTA use a “Marcel the Monkey” baseline and then adjust that based on how comparable the comparables are? That 311/355/415 line looks a lot like a forecast that Ichiro will lose a few singles, which isn’t a surprise for a player entering his thirties.
While we’re looking at PECOTA numbers, here’s PECOTA projections for two Mariners:
Player A: .254/.347/.480
Player B: .254/.341/.473
Hint: one of these two players will make $44 million the next 4 years. The other one’s Bucky Jacobsen.
I need a drink now…
Well, yea… I mean, if we want to go all Bertrand Russell on ourselves, even 1+1=2 doesn’t “mean” anything either. All it means is that the system’s accepted axioms self-reflect.
I see your point, but it almost seems a little too “evolution is a theory” to me. True, but misleading because of the disconnect between two types of discourse.
I’ll take PECOTA over waiting for spring training stories about who’s “focused” any day.
Thanks for the ‘actuals’ here Derek, such as they are and may be. I like PECOTA better than the next best thing (which is nothing), but still. If there is one thing that baseball FOs consistently underpredict, though, it’s player declines, and statistically-based ‘most similar’ comparisons go a long way toward correcting mis-evaluations there. In my view, these are perhaps the most significant contribution out of the stathead community so far, along with minor league conversions.
The data does tend to suggest that it’s ‘pick ’em’ in the Ms outfield for ’05, yes.
Thanks for the numbers on Jacobsen there, e-coward. And that supports my sense of the ‘actuals’ here also: Bucky Jacobsen projects to be a more valuable offensive player than Ibanez, Winn, Reed, _or_ Cameron. And that’s using numbers the Big Guy put up playing on a crumbling knee, which he figures to have corrected by the next time he’s in the line-up.
Reed, Jacobsen, and Madritsch are the three main reasons I plan to come to the park in ’05. I could take a shine to Beltre, though.
I agree that these PECOTA numbers should be taken with huge chunks of salt. Really, how many comparables did PECOTA find for Jacobsen (big guy, entered the Majors at age xx, played through years with knee problems, recently had surgery, blah blah blah) and Sexson?
Well, look at the guys- big strong RH power hitting 1B, around the same age.
Granted, Jacobsen got a later start (Sexson was a HS draftee, Bucky went to college) and has injury history influencing his numbers, but Bucky DOES have a minor league OPS of around .920. He’s always been able to hit.
Basically, Bucky’s in the Jim Gentile/Steve Bilko/Mike Easler/Ken Phelps lineage of “guys who get caught in the minors due to various combinations of bad luck, injuries, not getting hot at the right time and teams that don’t believe in minor league stats”. There are lots of these players throughout major league history.
Ichiro! .311? I think Rod Carew is the closest hitter that we can compare Ichiro! to, and he hit .330 at 30 and .388 at 31, so I say PECOTA is hogwash and Ichiro! hits .400 this year!
There are a lot of factors that can go wrong/right that can wreck PECOTA’s predictions. Did it take into account that Jacobsen’s right knee will require “more drilling and scraping” because the “divot” was twice as big as the MRI showed? Comparing Jacobsen and Sexson based on PECOTA’s projections is just not a valid proposition (and will really worsen your drinking habits) IMO.
Lest we forget, Ichiro Suzuki hit .312 in 2003. Aberration?
I noticed that PECOTA doesn’t project a single 40-hr hitter in the majors this year. Is PECOTA typically super conservative with power numbers, or is this abnormal?
That’s a good question, and I don’t have the answer. I would venture a guess, but I just wrote that whole thing about the futility of such an exercise. I don’t know.
Derek, do you really think PECOTA forecasts are meaningless, or just of limited value? This would seem to be critical, because if a ‘reasoned’ forecasting system had no value, how do you assign value to players and determine the quality of a free agent signing?
I think it is internally inconsistent to say forecasting is meaningless, be it via PECOTA or seat-of-the-pants, while simultaneously saying Russ Ortiz’s contract is terrible. I suspect your comment was just hyperbolic, but shouldn’t the extreme difficulty of projecting future performance temper our opinion of the front office’s decisions?
Case in point: did Gillick get lucky with Arthur Rhodes and Bret Boone? What about Mark McLemore? Or was he a shrewd judge of veteran talent? I am on the fence regarding Gillick. On the one hand, when I view the Mariners’ performance from ’96-’99, I can’t help but see it as disappointing. Then Gillick arrives and we have the golden agelette of ’00-’03. Sure, Gillick would appear to be much to blame for the train wreck of ’04, but a four year run of success should not be dismissed. On the other hand, is the primary difference between ’96-’99 and ’00-’03 simply payroll? Those late ’90s teams had terrible depth and many black hole positions. If Woodward had had access to the same money as Gillick for pitching depth, acquiring a left fielder and a third baseman, would I view him differently? Assigning credit and blame after the fact seems difficult, so projecting it to the future seems extremely difficult.
Weather forecasts, to beat that analogy further, aren’t meaningless. They offer an ability to better plan and be prepared. But weather forecasts aren’t (and can’t) reflect what will happen, or why it will happen.
That’s all. We have to recognize that predictions are by their nature limited in utility and aren’t themselves results.
I think we’ve been quite good here in talking about the Sexson signing, for instance, in drawing a distinction between what we predict and think is a reasonable expectation for Sexson, what the risks are, and why we felt his contract was too expensive, given our views on that.
One of the best things about PECOTA as a projection system is that Nate lets us see the error bars. They’re not in the released data, but when the PECOTA Player Cards go up we’ll be able to see how confident PECOTA is in any given projection.
And for highly unusual players (Ichiro, Bonds, David Wells), the error bars are HUGE.
So, once we get all the data, we’ll see that while the weighted means may look strange, the tails of those bell curves are sometimes really really long.
Oh, and doesn’t Bucky’s career path look just a little like Edgar’s career path?
Am I turning into David Corcoran?
Ichiro’s numbers in 2005 will depend heavily, I think, on whether other teams continue to play their infield in against him all the time. Or are they going to realise that they’re giving him hits by doing that? Will they heed the words of David Eckstein, who said that whenever the Anaheim infield would reposition just for Ichiro, he’d come up and “hit it where we were”?
If anyone wants a reason to drink this morning, take a look at Dan Wilson’s forecast. Mmm mmm, negative VORP. Remind me again why he is coming back.
#18
Classic example of home-town boy with connections to the team’s “glory days,” such as they were. I really don’t have a problem bringing him back to be second behind Olivo at catcher, but if for ANY reason he’s doing the majority of the catching this season, it’s aaalllll on Grover.
There are a lot of factors that can go wrong/right that can wreck PECOTA’s predictions. Did it take into account that Jacobsen’s right knee will require “more drilling and scraping” because the “divot” was twice as big as the MRI showed? Comparing Jacobsen and Sexson based on PECOTA’s projections is just not a valid proposition (and will really worsen your drinking habits) IMO.
I think the point is that Sexson’s skill set wasn’t so unique we needed to pay $50 million and change for it for the next 4 years- seeing as Bucky basically fits a lot of the bill (RH power hitter at 1B/DH) for a lot less money. Bucky’s minor league stats, like many other hitters who’ve been trapped in the minors for a long time (the Ken Phelpses and Steve Bilkos of the world), show he’ll hit once he gets to the show. If you project him out to 162 games from last year’s stats, he’s a 35 HR/100 RBI hitter. I think we’d all be OK with Sexson doing that.
Anyhow, Wilson SHOULD be a backup caddying for Jamie and playing on Sundays this year…but if Olivo’s not up to it, you can guess what will happen, and even the pattern (Wilson will hit decently in April and then tank by June).
What’s the NOAA?
National Oceanic and Atmospheric Administration.
does pecota ever predict that anyone will have a better year than they had the year before?
. . . doesn’t Bucky’s career path look just a little like Edgar’s career path?
How so? Other than breaking into the bigs late, their career paths seem totally different to me.
At Bucky’s age, Edgar was a skinny high average, low power 3B. Bucky is a massive power hitter.
Edgar’s highest SLG in the minors was .490. Bucky slugged .661 in Tacoma last year.
I guess I’m missing the connection between the National Oceanic and Atmospheric Administration forecasts and player projections, unless it’s just that “all forecasting is similar.”
And #23: It should, if the player’s previous season was significantly worse than his historical norms; an anomaly. Also I think it might if the player had an insignificant sample set that was below league average: that would push the player’s projection towards league averages. Regression towards the mean.
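For anyone who wants to see that regression idea in numbers, here is a minimal sketch. It is not PECOTA’s method; it is a generic weighted blend of a player’s observed rate with the league rate, and every number in it (the .280 OBP, the .330 league mean, the 200 plate appearances of “ballast”) is a hypothetical chosen only for illustration.

```python
def regress_to_mean(observed, sample_size, league_mean, ballast):
    """Blend an observed rate with the league rate.

    `ballast` is a hypothetical number of league-average plate
    appearances mixed in; the smaller the real sample, the more
    the result is pulled toward the league mean.
    """
    return (observed * sample_size + league_mean * ballast) / (sample_size + ballast)


LEAGUE_OBP = 0.330  # hypothetical league average
BALLAST_PA = 200    # hypothetical weighting, not taken from any real system

# Same below-average OBP, observed over a small and a large sample.
small_sample = regress_to_mean(0.280, 100, LEAGUE_OBP, BALLAST_PA)
large_sample = regress_to_mean(0.280, 600, LEAGUE_OBP, BALLAST_PA)

print(f"100 PA at .280 -> projected {small_sample:.3f}")
print(f"600 PA at .280 -> projected {large_sample:.3f}")
```

Both projections come out above the observed .280, which is the sense in which a system can “predict improvement,” and the small-sample player gets pulled much further toward the league figure.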
While PECOTA does tend to be on the conservative side, yes – it predicts that many players will improve from their performances last season.
It does, but not often. I looked at the list of players that PECOTA predicts will qualify for the batting title (104 players), and it only predicts improvement for 16 of them, or 15.4%. The magic sixteen are mostly the guys you’d expect, who were big disappointments last year, along with a few younger guys still on the upswing (Bobby Crosby, Jose Lopez, etc).
Doot doo doo, just testing something, don’t mind me.
I think forecasting is inherently useful, and as forecasting goes PECOTA is pretty good. There are at least two issues to take into account generally with forecasting, though.
As alluded to indirectly by Evan above in #15, how large the standard deviation is within the data spread greatly influences how _useful_ the forecast is without impacting at all how accurate the forecast is. If the standard deviation is fairly large (i.e. a large amount of error in the forecast), a player could miss his median by, to pick a number, 17% but be exactly on prediction, that is within a standard deviation, or could miss his median by a whopping 34% and still be only off by one standard deviation: the forecasts were either quite accurate or fairly accurate but the divergence of the actual number makes the forecast only moderately useful. This is why the error readings which Evan alludes to are critical as a means of handicapping the utility of the forecast estimate.
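To put rough numbers on that, here is a small sketch that expresses a miss in standard-deviation units. The wide 34% standard deviation is what the 17%/34% example above implies; the narrow 5% one is a made-up contrast case, and nothing here comes from actual PECOTA error bars.

```python
def miss_in_sd_units(miss_fraction, sd_fraction):
    """How many standard deviations a given miss represents."""
    return miss_fraction / sd_fraction


WIDE_SD = 0.34    # assumed: implied by "34% off is still one standard deviation"
NARROW_SD = 0.05  # hypothetical tight forecast, for contrast only

results = {}
for miss in (0.17, 0.34):
    results[miss] = {
        "wide": miss_in_sd_units(miss, WIDE_SD),
        "narrow": miss_in_sd_units(miss, NARROW_SD),
    }
    print(
        f"miss of {miss:.0%}: {results[miss]['wide']:.1f} SD with wide bars, "
        f"{results[miss]['narrow']:.1f} SD with narrow bars"
    )
```

The same 17% miss is half a standard deviation under the wide bars and more than three under the narrow ones, so the forecast can be “right” in both a statistical sense and a not-very-helpful sense at the same time.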
Then there is the issue of multi-factor analysis in a forecast like PECOTA or the weather. More factors -> more individual divergence -> wider aggregate divergence and more importantly more diffuse median: data just doesn’t bunch around a nice tight median in multi-factor analysis without an unusual attractor in the data. If baseball forecasts were more single-factor, they would tend to be more accurate. For example, I suspect that it would be easier to estimate a projection for what percentage of a player’s hits would go for doubles, a relatively narrow factor, than to estimate a projection for his slugging percentage, inherently a _multi-factor_ statistic which a dozen home runs or three dozen singles also move up or down. This is not a criticism of the boys west of the PECOTA, I’m sure that they give a great deal more thought to all of this than I do or care to. The broader point is that baseball performance is inherently a multi-factor output, and to try and estimate it usefully, one necessarily estimates it broadly, which inherently means that the divergence of actuals from estimates will be substantial even while the estimates themselves may be comparatively ‘accurate.’
Only if one introduces some external factor into baseball performance as an ‘attractor’ would the data suddenly bunch more predictably. Or to invert that statement (at some risk), if inherently spread data suddenly starts to bunch, one should investigate to see if there is an external attractor—like aluminum bats, rabbit balls, or . . . the like.
Forecasting through multi-factor analysis has become increasingly accurate, in baseball and elsewhere. Its usefulness lags behind, but this is due to the behavior of what is being measured, not the measurement methodologies per se.
Regarding Bucky Jacobsen, I keep thinking Cecil Fielder when I look at him, although the ‘most similars’ of Ken Phelps, Steve Bilko, and Mike Easler are much more defensible, sure. However, those are made against seasons where Bucky was hitting on a crumbling knee.
One would expect a guy of Bucky’s strength to be able to ‘fist’ the inside ball for singles and even doubles and generally be a very _tough_ out inside unless he was so musclebound as to be simply tied up inside. Now, in print a few weeks ago, Big B said that the knee prevented him from turning on inside fastballs almost entirely during his time with the Ms, allowing the pitchers to get him out inside, which had NEVER been the case before he hurt the knee. So consider that Jacobsen was hitting like Easler despite being _minus_ what was, by his account, his greatest prior strength: the ability to crush inside pitching. We can’t know how well Bucky will heal from the knee surgery, but if he regains the ability to drive the inside pitch . . . . well, the opposing third basemen and shortstops are going to need body armor, that’s all.
And the Tiges got Fielder, too, for nothing. Guys like Jacobsen are generally ignored because so many of them don’t pan out at all, witness the previous back-and-forth on Ryan Howard on this blog. But when they do, the GM sure looks good, yessir.
I’m a Baseball Prospectus subscriber who also plays fantasy baseball and when the PECOTA numbers came out last year, I read the thought process behind them and thought it sounded good, in theory. There are a lot of projections out there each year. Heading into this season, I wanted to see how PECOTA did compared to others for last season, so I ran some numbers and this is how it turned out:
182 position players analyzed (roughly 20-25 starting players at each position). These are the averages per hitter.
Actual ’04 stats: .358 OBP, .480 SLG, .838 OPS
ZiPS projections: .362 OBP, .468 SLG, .830 OPS
RotoChamps proj.: .348 OBP, .482 SLG, .830 OPS
Forecaster proj.: .352 OBP, .474 SLG, .826 OPS
PECOTA project.: .349 OBP, .458 SLG, .807 OPS
ZiPS are projections from Baseball Think Factory.
RotoChamps has their own website and projections.
Forecaster is Ron Shandler’s book, with accompanying website Baseball HQ.
PECOTA is the Baseball Prospectus system.
As you can see, PECOTA performed the worst, much lower than actual in projecting OPS, with slugging percentage being the main culprit. ZiPS (from Baseball Think Factory) was closest to predicting OBP (off by .004), while RotoChamps was closest in SLG (off by .002).
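For anyone who wants to check the arithmetic, here is a quick sketch that recomputes the gaps from the group averages listed above. The figures are the ones in this comment; the variable and function names are just illustrative. It measures only how far each system’s average sat from the actual average, not how well any individual hitter was projected.

```python
# Group averages per hitter, copied from the table above.
actual = {"OBP": 0.358, "SLG": 0.480}
projections = {
    "ZiPS": {"OBP": 0.362, "SLG": 0.468},
    "RotoChamps": {"OBP": 0.348, "SLG": 0.482},
    "Forecaster": {"OBP": 0.352, "SLG": 0.474},
    "PECOTA": {"OBP": 0.349, "SLG": 0.458},
}


def bias(system):
    """Projected minus actual for each stat, rounded to three decimals."""
    return {
        stat: round(projections[system][stat] - actual[stat], 3)
        for stat in actual
    }


biases = {name: bias(name) for name in projections}
for name, b in biases.items():
    print(f"{name:<11} OBP {b['OBP']:+.3f}  SLG {b['SLG']:+.3f}")

# Which system's average landed nearest the actual average for each stat.
closest_obp = min(biases, key=lambda n: abs(biases[n]["OBP"]))
closest_slg = min(biases, key=lambda n: abs(biases[n]["SLG"]))
print("closest on OBP:", closest_obp)
print("closest on SLG:", closest_slg)
```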
For pitching, PECOTA did much better.
79 starting pitchers analyzed. These are the averages per pitcher:
Actual ’04 stats: 4.10 ERA, 1.31 WHIP
PECOTA project.: 4.10 ERA, 1.33 WHIP
ZiPS projections: 3.95 ERA, 1.25 WHIP
RotoChamps proj.: 3.91 ERA, 1.28 WHIP
Forecaster proj.: 3.88 ERA, 1.29 WHIP
Here, PECOTA was pretty damn good.
Anyway, back to Ibanez, Reed, and Winn (and Ichiro) for 2005. Here are some numbers from ZiPS and RotoChamps for each of them for 2005:
Ibanez (ZiPS): .339 OBP, .456 SLG, .795 OPS
Ibanez (Roto): .347 OBP, .479 SLG, .826 OPS
Ibanez (PECOTA): .324 OBP, .420 SLG, .744 OPS
Reed (ZiPS): .360 OBP, .412 SLG, .772 OPS
Reed (Roto): .385 OBP, .463 SLG, .848 OPS
Reed (PECOTA): .353 OBP, .423 SLG, .776 OPS
Winn (ZiPS): .348 OBP, .424 SLG, .772 OPS
Winn (Roto): .343 OBP, .428 SLG, .771 OPS
Winn (PECOTA): .337 OBP, .420 SLG, .757 OPS
Ichiro (ZiPS): .390 OBP, .436 SLG, .826 OPS
Ichiro (Roto): .377 OBP, .438 SLG, .815 OPS
Ichiro (PECOTA): .355 OBP, .415 SLG, .770 OPS
I love the fact that Reed is projecting so high. PECOTA has pegged Winn to drop hard every year and he hasn’t. The system hates him, and I’m not going to buy into the fact that he’ll fall so far until I see it. I’m not saying he’s as unpredictable as Ichiro, but PECOTA hasn’t seemed to be accurate yet.
Basically, PECOTA says we have some serious trade bait. Things may not pan out with one of the outfielders and we’ll just be stuck with two very solid players and one great one. Perhaps two will fail and the outfield will be average. Not a bad place to be, right now.
I would expect PECOTA to do best with pitchers. The P in PECOTA stands for Pitching.
Nate analysed how PECOTA did against a slew of other forecasters for the 2003 season about a year ago. Here’s his article:
http://baseballprospectus.com/article.php?articleid=2515
Comparing large samples like that isn’t a meaningful method of determining how accurate a system is that predicts on an individual level.
Consider:
Actual: (5, 5)
System 1: (4, 4)
System 2: (1, 9)
System 2 hit the average spot on, but which did a better job of predicting individual values?
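A minimal sketch of that toy example, using mean absolute error as one reasonable (assumed) way to score the individual values:

```python
from statistics import mean

# The toy numbers from the example above.
actual = [5, 5]
systems = {"System 1": [4, 4], "System 2": [1, 9]}


def mean_abs_error(projected, observed):
    """Average size of the individual misses, ignoring direction."""
    return mean(abs(p - o) for p, o in zip(projected, observed))


summary = {}
for name, projected in systems.items():
    summary[name] = {
        "group_average": mean(projected),
        "mean_abs_error": mean_abs_error(projected, actual),
    }
    print(name, summary[name])
```

System 2 matches the group average of 5 exactly but misses each individual value by 4, while System 1 is off on the average yet misses each individual by only 1.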
Does anyone know what Beltre’s PECOTA predictions are? I would also like to know Pinero’s too.
On a side note, I was looking up some righty and lefty splits for Ichiro! and some others, and here’s what I came up with.
2001: vs left .318, vs right .362
2002: vs left .356, vs right .308
2003: vs left .359, vs right .291
2004: vs left .404, vs right .359
At least given these stats it might actually be fun to see the M’s play Randy Johnson…
But anyways, is there a reason for Ichiro!’s sudden turnabout from 2001 to 2002? And is it common for a lefty to hit better against a lefty?
As far as switching about to which type of pitcher you like to hit, Bonds is another one who’s done it:
2001: vs left .312, vs right .334
2002: vs left .384, vs right .363
2003: vs left .363, vs right .331
2004: vs left .307, vs right .395
Anyone?
Eric, you’re right. I thought of that this morning. So, I went back to the data and crunched some more numbers.
This time I looked at how far off the individual projections were and found this:
For OBP, the 4 projection systems were nearly identical:
Forecaster: off by .024 per player
PECOTA: off by .025 per player
RotoChamps: off by .024 per player
ZiPS: off by .025 per player
For SLG, there was a wider disparity:
Forecaster: off by .048 per player
PECOTA: off by .047 per player
RotoChamps: off by .055 per player
ZiPS: off by .048 per player
No system jumps out as being better than the others and PECOTA wasn’t as bad as I thought earlier.
Open threads when there isn’t any news?
The fact that PECOTA underpredicts for Randy Winn and Ichiro’s actuals, i.e. ‘hates them,’ raises the question of why, with a follow on question as well.
I’ll venture an hypothesis (it’s neither more nor less than that, and I’m sure the issue has been studied more extensively at some point), using the conundrum of Randy Winn as the fulcrum, since the issue delves into observations regarding his performance with which I have struggled en blog heretofore. Randy Winn doesn’t _look_ as good as his actual results, either at the plate or in the field, yet as we’ve hashed over here his statistical performance tends to support the conclusion that he gets better results than his eyeball performance indicates. How?? Speed.
In the field, Randy’s skillset includes below average reads off the bat, and below average routes, yet he has the leg speed to make up the ‘gap’ that results and makes more catches, it would appear, than one expects. At the plate, Randy does not demonstrate great patience, and at his worst falls into streaks where he hacks at just about anything: his skillset does not particularly include working the pitcher for a good pitch. However, Randy manifestly has good _batspeed_, and he will have streaks where he has simply terrific batspeed. Furthermore, Randy is a fast man and can beat out leg speed hits, at times even when he’s topping pitchers’ pitches. I suspect that the offensive performance prediction systems read rather accurately his poorer skills (lesser plate discipline and a lack of homerun power) and project for this but miss the fact that Randy will have ‘toolsy’ streaks where his actual performance exceeds his skill base. Or to put it another way, by this hypothesis the prediction systems predict Winn’s overall year pretty well but miss that sweet spot he has every mid-summer where his batspeed and legspeed are especially terrific and he collects a DISPROPORTIONATE share of his offensive totals, a phenomenon I have regularly observed for Winn and raised here before. This is not simple streakiness, although the prediction systems are going to have a harder time with streaky players, another likely hypothesis.
Ichiro presents a similar problem for the prediction systems. He doesn’t walk, and doesn’t take the ball out, but has a statistically abnormal tools base in extreme bat and leg speed, so that he, too, ‘out-tools’ the prediction system. Ichiro has rather better plate discipline than Randy, significantly indicated by the fact that he’ll work deeper in the count, although largely by fouling off pitches rather than taking them. The projection systems may find it hard to see that Ichiro will flick the bat on a borderline pitch and simply outrun an extra twenty singles during his best months altogether, boosting his totals beyond what any comparison would suggest. And that’s right: nobody else could do that, he’s incomparable in that regard. Even somebody like Carl Crawford has to have the terrifically fast HANDS also to get the ball in play those extra times in order for his own exceptional leg speed to become a factor. Baseball performance IS multi-factor, which is why prediction is so difficult.
This overall hypothesis, that tools guys beat the curve, could be tested narrowly and broadly in the course of asking a larger question. First and narrowly, do the prediction systems, PECOTA in particular, generally predict less accurately for speed players, and if so do they tend to underpredict their actual performance? Second and broadly, do the systems generally underpredict for tools guys, and if so do they underpredict more extensively for tools guys who are also lesser skills guys, a la Winn? The latter inquiry points to a broader, and in itself quite interesting question, not simply for predicting future performance but understanding present performance. I look at Randy and do not expect him to succeed as much as he does, but a lot of what I look at are skills: do guys with better tools beat their odds more often, and so surprise me? Both logically, and in the instance of Randy Winn, the answer would appear to be yes. On the other hand, my rather dated reading of stat-based approaches has left me with the impression that it’s the skills guys who are in fact the ones who supposedly _underperform_ against predictions. It would be great to bring this entire issue into clearer resolution. At the very least, doing so would give insight into who might have more to offer, for example a skillsy Ibanez (an accurate representation, I think) or a toolsy Winn.
More specific to PECOTA, if that system is regularly underpredicting slugging percentage, as suggested by the citations in this thread, is that due to underprojecting: a) singles, b) homeruns, or c) aggregate slugging?
To correct one conjecture in the above, my past reading has suggested that it’s the TOOLS guys who underperform projections, whereas by the hypothesis there it may be that the tools guys actually _overperform_ against projections.
% of AB where Winn draws a walk: 8.45%. % of AB where Ibanez draws a walk: 8.24%. Who is skillsy vs. toolsy here?
The other point is that PECOTA does take into account speed in a very direct way, and does try and capture a player’s toolset. I’m not sure you’d be able to show a bias against tools players, because the system would pair them off with other tools players from the past and project performance based on how those historical players did. If there is a bias against toolsy guys, it would be history’s bias, not PECOTA’s.
Bobby:
Can you email me? I have the Marcel forecast heading into 2004. I expect them to do the worst of those, but I’d like to see how close it got.
Thanks, Tom
tangotiger@yahoo.com
Oh, and Marcel has Ichiro at .331/.379/.438
Cameron: .243/.330/.450
Reed: .306/.378/.461 (however, Marcel has a low reliability on him)