U.S.S. Mariner

20 Aug

Projecting Future Performance

Last week, Geoff Baker wrote a series of blog posts that dealt with the issue that has been dominating the blogosphere conversation for most of the past three months - the playing time of Adam Jones, Raul Ibanez, and Jose Vidro, and how it should be distributed. Don’t worry - this post is not about that topic. At least, not explicitly. This post is about a commonly accepted principle that was laid out very well by Baker in that trio of entries. The idea is summed up in this statement:

It’s going to be hard to keep Raul Ibanez out of the lineup now that he’s hit six home runs in nine games. Equally tough to sideline Jose Vidro now that he’s back to being a hits machine. I was all for playing Adam Jones every day when those other guys were struggling back in July. But things have changed. The veterans have stepped up and earned their playing time of late.

In July, Geoff was on board with the belief that Adam Jones would be able to help the Mariners as an everyday player, and the struggling veterans should be ceding playing time to the more talented youngster. He felt the struggles of guys like Vidro and Ibanez warranted a change, and Jones provided a superior option. He doesn’t feel that way anymore. Why? Because Raul Ibanez and Jose Vidro are hitting well recently, and Baker believes in the predictive power of the hot hand.

This isn’t a unique position. Almost everyone believes in the predictive power of the hot hand. The overwhelming majority of people in America base their future expectations - not just in sports, but in life - on their most recent experience. In sports, this is even more prevalent, as we’ve all witnessed players perform at a level far beyond what we expected them to do. Joe DiMaggio’s 56 game hit streak may be one of the most celebrated records in sports. Seattle saw Ken Griffey Jr hit home runs in eight consecutive games. Or, to bring it back to the current reason for this discussion, Raul Ibanez has seven home runs in his last 48 at-bats after hitting six bombs in his first 372 at-bats. He’s on fire. He’s swinging the bat well. Each pitch looks like a beachball. Pick your cliché.

We all know a hot streak when we see one, even if we don’t know why they occur. There’s a debate about whether hot streaks are random fluctuation of events or an actual change in skills for a temporary period of time. I don’t even begin to know the answer to that question, and I can see the validity of both arguments. But that’s not what this post is about.

No, this post is about the predictive power of the hot streak and how that should affect our expectations. As Geoff laid out in the three linked blog entries above, the common wisdom is that recent success should be a huge factor in determining playing time. Raul Ibanez is on fire (over 48 at-bats) and Adam Jones hasn’t earned his playing time (over 23 at-bats), and those performances were enough to change Geoff’s mind about who should be taking the field for the rest of the year. Getting away from that specific discussion, the issue I want to address is how much credence we should give recent performance in developing our expectations for how a player should perform going forward, even in the very near future.

And, you know me, I’m not a big fan of developing opinions on anecdotal evidence. I know there are random examples that we can cite to support any cause we want, but I don’t particularly care about that kind of analysis. I want to know what a large swath of history tells us about the predictive power of recent performance. Do hot hitters actually perform better, even for short periods of time, once we’ve identified that they’re hot hitters?

Keep in mind - this is a statistical argument. This isn’t one of these cases where all the people who think I’m an idiot who needs to care less about the numbers can tell me to get my head out of a spreadsheet and go watch a game, because the hot streak supporters are making an argument based on numbers. All I’m doing is testing the hypothesis of whether the numbers they’re choosing to believe in actually have any meaning.

Okay, so now that the overly long introduction is out of the way, let’s look at the evidence. The best research done on this issue that I’ve ever read comes from The Book: Playing the Percentages in Baseball, written by Tom Tango, Mitchel Lichtman, and Andy Dolphin. For people who care at all about baseball statistics, The Book is a must read. These guys are among the very best researchers on baseball issues alive, and The Book is a comprehensive review of almost any question relating to statistics you’d want to see asked. While it’s not the easiest reading you’ll ever have, it still comes highly recommended.

In the second chapter of The Book, the guys tackled the very question this post deals with - do hot streaks present any kind of real information that is useful in understanding how a hitter is likely to do going forward? To test this, they pulled in every play from the 2000 to 2003 seasons and identified hot and cold streaks as the upper and lower 5% of all performances over any five game sample that included at least 20 plate appearances. The best 5% of performances went into a hot bucket and the worst 5% went into a cold bucket. That gave them 543 unique players creating a total of 6,408 “hot streaks”, and 633 players creating a total of 6,489 cold streaks. With nearly 13,000 streaks in the sample, they eliminated nearly any bias complaint you could happen to have with the study, and created a sample large enough to give us a conclusive answer - do the players who have been identified as “hot hitters” perform better than expected based on their historical averages, and vice versa, do the slumping hitters perform worse than expected in their next few games?
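For the spreadsheet-inclined, here is a rough sketch of how that kind of bucketing could be done. To be clear, this is not the authors' code. The data layout (one player's games as pairs of plate appearances and a wOBA numerator) and the function names `window_wobas` and `hot_and_cold` are assumptions made up for illustration; only the five game window, the 20 plate appearance minimum, and the 5% cutoffs come from the description above.

```python
from typing import List, Tuple

# Hypothetical layout: each game is (plate_appearances, woba_numerator).
Game = Tuple[int, float]
Window = Tuple[int, float]  # (index of first game in the window, wOBA over the window)

def window_wobas(games: List[Game], span: int = 5, min_pa: int = 20) -> List[Window]:
    """Compute wOBA for every run of `span` consecutive games with at least `min_pa` PA."""
    results = []
    for start in range(len(games) - span + 1):
        window = games[start:start + span]
        pa = sum(g[0] for g in window)
        if pa >= min_pa:
            results.append((start, sum(g[1] for g in window) / pa))
    return results

def hot_and_cold(windows: List[Window], share: float = 0.05):
    """Return (hot, cold): the top and bottom `share` of windows ranked by wOBA."""
    ranked = sorted(windows, key=lambda w: w[1])
    k = max(1, int(len(ranked) * share))  # always keep at least one window per bucket
    return ranked[-k:], ranked[:k]

# Tiny made-up example: seven games for a single hitter.
games = [(4, 1.2), (4, 1.6), (5, 2.0), (4, 1.0), (4, 4.0), (4, 0.4), (3, 1.0)]
windows = window_wobas(games)
hot, cold = hot_and_cold(windows)
print(len(windows), hot[0][0], cold[0][0])
```

In a real study the windows from every player would be pooled before ranking, so that "hot" means hot relative to the whole league rather than relative to one hitter's own season.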

Without getting too deep into the statistical minutiae (for that, you should buy The Book), here are the numbers (from page 56, for those of you who already own it) - for offensive performance, they use a metric called Weighted On Base Average, or wOBA for short, which essentially sums up total offensive performance and scales it to look like on base percentage. Think of it like OPS, just better, and on a different scale. .340 is average, .400 is great, .300 is bad. Just like OBP - but as a total sum of offensive production.

Average wOBA of hot hitters during streak: .587
Expected wOBA of hot hitters in 1 game after the streak: .365
Average wOBA of hot hitters in 1 game after the streak: .369
Expected wOBA of hot hitters in 5 games after the streak: .365
Average wOBA of hot hitters in 5 games after the streak: .369

As you can see, in their sixth game, the one right after being identified as hot (and hot doesn’t even begin to describe a .587 wOBA - that’s scorching), the players performed .004 better than expected if we had just used a three year average of their past performance and had no knowledge of what they’d done in their previous five games. Statistically significant? Yes, but by the thinnest of margins.
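To put that .004 in perspective, here is the arithmetic as a quick sketch. The three wOBA figures are the ones quoted above; the variable names and the "share kept" framing are mine, not The Book's.

```python
streak_woba = 0.587     # average wOBA during the hot streak
expected_woba = 0.365   # expectation from prior performance, ignoring the streak
next_game_woba = 0.369  # actual wOBA in the game after the streak

# How far above expectation the hitters were while hot.
streak_excess = streak_woba - expected_woba
# How much of that excess showed up in the next game.
carryover = next_game_woba - expected_woba
# Fraction of the hot streak's excess that persisted.
share_kept = carryover / streak_excess

print(f"streak excess: {streak_excess:.3f}")
print(f"carryover:     {carryover:.3f}")
print(f"share kept:    {share_kept:.1%}")
```

Under that framing, the hitters were .222 above expectation during the streak and kept about 1.8% of that edge the next day.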

Since I’m wary of overstepping fair use and giving away too much copyrighted material, rather than spelling out the actual numbers of the cold hitters, I’ll tell you that the result is basically the same on the opposite end - the players performed worse than expected by an ever so tiny margin immediately after a five game super slump. They also re-ran the data over a seven game sample and looked at the performance in the following three games after being identified as hot or cold and found the numbers consistent with the five game samples.

But, I know, there will be some protests about how not all hot streaks are the same, and averaging 543 players together will be unfair to those who were really, truly hot. Thankfully, the guys included a list of the 10 hottest hitters over a seven game stretch. Marcus Giles had the most successful run, going 18 for 25 with 7 extra base hits from July 25th through July 29th of 2003, good for a .720/.731/1.160 line. 18 for 25! His next 5 games? 0 for 4, 2 for 4, 0 for 4, 2 for 3, and 0 for 4, a grand total of 4 for 19 and a .211/.348/.368 line.

Giles was not alone. Of the ten hottest hitters from 2003, nine of them then proceeded to hit worse than expected (again, based on historical averages and ignoring the recent hot streak) in their next three games, with only Magglio Ordonez bucking the trend and continuing to hit well. From July 20th through July 24th, Ordonez went 13 for 19 with seven extra base hits, then went 12 for 20 with five more extra base hits in his next five games. That gave him a 25 for 39 stretch where he ran a 1.850 OPS over 46 plate appearances, one of the best runs in recent baseball history. From July 31st through August 3rd, Ordonez followed this 10 game hot streak with an 0 for 14 series of hitless games, and in the 47 plate appearances (spanning 11 games) after we could identify him as one of the hottest hitters in recent memory, Ordonez hit .244/.340/.366.

The first sentence of the conclusion of the chapter, quoted from The Book:

Knowing that a hitter has been in or is in the midst of a hot or cold streak has little predictive value.

Historical evidence suggests that knowing that a player is on fire should do essentially nothing for our expectations of what he’ll do going forward, even in the very near future. In fact, given the choice of being totally ignorant of recent performance or knowing exactly how each player performed in a small sample, you would, in almost every case, be better off being totally ignorant. The natural tendency to overstate the value of the predictive power of the hot streak (or cold streak) outweighs the sliver of actual useful information that is included in hot streak analysis. Because of our own biases, we’d make more correct decisions if we had less data.

Of course, the ideal isn’t to have less data, but to understand our biases and compensate accordingly, allowing us to live in a data-filled world and still make optimal decisions as often as possible. That’s part of what we’re trying to do here, and what statistical analysis does a good job of explaining - identifying where human error leads us to draw conclusions that are unsupported by the realities of life.

Going back to the Mariner-centric discussion that started this all, we have the Raul Ibanez/Adam Jones situation. If you, like Geoff Baker did, believed at the end of July that Adam Jones was a better player than Raul Ibanez and should be taking the field every day, then nothing that has happened on the field since then should change your opinion. Raul Ibanez isn’t any more likely to hit well tonight than he was three weeks ago. His expected performance should be, for all intents and purposes, exactly the same. Whatever you thought of him on July 31st, you should also think of him now.

History paints a clear picture. Again, quoting from The Book (page 45):

One of the running themes of this book is that, very frequently, fans and analysts make too much from too little.

This is an important bias to keep in mind when performing any kind of analytical exercise. Our natural emotional reactions lead us to overvalue what has happened recently, and too often, we draw incorrect conclusions about what is going to happen based on things that have little or no real predictive value.

I actually have a lot more to write on the subject of correct player evaluations and projections (including talking about longer hot streaks, such as Jose Vidro’s, and how to evaluate a real change in performance), but for time and space reasons, I’m going to have to make that a post for another day.

Before I go, I’m going to make a request - please don’t turn the comments into another chance to rehash the same old argument we’ve been having for the last three months in the comment threads. If you feel that Ibanez should be starting due to clubhouse chemistry, veteran experience, or if you never felt that Jones was better than Ibanez, that’s fine - that’s also not what this post is about. The topic is about the predictive power of hot and cold streaks. I’ll be a much happier author if that’s what we talk about in the comments.

277 Responses to “Projecting Future Performance”

  1. 1
    JeffnBham said:

    Thanks Dave.

    Based upon your recent hot streak of well-researched columns I predict your next one will be a grand-slam as well.

  2. 2
    Username said:

    Thanks Dave for another great post; your content is some of the best and most original I come across on the web (sports-related or not).

    I look forward to more information of when a possible real change in performance has/is occurring.

  3. 3
    Jeff Nye said:

    Awesome post, Dave.

    It always amazes me that people give so much credence to “hot” and “cold” streaks; at the base, it all boils down to understanding the importance of sample size, and that seems to me to be a fairly easy concept to “get”.

    I’m looking forward to the post about how to tell the difference between a “hot streak” and a true change in expected future performance; I think a lot of the confusion comes from people not being able to differentiate between the two.

    My own personal thought is that you need a pretty substantial sample size before you can start saying that someone has “turned it around”, on the order of a half season if not more. Two weeks doesn’t tell you anything.

  4. 4
    dahut said:

    One question that comes to mind is how the authors defined the extent of a hot/cold streak. A simplistic rebuttal is to say that by defining the end of a hot streak, of course the next series of at bats will regress to the mean. It’s most likely that the authors looked at the question in more detail than you went into.

    But I think the problem is that conventional wisdom is so strong in this case. It would take a supremely confident manager to buck this prevailing psychology.

  5. 5
    mmccall said:

    Dave,

    What if Raul’s current performance can be attributed to recovery from injury or the correction of bad habits developed as a result of compensation for injury? Do you think a distinction can be made between what is seemingly a random spike in performance and increased performance having an underlying cause?

  6. 6
    AomoriMariner said:

    Thanks for a great post, as always.

    Don’t both fans and managers seem to already understand this at some level when it comes to cold streaks? If an All Star player goes into a slump during the season, we acknowledge it as a slump and don’t expect that the player will continue to struggle for the rest of the year.

    Something, then, about success makes us less willing to accept a streak for what it is. Of course, during a hot streak Player A might indeed be able to carry a team for a few games and his bat needs to be in the lineup. Once the streak is over, how does a manager make the fans forget recent events and play the numbers? Or is that just part of why the manager collects his substantial paycheck to put the best possible team on the field each day?

  7. 7
    zzyzx said:

    “Whatever you thought of him on July 31st, you should also think of him now.”

    I don’t quite buy that. I couldn’t find stats for July 31, but I did a rough calculation (multiplying his monthly pre-August splits by at bats, adding them, and dividing that by all pre-August at bats) to get his OPS on 7/31 - .694. His current OPS for 2007 is .787.

    So IMO we should judge Raul differently on 8/20 than on 7/31 because we have more information about the 2007 season. Sure, there’s the risk of overreacting to his 1.385 August OPS, but how much of the call for Raul to leave was an equal overreaction to his .503 July OPS?

  8. 8
    marc w said:

    A very nice post, and anything referencing The Book is cool with me.
    But I’m not sure that this “This isn’t one of these cases where all the people who think I’m an idiot who needs to care less about the numbers can tell me to get my head out of a spreadsheet and go watch a game, because the hot streak supporters are making an argument based on numbers” is true.

    Some people may make a numbers-based, ‘extend the trend line’ argument based on the numbers. But others, I think, are making a management argument; that players would balk at seeing guys who’ve ‘carried the team’ benched in favor of a rookie. We could trot out all the stats we wanted, but many players might see that move as capricious. So - we’ve now left the realm of stats behind, and there’s not a whole lot of meaningful argument to be had about something like that. But I think it’s a big part of why someone like Baker is saying what he’s saying. Could be wrong…

    Second, something about this: “Whatever you thought of him on July 31st, you should also think of him now” seems odd to me. Is this… results-based analysis?

  9. 9
    Mike Honcho said:

    zzyzx - You are missing the point. Each of the hot streaks identified by Tango, et al., resulted in a higher OPS for each particular player. What their research showed is that it should not be expected that a player’s production will be higher from the hot streak on out than it was before the streak.

  10. 10
    Ben Ramm said:

    Since I became aware of these types of studies over a decade ago, the interesting topic has not been reconfirming them. Instead, I find it fascinating why people resist this data so vigorously. Do you really think that Geoff Baker would actually care about this kind of evidence? He must have encountered it at some point. The initial Tversky study of the “hot hand” in basketball is almost 25 years old. Baker is far more interesting than Finnigan, but he still considers certain pieces of evidence relevant without regard to whether any evidence suggests that the evidence is relevant. That is, going 7 for 10 is evidence (of something), but little evidence supports that it is relevant to predicting what will happen on the 11th at bat. So, if you’re going to respond to Baker, or even if you’re going to make a general point, what about explaining why people refuse to accept information like the information you’ve presented here?

  11. 11
    scraps said:

    There are also a bunch of people who claim their argument isn’t just based on the recent numbers, and say that you can see how much better the guy on the hit streak is swinging the bat, aren’t you watching the actual games, etc.

  12. 12
    Manzanillos Cup said:

    I think a lot of us here have experienced being “in the zone” on a baseball field in high school or college. It seems like a very real experience – so I can see why most people have a tough time dismissing it as random - and I would have had a big problem being benched when I thought I was in the middle of one. However, the key is that even if the “zone” is real, it doesn’t appear to be predictable or permanent.

    As a player, I remember that real upgrades in performance seemed to present themselves much differently than a hot streak. I remember lots of times when I would get worse (as I tried some new technique) before I got better…

  13. 13
    Dave said:

    My own personal thought is that you need a pretty substantial sample size before you can start saying that someone has “turned it around”, on the order of a half season if not more. Two weeks doesn’t tell you anything.

    One of the things that I’ve been gravitating towards for a few years, and am now firmly in the boat of, is evaluating changes in skills rather than results. I believe that any sustainable deviation in results will be the byproduct of a change in skills, and through a better understanding of what statistics to look at as well as quality scouting information, I think we can identify skills changes that will allow us to see what players have actually improved, rather than which players are just riding a nice wave.

    What if Raul’s current performance can be attributed to recovery from injury or the correction of bad habits developed as a result of compensation for injury? Do you think a distinction can be made between what is seemingly a random spike in performance and increased performance having an underlying cause?

    Sure - the injury factor is one of the main legitimate causes for deviation in performance. However, I think we need to be careful in just randomly assigning the injury excuse to any player who suddenly sees a change in performance. What changed with Raul Ibanez on August 3rd that wasn’t true on August 2nd, for instance? Or how is his claim that he’s finally healthy this time any different from the one he made after hitting two home runs in Cleveland back in June?

    We’ve consistently cited Adrian Beltre’s near-death experience after having a botched surgery to have his appendix removed in the winter of 2001 as a factor in his poor performance in 2002 and 2003. I think, in that case, there’s a legitimate medical reason to point to, and a definitive date of when the change in his status took place.

    With Ibanez, or players like him, all we have is a randomly selected endpoint of when his performance changed. We’re retroactively deciding when he “got healthy” based on when he started hitting well. That’s backwards, and it’s not something we should be in the business of doing.

    So IMO we should judge Raul differently on 8/20 than on 7/31 because we have more information about the 2007 season. Sure, there’s the risk of overreacting to his 1.385 August OPS, but how much of the call for Raul to leave was an equal overreaction to his .503 July OPS?

    We’ll cover this more in the projection post, but basically, if you’re allowing 48 at-bats to significantly alter your projection, you’re overvaluing current year data in lieu of prior year data.

  14. 14
    HamNasty said:

    As usual, great post Dave. I find myself in the middle of both arguments swaying back and forth. I look back to my days playing sports and how it related.

    I am looking forward to that post about when a hot streak turns into a performance change and evaluating it very much!

  15. 15
    Dave said:

    But others, I think, are making a management argument; that players would balk at seeing guys who’ve ‘carried the team’ benched in favor of a rookie.

    I should probably go back and put an addendum in the post, because this is something I meant to cover before I ran out of time - this post really has nothing to do with whether Ibanez or Jones should be playing. That will be covered in the projection post. I understand the inherent problems with benching the hot hand and how that could be a political nightmare, and I think that should, to some extent, be a factor in the decision making process.

    But, for now, all I’m taking issue with is the idea that the recent surge in performance should drastically alter our expectations of what we’re going to see going forward. I’m not dealing with the playing time issue specifically here.

    Second, something about this: “Whatever you thought of him on July 31st, you should also think of him now” seems odd to me. Is this… results-based analysis?

    Well, the assumption in the statement was that you shouldn’t be evaluating Raul only based on his 2007 data through July, either. Any projection worth its salt will factor in multiple years of data.

  16. 16
    Uncle Ted said:

    What would you say if the argument for Raul went something more like this: “It’s not that we believe in the hot streak, rather, we think that his recent explosion is evidence that his apparent demise was merely apparent and now we believe that he will be the .280/.350/.470 (pecota) hitter we thought he’d be. That combined with defense may or may not be better than Jones, but we think Raul’s projection is a safer bet than Adam’s. Moreover, half a season is too small of a sample to count the guy out.”

  17. 17
    Seth said:

    Isn’t this sort of like the stock market? I mean, you understand that stocks (like batting averages) fluctuate over time, so you leave your money in the whole time so you don’t miss the hot streaks.

    You can’t predict from day-to-day whether a stock will go up or down.

    Seems to me that (for better or worse) the M’s are treating Raul Ibanez and Richie Sexson like blue-chip stocks, that they expect will–over time–perform well. And in the case of Ibanez, at least, it’s paid off.

  18. 18
    Dave said:

    However, the key is that even if the “zone” is real, it doesn’t appear to be predictable or permanent.

    That’s a much better summary statement than the one I put in the post.

  19. 19
    Dan W said:

    So, perhaps a more reasonable projection for Raul is something akin to his most recent 3 year period, during which his OPS was approximately .816 (sorry I do not know how to quickly find a more precise 3 year split, but this is in the ballpark) as opposed to his ridiculous numbers over the last couple of weeks.

  20. 20
    zzyzx said:

    7 - You’re kind of missing my point though in that it’s unfair to throw out the hot streaks but not do the same argument for the cold ones. On July 31, I think we were judging Raul more harshly than we “should” have because his July was so bad.

    11 - “We’ll cover this more in the projection post, but basically, if you’re allowing 48 at-bats to significantly alter your projection, you’re overvaluing current year data in lieu of prior year data.”

    Looking at prior year data would make the .787 OPS more realistic than the .694, as that’s a lot closer to his past few years. IMO a lot of the Adam Jones buzz came out of fear that Ibanez had fallen off of the aging cliff, and that was an overreaction to July.

    I’d rather have Jones out there because he’s probably at least Ibanez’s equal as a hitter and a far superior defender, but that is a different argument than when it looked like Raul had also lost it as a hitter.

  21. 21
    51isMYsavior said:

    I think many are missing the point. Correct me if I’m wrong, but I think the entire goal of this post is to illustrate the idea that just because someone is 7-10 doesn’t mean they are more or less likely to hit or get out on the 11th at bat (thanks #8).

    Indeed, the law of averages does NOT change because someone has hit safely in 7 of his last 10 at bats.

    If you’re flipping a coin, you have a 50% chance to flip heads and tails. No matter if you’ve flipped 10 heads in a row, the 50% still stands.

    Ibanez is hitting .287 lifetime. That’s 1133 hits in 4003 at bats. Analyses MUST use this sample size in determining the likelihood of a hit, not the last 48 at bats.

  22. 22
    Spanky said:

    Wow!! You have rocked the foundations of my baseball soul! I always knew that a “hot streak” ends at some point and that people have a tendency to over-project the hot streak beyond the time that statistical data shows the hot streak is over. But this is amazing.

    Question: Would you conclude from your research that 5 games is a typical length of a hot streak? What would be considered the mean of a hot streak?

  23. 23
    Dave said:

    Ibanez is hitting .287 lifetime. That’s 1133 hits in 4003 at bats. Analyses MUST use this sample size in determining the likelihood of a hit, not the last 48 at bats.

    Actually, most data beyond 3+ years has been shown to have no real value either. It doesn’t really matter what Raul Ibanez did in 1998. Adding in career numbers is often just as misleading as adding in recent small samples.

    I really need to finish the projection post, eh? A lot of these subjects will be covered in that.

  24. 24
    HamNasty said:

    Dave when you said “Because of our own biases, we’d make more correct decisions if we had less data.”

    Is that falling into the same theory as the NCAA tournament pool, where the secretary who never watched a game in her life wins the pot?

    In coaching it is somewhat true; if you have enough of an idea about your personnel but leave the biases at the door, then I think you are right on.

  25. 25
    Dave said:

    Seems to me that (for better or worse) the M’s are treating Raul Ibanez and Richie Sexson like blue-chip stocks, that they expect will–over time–perform well. And in the case of Ibanez, at least, it’s paid off.

    Right - the M’s are strong believers in track records, and fervently believe that players will perform just as they did the year before until proven otherwise. The problem, however, is that because they don’t understand player aging curves, this leaves them in the position of being first hand witnesses to the collapse of guys who are really just done as major league players.

    The assertion that Ibanez was just the next in the long line of guys who fell off a cliff was premature, certainly. We were overvaluing his 2007 performance in calling his career over. Mea culpa.

  26. 26
    zzyzx said:

    Dave - I’m definitely interested in reading that. To try to paraphrase where I’m coming from, part of my reaction to Ibanez is a note in the back of my head saying, “Raul is 35, players at that age can suddenly lose their skills.” A 100 point drop in OPS might just be random variations with small sample sizes or it might be the beginning of the end.

    If you have insight into how to tell the difference between the former and the latter, I’d definitely be interested in reading it. Hopefully, I’d be able to follow it too.

  27. 27
    rrose said:

    8

    I am reminded of the 19th Century philosopher Herbert Spencer, of whom it was said that the definition of tragedy was “a beautiful theory killed by an ugly fact”. Alas, there is no shortage of people who share Spencer’s notion of the tragic, in any realm of knowledge or human interest.

  28. 28
    Safeco Hobo said:

    From seeing the games, I doubt anyone would argue Raul has been hitting the ball better during the recent ‘hot’ streak. However, I would be curious to know statistically (if at all possible) how much other factors that the batter cannot control contribute to their hot or cold hitting streaks.

    Some of Raul’s big hits came in weather conditions that helped the ball carry, the ballparks he was hitting balls out of were very forgiving (e.g., Chicago), and some of the pitchers were giving up meatballs and Raul just capitalized.

    The question is, when do we actually look at the numbers from the past games and realize it might not just be Raul seeing beach balls but other contributing factors that helped. So management might realize an upcoming game in San Diego against Randy Johnson may be a good time to rest Raul, or at least not bat him 4th, just because he is on a 15 for 20 hot streak against the White Sox and Tampa. (I know Randy doesn’t play in San Diego, just thinking of a tough lefty in a tough hitting park)

  29. 29
    Mere Tantalisers said:

    I think it’s somewhat unfair to use the top and bottom 5% of performances to show that the production level is unsustainable. Over a short stretch like five games the outliers will certainly be farther out, and certainly the level of production would be unsustainable.

    I think a more insightful question to ask might be ‘how likely is another hot streak after the one initially identified?’ or how soon to follow? I don’t disagree with what you’re saying, Dave, not at all. I just wonder if the findings would be the same if the questions asked were slightly different.

    A batter’s line over a season is not built linearly, as we all know. It is a compound of hot and cold streaks (though perhaps more for some than for others) and in this case the argument could be made that Ibanez’s recent run is a regression to the performance expected of him based on his last three years. In that case it is not so much a hot streak as a pendulum swing…

  30. 30
    Uncle Ted said:

    Dave, can you say something about expected reliability of projections in that post? It seems to me that if you have two hitters one of whom has a slightly higher median projection but has a much wider range you might at this point in the season go with the safer bet. Of course this all depends on whether you need to make up ground and how much, and if you need to make up lots of ground then you’d increase your post season expectations by going with the riskier option. I know nothing about whether Raul or Adam fits either of these descriptions, nor do I understand reliability rankings like Pecota’s, but I’d like to.

  31. 31
    Mike Honcho said:

    Question: Would you conclude from your research that 5 games is a typical length of a hot streak? What would be considered the mean of a hot streak?

    Good question - I was wondering this too. Does the analysis change if the hot streak is a more prolonged “warm streak” - which is probably what we are seeing from Vidro?

  32. 32
    rrose said:

    ooops, #27 was a response to something in #10. Sorry for the error.

  33. 33
    JLC said:

    I agree that “being in the zone” is a real phenomenon, and that going into and coming out of it are not predictable.

    The concept is a little more complex in baseball, where hitting .500 means failing half the time. Given that, I can understand the hitter’s and the viewer’s tendency to assume the hot streak is lasting longer than it actually does.

    The other human tendency is to see patterns, even when there are none, and that screws us up all the time.

  34. 34
    Aaron said:

    Re: recent performance as a predictor of the future.

    Just for the sake of discussion, can’t you turn that same argument around on yourself to some extent, though?

    Say the team (or Baker) believes that Vidro and Ibanez have been valuable major-league players for years. Whatever skills they have displayed are the skills valued by team management (they did acquire both players on purpose, after all).

    And for the first 3 months of the season, they both hit a cold stretch, rather than displaying actual declining skills countered by a recent hot streak. If that’s the case (though I doubt it, and I’m sure you guys do too), then a turnaround isn’t just a hot streak, it’s a return to an actual display of skill.

    So if the “cold stretch” was a reason to reduce their playing time (Baker was aboard that bandwagon, and the callup of Jones indicates that management was as well), then why DOESN’T a hot stretch negate that?

    In other words, if USSM was on board with reducing ABs for the cold guys, isn’t that the flip side of the same argument for maintaining ABs for hot guys?

    Certainly there’s an argument for playing Jones just based on defensive aptitude, so I’m not saying he should sit, but getting on the team’s back for playing the hot hand when you guys were driving the wagon to get the cold guys run out of town (figuratively, of course) just screams to me, “I’m missing something here!”

  35. 35
    pygmalion said:

    I’m glad to see the data on this. I’ve always felt it to be true, by analogy with cold streaks. As a poster pointed out, everyone accepts cold streaks by an all-star as anomalous. So hot streaks must be the same, one would reason. Glad to have the evidence of it.

    I think that you should move the point that we ought always to have been depending on three year data to the post. It makes a significant difference to how we should understand your statement that we shouldn’t judge Ibanez differently today than on July 31st.

  36. 36
    OscarM said:

    Isn’t part of the problem that streaks, hot or cold, are by definition statistical anomalies? Anomalies are not predictable or sustainable. This gets compounded by someone like McLaren because when a streak suddenly matches his expectations he overvalues it.

    Little Mac wants his veterans to succeed because he likes them. Geoff Baker also likes them and wants them to succeed. Mariner fans in general want them to succeed. It becomes a perfect storm of self-gratification.

    You can produce these statistical arguments all you like. I would be shocked if Ibanez, in particular, gets any significant downtime now even when the “streak” ends, if it hasn’t already.

  37. 37
    brian_sun said:

    There are 41 games left this season. That’s roughly 140 AB that a full time starter will have. I don’t think what Raul did in his last 48 AB or what Vidro did in his last 115 AB warrants riding them out for the entire 41 games. Richie Sexson is a different story, since the guy hasn’t had ANY hot streaks in the first 121 games. The entire reason he’s still playing is because of his 14M salary this year and next year. These last 41 games basically make or break your season. You can’t afford to wait for these veteran players to come out of their slumps when they get into one. I think you divide the 41 games into 4 10 game intervals. You can’t allow a guy to be cold for more than 10 games. Raul and Vidro have performed well, let’s give them the next 10 games. Richie Sexson has sucked for the first 120 games, stick Ben Broussard in 1B for the next 10 games…

  38. 38
    Dan W said:

    I too look forward to the projection post. “Hot streaks” abound! Besides Vidro (sizzling) and Ibanez (scorching), Betancourt is 377/397/689 (en fuego) in August! With 2 BBs! Stay positive!

  39. 39
    Seth said:

    I guess now that we’ve accepted that Ibanez was just having bad luck…what about Sexson? Is he deviating to the norm, or is he kaput? Or what?

    I mean, it’s not like there aren’t dozens of power hitters who have kept truckin’ after 32–Papi, Chipper Jones, Mags Ordonez, all slugging over .550 this year.

    Then again, Sexson falls into that “bad athletic skills” category like Alvin Davis and Don Mattingly, right?

  40. 40
    Tek Jansen said:

    Dave, or anyone else who might be interested in answering, is there any way to predict future defensive performance? A lot of the clamoring for Jones to be put in LF rested on his defensive value. Do people see Raul’s catch on Saturday and the Jones drops as predictive of future defensive performance?

  41. 41
    Dave said:

    I guess now that we’ve accepted that Ibanez was just having bad luck…what about Sexson? Is he deviating to the norm, or is he kaput? Or what?

    Again, we’ll talk about this in the projection post, but that’s not what we’re accepting.

    We’re accepting that part of his decline was bad luck, but part of it was also an obvious (and totally predictable) age related decline in skills. If you look at the community projections for Ibanez from before the season, we all saw him taking a pretty significant step back from his 2006 season. There’s no reason to expect him to hit at his prior year levels.

  42. 42
    rrose said:

    34

    The cases against Ibanez and Vidro weren’t based on recent drop-offs in performance (“recent” defined here as 2007). The primary issue with Ibanez in particular has been his liability in the field.

  43. 43
    Aaron said:

    To expand on my #34 (because I’m never as clear as I’d like to be):
    “If you, like Geoff Baker did, believed at the end of July that Adam Jones was a better player than Raul Ibanez and should be taking the field everyday, then nothing that has happened on the field since then should change your opinion. Raul Ibanez isn’t any more likely to hit well tonight than he was three weeks ago.”

    But what if we only thought Ibanez was done because of this very same “recent performance” bias? Using the very same analysis you laid out, somebody could make a (weaker, but perhaps valid) case that a couple hundred ABs early this season shouldn’t diminish his 3-year average enough to come to the conclusion we all came to.

  44. 44
    Dave said:

    Dave, or anyone else who might be interested in answering, is there any way to predict future defensive performance? A lot of the clamoring for Jones to be put in LF rested on his defensive value. Do people see Raul’s catch on Saturday and the Jones drops as predictive of future defensive performance?

    It’s all about repeatable skills. Range is a very repeatable skill. Jones’ drops and Ibanez’s catch should do absolutely nothing to change your evaluation of their respective defensive abilities.

  45. 45
    Bernoulli said:

    I wrote up a nice long comment about Kevin Mench’s shoes, but it doesn’t look like it needs to be posted. Great job, Dave, and thanks for writing this.

  46. 46
    Jeff Nye said:

    Personally, I still have the feeling (read: I make no claims of having supporting data) that Ibanez is teetering on the edge of the cliff.

    It’s interesting to me how really invested people are in the whole “hot streak” theory. I’ve seen many claims of “anyone who actually watches the games can obviously see that Player X is hitting the ball harder”, that don’t seem to be supported by any real scouting information.

  47. 47
    rsrobinson said:

    If there was some kind of meter or crystal ball to determine when a hot streak ended I guess you could make an argument to bench Ibanez or Vidro when that point was reached. It still doesn’t make much sense to do it while they’re hitting well.

    USSM conventional wisdom on Vidro has been wrong now for five weeks and counting. Perhaps today is the day it starts being right, but who knows? What I do know, though, is that probably every person in that Mariners clubhouse believes that both Vidro and Ibanez are swinging the bat well and deserve to be in the lineup.

    You can’t take the human element out of baseball. You can’t bench players who are hitting well because of some statistical theory that provides an educated guess about how they MIGHT perform in the future. You do that and you leave players uncertain of what they have to do to earn playing time and can wreck the morale of the team. If players can play their way out of the lineup by poor hitting (unless your name is Richie Sexson) then players have to know that they can play their way into the lineup, too. It’s been that way in baseball for 100+ years and will probably be that way for another 100 years.

  48. 48
    Dave said:

    Way to just totally miss the point of the whole post and ignore the final paragraph to boot.

  49. 49
    BLYKMYK44 said:

    - So as a manager how would this data be used in a practical purpose?? While it may be true that a hot streak is not predictive of the future there has to be some sort of reward for performing well…

    - Also, does that mean there are significantly fewer 10 game “hot streaks”…at what point should a fan consider a hot streak an actual reflection of probable future performance?? I would consider Adrian’s great 48HR year…for awhile it was just a “hot streak” and it just kept going and going. When does that transition actually happen?

  50. 50
    Jeff Nye said:

    To address something that keeps coming up (that Dave touched on but I think warrants repeating):

    The reason that the early-season “cold streaks” by Ibanez and Vidro are given a bit more weight, is that they fit with what we know pretty well about how baseball players’ skills decline with age.

    “Hot streaks”, on the other hand, are inherently a deviation from a player’s normal aging curve, and thus should be viewed with more skepticism than a “cold streak” that is more likely to be the onset of skill decline in an older player than an isolated instance.

    The same thing can be applied to younger players that go on a “hot streak”; a short-term improvement in their skills is more likely to be a harbinger of them increasing their skillset than it is to be an isolated instance.

  51. 51
    Dave said:

    So as a manager how would this data be used in a practical purpose?

    A good manager would never start Raul Ibanez against a left-hander, much less hit him cleanup, simply because he’s “on fire” right now.

  52. 52
    scraps said:

    rsrobinson, the human element has been acknowledged over and over again. It’s been acknowledged right here in this thread, by Dave and others. We agree that it is at least difficult to sit Ibanez or Vidro while they are streaking.

    Do you understand yet what people have been trying to tell you about hot streaks not having predictive value? It doesn’t seem like it, since you continue to say that USS Mariner “conventional wisdom” has been “wrong”. That makes as much sense as saying that someone is wrong if they tell you a guy is a bad hitter and the guy hits a home run. What Dave is writing is analysis. “Conventional wisdom” is in fact what you keep espousing.

  53. 53
    Bernoulli said:

    So as a manager how would this data be used in a practical purpose?? While it may be true that a hot streak is not predictive of the future there has to be some sort of reward for performing well…

    Why? Because they won’t play well unless they’re given a cookie?

  54. 54
    Goob said:

    Great post, Dave. I’ve got a question for ya though. You sometimes hear that players perform better when their job is on the line, kinda similar to the old “it was a contract year” argument used by some to explain Beltre’s 2004 performance. Do you know of any study that compared before and after results of struggling players who had a prospect or new acquisition come in that threatened their playing time?

    You hear it a bunch in football when a veteran QB is brought in to challenge the rookie in an effort to “raise his playing level.” A few buddies of mine say this is why Ibanez and Vidro are suddenly playing better, because they know if they don’t, Jones will take their spot. It’s maddening listening to them! But I guess it’d be interesting to see if there’s any credence to it.

  55. 55
    Manzanillos Cup said:

    USSM conventional wisdom on Vidro has been wrong now for five weeks and counting.

    You mean the consensus here that Vidro is basically a singles hitter who draws a few walks but has no value in any other facet of the game? Hmmm, don’t see how that’s been wrong.

  56. 56
    rsrobinson said:

    Alright, predictive power makes sense when rolling dice because, given enough rolls, you can predict within a very narrow degree of probability how often each number comes up. When it comes to humans, projecting future performance is basically just an educated guess because there are too many unknown variables involved.

    A hitter might actually be performing better, not because he’s hot, but because he’s made mechanical changes in his swing that improve performance. Or he’s recovered from injury and is healthier. Or he’s simply improving over time to more fully reach his potential.

    And is there anything in that study that measures the length of a hitter’s streak? If one guy is hitting well for 20 games, another for 30 games, and another for 40 games then the study does little to accurately predict future performance after either 20 games or 30 games.

  57. 57
    zzyzx said:

    50 - the problem with that though is that it could easily create models where we overestimate young players and underestimate old ones. At some point every old player will eventually lose the skills that let them play in the majors, but we have to be careful that we don’t spend so much time looking for that that we overreact ourselves.

  58. 58
    CCW said:

    STRAW MAN ALERT: “Baker believes in the predictive power of the hot hand.” Did Baker say that? I don’t think it’s true. What Ibanez’s recent hot streak has demonstrated is that he is probably not as cooked as we - you, me, Baker, the USSM community - thought he was.

    Honestly, didn’t you think there was a pretty good chance that Raul had completely fallen off the cliff, never to return? Me, too. He looked *awful* at the plate. And with the distinct possibility that Raul was truly cooked, it made a lot more sense to give Jones his at-bats. But the past 50 at-bats have changed that belief. I now think it’s less likely that Raul will post a .687 OPS going forward than I did previously, and that is a perfectly logical, rational belief. Dave, if you honestly think of Raul right now the same way you were thinking of him 50 ABs ago, I would be surprised. And I’d argue that doesn’t make any sense.

    Anyway, I don’t disagree with your point about hot streaks and cold streaks, but I think, as it relates to the M’s and Baker’s posts, it is a straw man set up to refute an argument that no one is really making.

  59. 59
    Jeff Nye said:

    zzyzzxxzzrrrr (sorry, just poking fun):

    That’s a fair caveat, and I think it’s pretty obvious that none of what we’re talking about here is meant to be 100% exact (well, obvious to everyone but one person), but I’d definitely prefer to err on the side of overvaluing young players than old ones, because in the main you’re more likely to be right when you project a sudden uptick in performance for a younger player than an older one.

    There’s been a lot of good research done on aging curves for baseball players, and everything I’ve seen indicates to me that it’s a lot more likely that someone will “figure it out” at 22 than it is at 38.

  60. 60
    gwangung said:

    A hitter might actually be performing better, not because he’s hot, but because he’s made mechanical changes in his swing that improve performance. Or he’s recovered from injury and is healthier. Or he’s simply improving over time to more fully reach his potential.

    So, how do you tell?

    Alright, predictive power makes sense when rolling dice because, given enough rolls, you can predict within a very narrow degree of probability how often each number comes up. When it comes to humans, projecting future performance is basically just an educated guess because there are too many unknown variables involved.

    Except, you just AGAIN ignored what Dave posted.

    Given the data there, those “unknown variables” DON’T MAKE A DIFFERENCE in future performance. That’s an empirical observation.

  61. 61
    lailaihei said:

    Hey Dave, what predictive stats do you like to use? You say 3 years max, is that a .5/.3/.2 split or what?

  62. 62
    Steve T said:

    Thanks for a great post, Dave. Humans love seeing patterns where none exist; it’s one of our defining characteristics. So is the ability to actually do the research to correct overinterpretation. Sometimes the animals in the clouds are just clouds!

  63. 63
    fetish said:

    [no]

  64. 64
    davepaisley said:

    13 - “if you’re allowing 48 at-bats to significantly alter your projection, you’re overvaluing current year data in lieu of prior year data.”

    So the fact that Ibanez’ last three years (his second career in Seattle) are:

    2004 .825 (month splits from .645 to 1.167)
    2005 .792 (.688 to .851)
    2006 .869 (.739 to 1.109)

    2007 .788 (.503 to 1.333 (Aug incomplete))

    …should show us that he’s simply regressing up to the mean (currently sitting at .788)

    He’s unlikely to maintain 1.333 for August, but should finish over 1.000 at least. So he could have the worst and best months of his second Mariner career back to back.

    Allow a little bit for age related decline and he’s well within the range of expectancy so far this year.

    There is significant evidence that Ibanez has been nursing nagging injuries (probably well known by the coaching staff). The likelihood is that the injuries will recur sooner rather than later, but he could well dodge a bullet and be productive the rest of the year.

    Vidro’s a little trickier, in that his performance boost is entirely due to hitting over .400 for an extended period of time - something that has proven to be impossible for anyone to sustain over the long term. So we *ought* to know that Vidro’s run is entirely luck, especially as it’s all singles.

    Actually, that would be an interesting set of streaks to look for - how long have any hitters been able to sustain a streak of .400+ hitting?
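
    A rough sketch of the arithmetic behind the regression-to-the-mean argument above, using only the season OPS figures listed in this comment. The .5/.3/.2 weights are just the split floated as a question in #61; nothing in the thread confirms that as the actual method, so the weights and the function name `weighted_baseline` are assumptions for illustration.

```python
# Season OPS figures as listed in the comment above.
ops_by_year = {2004: 0.825, 2005: 0.792, 2006: 0.869}
ops_2007_to_date = 0.788


def weighted_baseline(ops, weights=(0.5, 0.3, 0.2)):
    """Weight seasons from most recent to oldest (assumed .5/.3/.2 split)."""
    seasons = sorted(ops, reverse=True)  # most recent season first
    return sum(w * ops[year] for w, year in zip(weights, seasons))


simple_mean = sum(ops_by_year.values()) / len(ops_by_year)
baseline = weighted_baseline(ops_by_year)

print(round(simple_mean, 3))                  # 0.829
print(round(baseline, 3))                     # 0.837
print(round(baseline - ops_2007_to_date, 3))  # 0.049
```

    Under either weighting the three-year baseline sits roughly 40 to 50 points of OPS above the 2007 line, before any allowance for age.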

  65. 65
    zzyzx said:

    59 - don’t get me wrong, I share your bias. However, I don’t know if we know enough to know the difference between a slump and falling off the cliff. Was the probability that Raul was effectively done on 7/31 10% or 50%? If I were in a major league front office, that’s what I would be trying to research, because if you had an advantage in knowing the odds that a player was (or wasn’t) done, that could be a huge advantage.

    Hey, seeing how we root for a team that overvalues vets, it would be to our advantage to get conventional wisdom to undervalue them so we’d get better deals. Hmmmmmmmmmmmmm *rubs hands together*

  66. 66
    whwang said:

    What I read from this post is that hot streaks have very little (if any) predictive power compared to the long-term average of the player’s past record. I totally agree with this.

    However, in the specific case of Ibanez this year, while I don’t think we should take his recent hotness too seriously, we also need to ask which part of his past record we should trust. The 05-06 Ibanez? Or the first-half-season Ibanez in 07? If the pre-July 07 Ibanez performance was caused by injury, can we view his recent hot streak as a sign of improved health and more or less expect him to perform like the pre-07 Ibanez?

  67. 67
    Jeff Nye said:

    I don’t want to get too much into specifics about certain players, because Dave asked that this thread not turn into that and it’s already veering dangerously close. (Dave, if you want to squash this post to prevent further derailing, I don’t mind)

    Suffice it to say that I always was a proponent of keeping Ibanez’s bat in the lineup, just at DH, and moving You Know Who to the bench.

    Most of the current knock on Ibanez is that he isn’t a good defender anymore, and I saw more evidence of this at the game yesterday, watching him have to expend a huge amount of effort to make a couple of plays that would’ve been routine for someone who can actually run. His bat, however, still has value.

    So, bringing it back to the topic at hand, I’m not sure that there was ever a huge argument that Ibanez’s “cold streak” was indicative of him no longer being able to hit the baseball. I think he’s still going to be a reasonably productive major league hitter in the scope of this entire year.

    Most of why many of us wanted to see AJ in left had more to do with the defensive gain than offense, and I’m personally not prepared to discount his track record of excellent outfield defense in the minor leagues based on a couple of bad plays in the majors.

  68. 68
    Chris Miller said:

    I think Raul is in clear decline, but I think part of it is good old Regression to the Mean. Sexson will probably do the same, come back, but with some decline. Sometimes a streak can last an entire year. Adrian Beltre was hot for 04 then regressed, and Mike Lowell was awful in 05, then regressed.

    Keep in mind ALL years and careers are composed of streaks that create an average. Sometimes the streak only lasts 1 PA, but overall somebody’s averages are the result of a series of streaks. Good spells and bad spells weigh each other out. Ichiro loves to alternate between Incredible and Mediocre.

  69. 69
    rsrobinson said:

    After re-reading Dave’s post I’m still confused about whether the study is limited to measuring performance after seven games or if it only began measuring once it was identified that players had “cooled off”. I see problems with both.

    If some players cool off after seven games while others cool off after 15 or 20 or 30 games, then the study does little to predict future performance after, say, ten games.

    And what happens when a hot streak goes well past seven games, as it has with both Vidro and Ibanez. Does that mean that factors other than just being hot are involved? Or is that considered a failure to accurately predict future performance, at least over the short term?

  70. 70
    Chris Miller said:

    I think an entire season is not enough data to make a true talent judgment on, let alone a month or two. More often than not, if someone heats up way above their expected level, even for the whole year, the next year they regress. Sometimes there is a skills based component, and they don’t regress as far, but how many times has a guy just had a career year, then never did it again? Raul 06 comes to mind.

  71. 71
    Chris Miller said:

    Even if the skills have clearly changed (ie, a guy is REALLY getting wood on it, hitting the ball a mile out of nowhere), I’d still be leery until he did it for a year (or more).

  72. 72
    Mike Honcho said:

    I think the question that needs to be asked is: how did Tango and Co. calculate “expected” wOBA?

  73. 73
    Chris Miller said:

    Probably regressed statistics from before the streak, w/o looking.

  74. 74
    Chris Miller said:

    A hunch would be something akin to Marcels. But I’m not them, can’t speak for them directly.

  75. 75
    ghug said:

    It is possible that Ibanez had a long cold streak (possibly due to injury), and then a hot streak, and now he will return to normal (we can hope can’t we). If you want to project performance accurately, in my opinion, you have to take into account age, several seasons of stats, and most importantly skillset. The book shows that, somewhat.

  76. 76
    BLYKMYK44 said:

    Would anybody be able to define when a hot streak becomes actual performance??

  77. 77
    Chris Miller said:

    I’d think regression and scouting are the keys to understanding change in performance.

  78. 78
    smac said:

    Dave,
    One question I have after reading this is, do you buy into hitting players against pitchers they hit well and sitting them against pitchers they have hit poorly? I.e., Raul vs. Santana (I couldn’t believe he was in the line-up for that game, and I couldn’t believe he had such a nice average vs. Santana). Are those face to face numbers just small sample size/hot streak that should be given no credence, or would you argue that those become predictive?

  79. 79
    Jeff Nye said:

    Well, that’s the million dollar question.

    I don’t think there’s any way to draw a line in the sand saying “this is when it stops being a fluke and starts being a change in expected future performance”.

    I think Dave touched on the topic briefly earlier in the thread, though, when he said that you need to evaluate the skills rather than the results, with the implication that looking at just results will never allow you to tell the difference between those two things.

    So basically, the idea is that you ignore the results-based “hot” and “cold” streaks, and use scouting and carefully selected stats that tell you useful things about a player’s skillset to try to identify the difference.

  80. 80
    Chris Miller said:

    #78, the book covers that too, and the answer is there’s no predictive value in batter pitcher matchups either, beyond the expected results based on the batter, pitcher, handedness, and batted ball tendencies (ie, a groundball pitcher against a slow guy) of each.

  81. 81
    DMZ said:

    Generally speaking, face to face numbers have no value.

    That doesn’t mean that you can’t play matchups — Earl Weaver talks about that: if you have a guy who is particularly good at turning on fastballs, you want them up against a reliever who doesn’t have a decent breaking pitch.

  82. 82
    arbeck said:

    It seems like 90% of the questions asked in comments on the site could be answered by just reading The Book. Either everyone needs to buy a copy, or I’m going to have to start carrying my copy around so I can refer to it all the time.

  83. 83
    Dave said:

    When it comes to humans, projecting future performance is basically just an educated guess because there are too many unknown variables involved.

    An educated guess is better than an uneducated guess, no? No one’s saying that our ability to project human performance is perfect. We’re just saying it’s better than anything else in use at the moment.

    Everyone’s projecting performance - John McLaren, Bill Bavasi, Geoff Baker, you - everyone. We’re just getting there in different ways, and the accuracy of our projections will be affected by the process we use to come up with our expectations. I think history shows that using statistics correctly, and understanding which ones matter and which ones don’t, will give greater accuracy than betting on intangible myths.

    If the pre-July 07 Ibanez performance was caused by injury, can we view his recent hot streak as a sign of improved health and more or less expect him to perform like the pre-07 Ibanez?

    Again, can anyone give me a reason why August 2nd Raul Ibanez was too hurt to be effective but August 3rd Raul Ibanez is Super Awesome Power Hitting Raul? The injury idea, which may or may not be true, is predicated around using performance to figure out when he got healthy, then suggesting that health is the reason for the performance. It’s classic circular reasoning.

    After re-reading Dave’s post I’m still confused about whether the study is limited to measuring performance after seven games or if it only began measuring once it was identified that players had “cooled off”. I see problems with both.

    The data identified any five game stretch (with 20+ PA) where a player performed at a high level, regardless of what followed immediately afterwards. I hoped the Magglio Ordonez example made this clear - the players were not selected as guys who hit well and then cooled off.

    That’s why there are 6,000 5+ game samples and 543 players - there are many instances of the same player, some of them overlapping. Magglio Ordonez’s 10 game hot streak would include six different five game hot streaks (1-5, 2-6, 3-7, 4-8, 5-9, 6-10), all of which would be placed into the hot bucket.
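
    A minimal sketch of the overlapping-window counting described above, assuming games are simply numbered in order. The function name `five_game_windows` is made up for illustration and is not taken from the study itself.

```python
def five_game_windows(first_game, last_game, width=5):
    """Return every run of `width` consecutive games inside a streak,
    as inclusive (start, end) pairs."""
    last_start = last_game - width + 1
    return [(start, start + width - 1) for start in range(first_game, last_start + 1)]


windows = five_game_windows(1, 10)
print(windows)       # [(1, 5), (2, 6), (3, 7), (4, 8), (5, 9), (6, 10)]
print(len(windows))  # 6
```

    A streak shorter than five games yields an empty list, and each extra game of streak adds one more overlapping sample to the hot bucket.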

    And what happens when a hot streak goes well past seven games, as it has with both Vidro and Ibanez. Does that mean that factors other than just being hot are involved? Or is that considered a failure to accurately predict future performance, at least over the short term?

    Will you just never allow random variation to enter your mind as a cause for anything?

    I think the question that needs to be asked is: how did Tango and Co. calculate “expected” wOBA?

    They took all the players in the hot bucket, created an average based on their historical three year totals and weighted them by plate appearances. Without just totally appealing to authority, there aren’t too many reasons to believe that any of us here know how to conduct a baseball research study better than those three.
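
    A rough sketch of what a plate-appearance-weighted average looks like, as opposed to a simple average across players. The three (wOBA, PA) pairs below are invented purely for illustration and do not come from the study, and `pa_weighted_woba` is a hypothetical helper name.

```python
def pa_weighted_woba(players):
    """Average wOBA across players, weighting each by plate appearances.

    `players` is a list of (three_year_woba, plate_appearances) pairs.
    """
    total_pa = sum(pa for _, pa in players)
    return sum(woba * pa for woba, pa in players) / total_pa


# Invented example values, NOT real data from the study.
hot_bucket = [(0.380, 1800), (0.340, 1200), (0.360, 600)]

simple = sum(woba for woba, _ in hot_bucket) / len(hot_bucket)
weighted = pa_weighted_woba(hot_bucket)

print(f"{simple:.3f}")    # 0.360
print(f"{weighted:.3f}")  # 0.363
```

    The weighted figure leans toward the players with more plate appearances, so the expected level is not skewed by guys with thin track records.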

  84. 84
    darrylzero said:

    Dave said a couple of times that he thought Raul was probably finished as a power-hitter. I think if we’d asked him how sure of that he was, he probably wouldn’t have felt too sure, but very nervous, particularly because he knew it would take the Mariners a very long time to realize if he were really done. I was mostly in the same boat, though I have much less confidence in my own analysis than he does (for very good reason).

    He’s already said mea culpa about overvaluing 2007 with regard to Ibanez as a hitter, and I think he’s laid that out clearly and sensibly in the comments here (i.e. we should still expect some decline due to age, but we can’t just ignore last year either). But he hasn’t spelled out the skillset issue with regard to how that would play with Ibanez, which I think has been the biggest part of what has been driving his analysis. The statistical consensus about how bad Ibanez is in the field is overwhelming, and he hasn’t hit lefties well in a long time (and not much at all since becoming a legitimately good hitter of RH pitching).

    So, Raul’s skillset is platoon DH. He’s the important half, he’s well-suited as a hitter to his home park (though not as a fielder, obviously), and if he gets back to what we might have expected from him this year before the season began, he’ll be a very very good platoon DH. I hope so. That makes him better than Ben Broussard. But not overwhelmingly better. They’re more similar players than we might want to admit. Certainly the drastically different usage pattern is a little odd, given the similarity.

    Also, fetish, I don’t know if you’re being sarcastic, but it’s worth remembering that Jones played well in center at AAA all year this year. We have a lot more data to look at than just his appearances with the Mariners that suggests that his defensive miscues will probably not continue. I’ll admit to being nervous about them myself, but we should acknowledge that it’s not a very reasonable nervousness.

  85. 85
    Chris Miller said:

    There’s lots of information to help you make those kinds of judgments. If a guy is hitting for serious average out of nowhere, and his LD% is 27%, there’s a good chance it’s a complete fluke since LD% is fairly random, and those kinds of rates are unheard of for extended periods. If HR/F is spiked, you could visit Hittrackeronline.com and view the HR’s; a bunch of “Just Enoughs” would indicate HR’s are barely clearing the fence. If there weren’t that many of them and/or a bunch of “No Doubts”, and that continues for an extended period (not sure the actual regression to apply in that case), then maybe it’s not completely a fluke. If a guy alters his swing to pull it more, so he can clear a short porch at home (Raul 06?), then maybe that can be maintained for some period. I’d be wary of drawing conclusions too quickly, even in the face of visible changes, because some of the things can’t be easily regressed, which is where scouting comes in. Sometimes changes in mechanics can lead to permanent change going forward, but sometimes people revert to old habits, or get exposed to holes and get forced to go back to what it was they were doing.

  86. 86
    carcinogen said:

    Baker has responded:

    http://blog.seattletimes.nwsource.com/mariners/

    He brings the dialog from the abstract to the specific W/R/T Sexson, Ibanez, and Vidro. I look forward to Dave’s reply.

    I would also like to comment on how lucky we are that the Ms discussion in this town has become what it has as compared to years past.

  87. 87
    Dave S. said:

    There’s a certain amount of subjectivity that has to go into this argument - when to determine whether this represents a real change in performance. And there’s also the understanding that, while the hot hand may not directly mean anything, it’s kind of difficult to bench a player when he’s producing.

    Reality is: we’ve got one day off in the next month and a half. Vidro and Ibanez have started to hit. Jones will play, because our players will definitely need rest.

    But I’m going to find it very hard to fault McLaren for rewarding production from his regulars.

    Sticking with Richie Sexson, on the other hand? Stubborn and moronic.

  88. 88
    JMHawkins said:


    My own personal thought is that you need a pretty substantial sample size before you can start saying that someone has “turned it around”, on the order of a half season if not more. Two weeks doesn’t tell you anything.

    One of the things that I’ve been gravitating towards for a few years, and am now firmly in the boat of, is evaluating changes in skills rather than results. I believe that any sustainable deviation in results will be the byproduct of a change in skills, and through a better understanding of what statistics to look at as well as quality scouting information, I think we can identify skills changes that will allow us to see what players have actually improved, rather than which players are just riding a nice wave.

    What if Raul’s current performance can be attributed to recovery from injury or the correction of bad habits developed as a result of compensation for injury? Do you think a distinction can be made between what is seemingly a random spike in performance and increased performance having an underlying cause?

    Sure - the injury factor is one of the main legitimate causes for deviation in performance. However, I think we need to be careful in just randomly assigning the injury excuse to any player who suddenly sees a change in performance. What changed with Raul Ibanez on August 3rd that wasn’t true on August 2nd, for instance? Or how is his claim that he’s finally healthy this time any different from the one he made after hitting two home runs in Cleveland back in June?

    To tie the two together, if Raul’s recent improvement is due to recovering from an injury, there should be some visible improvement in a skill that we can measure. The key is finding the right measurements. Perhaps it’s something like line drive percentage or HR/FB rate, or something that we already have access to. Or perhaps it’s average speed of the ball off the bat, or something of that nature that isn’t readily available. Despite over a century of baseball statistics, we’re still evolving new and better ways to measure results.

    My personal take is that Raul probably is recovering from at least two injuries - shoulder and hamstring - and that he is hitting the ball better due to actual improved skill. Earlier in the year, his problems seemed to be more a lack of power than a lack of ability to see or make contact with the ball. Now, it appears the ball has a little more giddyup when it leaves his bat. Shoulder + hamstring injuries and loss of power. Hmmm, it’s reasonable one might cause the other. Maybe in his earlier “recovery” he reinjured the shoulder and fell back into the hole?

    But notice the words “seemed”, “appears” and “maybe” in the above. I don’t have the right data to say “was”, “is” or “did”. I’m just guessing.

  89. 89
    Mike Honcho said:

    They took all the players in the hot bucket, created an average based on their historical three year totals and weighted them by plate appearances. Without just totally appealing to authority, there aren’t too many reasons to believe that any of us here know how to conduct a baseball research study better than those three.

    Thanks. I was curious as to how far back they went.

    Geoff Baker’s response to this post is up. He says he agrees with Dave, but then completely misses the point when he makes his argument for keeping Sexson in the lineup. And he does the same to a lesser extent for Ibanez and Vidro.

  90. 90
    Chris Miller said:

    #81, good point, it’s kind of what I was getting at (but not specifically), platoon situations.

  91. 91
    darrylzero said:

    Maybe another hypothetical example would get us off the Ibanez tip for a minute, which might allow us to think a little more objectively. If you’d rather not go down this road, feel free to delete or just not respond, but I’d like to see a brief outline of how you would apply this to evaluating the prospects of J.D. Drew hitting this year. Maybe that should wait for the projection post, though. Apologies if so.

  92. 92
    Chris Miller said:

    Raul has been injured, or at least has said he has been, so I suspect that’s part of what was going on.

    Sometimes players who aren’t playing well will just say “this and that” caused the slump though.

  93. 93
    Chris Miller said:

    #91, you DO want to weight the most recent performance, more than say 3 years ago, just not as much as people do. I think JD Drew is not as good as the guy Boston picked up, but not as bad as this season suggests.

  94. 94
    Chris Miller said:

    I think JD Drew is not as good as the guy Boston picked up

    Should read:
    I think JD Drew is not as good as the guy Boston THOUGHT THEY picked up

  95. 95
    carcinogen said:

    89: Dave is using a very common analytical approach. He’s gotten Baker to agree with a proposal in the abstract (the jab), next he’ll hit him with the specific examples (the right cross), and Baker will have to relent.

    Moooohaahahaaahaaaa….ok, so I overstate. But damn I love this stuff!

  96. 96
    Dugan said:

    OscarM Says: … You can produce these statistical arguments all you like. I would be shocked if Ibanez, in particular, gets any significant downtime now even when the “streak” ends, if it hasn’t already.

    Amen - IMHO, Johnny Mac has decided that Ibanez is the better option and Jones is going to ride the pine most games.

  97. 97
    Jeff Nye said:

    Yeah, I wasn’t that impressed with Geoff’s response myself, despite him having the best first name ever (it’s misspelled, though!)

    He seems entirely too willing to dismiss You Know Who’s last two awful years as being entirely due to injury rather than decline, while believing that his post-ASB “hot streak” (which I’d define as more of a lukewarm streak) is predictive of his near-future performance.

    The tone of his post also implied a disdain for AAA statistics, in regards to Adam Jones, which I can’t quite fathom.

  98. 98
    DMZ said:

    As Dave requested — could we perhaps not make this about the M’s, or even the particular players in question, or Jones v Ibanez or whatever?

    The post is about the predictive value of hot streaks. All other questions will be answered in due time.

  99. 99
    Robo Ape said:

    First of all, great post Dave; probably my favorite of the season if only because, as an anthropologist, I’ve read countless papers about this phenomenon and its (typically incorrectly used) role in human decision-making. There are a couple of things I want to bring up, however.

    To begin, the most famous example from the academic canon investigating the phenomenon of human beings erroneously predicting future success based on perceived recent success is Gilovich, Vallone, and Tversky’s “The Hot Hand in Basketball: On the Misperception of Random Sequences.” (Cognitive Psychology, 1985). Fundamentally, the argument in the paper is the same you are making here but with an important distinction; specifically, that the concept of hot streaks in basketball is a fallacy in and of itself. Other than extreme outliers, a basketball “hot-streak” does not truly exist in the way sports fans perceive it to. In basketball, the streaks (either hot or cold) are rarely sustainable for any significant portion of time. As a result, the tendency for basketball teams to give the ball to a player who is “hot” is a negligible or detrimental decision. As JLC says in 33: “The… human tendency is to see patterns, even when there are none, and that screws us up all the time.” This is so, so true.

    That said, I know that numerous cognitive psychologists, economists, and social anthropologists, after reading the Gilovich, Vallone, and Tversky paper, attempted to apply the argument to baseball but ran into a problem. In baseball, unlike basketball, true hot and cold streaks do seem to exist. This creates a massive problem for predictability because of the moving window effect. Since statistically significant hot-streaks do occur in baseball, on a fundamental level at least, when a player is “hot,” fans are often correct in thinking he will continue to do well. They are just as often, of course, incorrect. The trouble comes from where a hot-streak begins and ends. We might have, for instance, decided to bench Raul after his hot streak prior to the most recent home-stand. So far as I can tell, that would have been a mistake. Of course, a hot streak can’t last forever, but the point is that players do get hot, which even this analysis acknowledges.

    I’m not arguing against anything you’ve said, really, but I think this is an important distinction to consider when analyzing the hot hand fallacy in baseball as opposed to other situations.

  100. 100
    panman said:

    Very thoughtful post as always. But your argument boils down to “regression to the mean” which if applied to Ibanez would indicate his performance across the remaining 6 weeks should be enough to offset his statistically aberrant performance for most