Meaningless Numbers That Are Meaningless
Fans spent years pleading for the Mariners to move in the fences at Safeco Field, because they wanted to watch the Mariners be a different kind of bad. There was a pretty sensible argument to do so, though, in that the ballpark before was simply too extreme, so at last the organization got to work last winter. Citing young-hitter psychology, the Mariners made the ballpark more batter-friendly in left, left-center, and center. Today Jesus Montero is in the minors, Dustin Ackley is in the minors, and Justin Smoak has three home runs in the season’s first third, but this sentence is hardly fair. It’s accurate, but it’s misleading and unfair, yet I’m leaving it in as part of an experiment. I won’t get into that part because you’re not supposed to know about it.
The potential effects of the Safeco changes were underwhelmingly studied, privately and publicly. It stood to reason that homers would increase, because the walls would be closer, and it’s just that simple. Less ground would have to be covered by fly balls. But park effects are almost inconceivably complicated, and the changes could also have effects on doubles and triples and singles and maybe even walks and strikeouts and everything else. The only way to know what would happen would be to make the changes and find out. We’re in the initial stages now of finding out. Safeco should play differently, but just how differently are we talking?
Let’s set a baseline and look at Mariners games in Safeco and away from Safeco from 2008 through 2012. I combined numbers for hitters and pitchers, for the Mariners and for their opponents, and below you’ll see the Safeco numbers divided by the road numbers, expressed as a percentage. To get to it:
BA: 94% (Safeco BA was 94% road BA)
OBP: 97%
SLG: 90%
HR%: 81%
HR/BIP: 83%
These are all very simple, straightforward numbers, unadjusted for quality of competition and such. The samples are each more than 30,000 plate appearances. Let’s look now at how the numbers appear so far for the 2013 season. The Mariners have played 26 games at home, and 32 games on the road. There have been more than 2,100 Safeco plate appearances, and nearly 2,600 road plate appearances. Straightforward again and unadjusted again:
BA: 101%
OBP: 99%
SLG: 91%
HR%: 72%
HR/BIP: 73%
The respective batting averages are .247 and .244. The respective on-base percentages are .303 and .307. Essentially identical. But Safeco’s still reduced power — even more so than it used to. Home runs are down, relative to road games, and that carries over to isolated slugging percentage. The Mariners moved in the Safeco fences because they wanted more home runs in Seattle. So far, they’ve observed the opposite effect.
That’s poorly worded. There have been more home runs in Seattle. Mission accomplished! But there have been a lot more home runs away from Seattle, such that the ratios don’t match up. Such outcomes would’ve been unexpected. The Mariners thought this would be a gimme.
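For anyone who wants to check the arithmetic, here is a minimal Python sketch of how the percentages above are built: the Safeco figure divided by the road figure, rounded to a whole percent. The function name park_ratio is made up for illustration, and the only inputs are the 2013 batting averages and on-base percentages quoted above.

```python
def park_ratio(home: float, road: float) -> int:
    """Return the home figure as a whole-number percentage of the road figure."""
    return round(100 * home / road)

# 2013 figures quoted in the post: Safeco first, road second.
ba_pct = park_ratio(0.247, 0.244)
obp_pct = park_ratio(0.303, 0.307)

print(f"BA: {ba_pct}%  OBP: {obp_pct}%")
```

The results line up with the BA and OBP entries in the 2013 list. The home-run lines would work the same way, but the post doesn't give the underlying home-run rates, so they're left out here.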
So what’s going on? Did the Mariners screw even this up? Did the Mariners somehow render Safeco even less dinger-friendly than it was before? Did the Mariners put too little thought into what they were doing, leaving them with a ballpark that isn’t what they wanted? Moving in the fences and still not getting homers — that would be very Mariners of them. It seems like a thing that would happen.
What’s actually going on, of course, is nothing. Park effects, as stated, are complicated, and you’re not going to learn about them based on two months of one season. It’s going to take years before it’s clear what the new Safeco is, because it takes a while for these numbers to sort of stabilize. 26 home games. 26 home games! That’s nothing! And the schedule’s been skewed! You’d have to adjust for opponents, and even then, 26 home games.
And it’s been April and May, in Seattle. The ball doesn’t fly so well in April and May in Seattle. This is a very inadequate look at the data. It doesn’t control for participants, it doesn’t control for the time of the year, and it doesn’t mind the sample sizes. The only real conclusion we can draw is that, so far, there haven’t been many homers in Safeco, compared to on the road. But it won’t continue that way. The home-run split is evidence in and of itself of why the sample size is insufficient. It makes no sense why dinger rate would go down in Seattle after the changes. The fact that it has speaks to the other fact that we can’t prove anything yet, and won’t be able to for some time.
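To put a rough number on "insufficient," here is a hedged Python sketch of how much a home/road home-run ratio can wobble by chance alone. It assumes the true home-run rate is identical at home and on the road, and it assumes that rate is 2.5% of plate appearances, which is a made-up round figure and not something from the data above. The plate-appearance counts are the post's "more than 2,100" and "nearly 2,600," rounded. The function name ratio_noise_band is hypothetical, and the math is a standard delta-method approximation for the ratio of two proportions, not anything more sophisticated.

```python
import math

def ratio_noise_band(p: float, home_pa: int, road_pa: int, z: float = 1.96):
    """Approximate 95% band for an observed home/road rate ratio when the
    true rate p is the same in both settings (delta method on the log ratio
    of two binomial proportions)."""
    se_log = math.sqrt((1 - p) / (p * home_pa) + (1 - p) / (p * road_pa))
    return math.exp(-z * se_log), math.exp(z * se_log)

ASSUMED_HR_RATE = 0.025  # hypothetical: 2.5% of PA end in a homer; not from the post

low, high = ratio_noise_band(ASSUMED_HR_RATE, home_pa=2100, road_pa=2600)
print(f"Chance alone: roughly {low:.0%} to {high:.0%}")
```

Under those assumptions, two identical ballparks could show a ratio anywhere from roughly 70% to roughly 140% over samples this size. That's before accounting for opponents, weather, or anything else, so it's a floor on the uncertainty and not a full accounting of it.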
So why post this at all? In part, to sate my own curiosity. Just because we only have small-sample numbers doesn’t mean I don’t want to look at them anyway, in the way that we look at players’ statistics in the middle of April. In bigger part, this is a test. A test of critical thinking and careful reading and data interpretation. As you were reading along, did you spot the problems with the analysis? Were you eager to leave me a comment, pointing out that it’s too early to say anything? Congratulations! If not, why do you suppose that is? Do you trust numbers when you see them online? Do you trust numbers when I’m the one posting them? (Ed. note: awww) There’s a lot of baseball analysis on the Internet. A lot of it sucks. There’s a lot of general analysis on the Internet. A lot of it sucks, too. One of the greatest skills you can possess is the ability to break down an argument. So many of them are flimsy, or insufficient, and more people would notice if more people were paying more careful attention. Don’t read passively. Read actively! That’s for you, Mr. D’Onofrio! I mean, don’t be a dick about it, but keep thinking. Think for yourself, as someone else explains his or her thought process. It can be satisfying and illuminating and it can prevent you from being misled.
So that’s how this post turned out. I’m kind of surprised, myself. You’d think I’d have an outline for these things. Whatever, it’s a baseball blog.
Comments
10 Responses to “Meaningless Numbers That Are Meaningless”
Not sure if just these numbers are meaningless, or if all numbers related to the M’s are meaningless.
Why did you compare the home / road numbers? If the goal is to study SafeCo, wouldn’t it have been better just to look at home stats? Or, at least, start with those, to see if the overall home run environment has increased (for both home and visiting teams)?
Since the Mariners are hitting more home runs this year than they have in previous years, we could just be seeing a skewing in one direction based on that. I mean, if, by this point in previous seasons, the Mariners had only hit an average of 40 home runs and 32 of those were road and 8 were at home, whereas this season they’d have hit 80 home runs and 58 of those were on the road and 22 were at home, that could still reflect an overall net increase in home run allowance by new SafeCo, even if the home totals are still the same general percentage, or even a lower percentage, of road numbers.
Or am I just way off here?
Typical – Wouldn’t that negate different players during different years? For example, if the Mariners had a team full of players like Brendan Ryan in the past (not too much of an exaggeration) and you compare those home stats to a team that now has Pujols, Cano, and Fielder (making a drastic comparison here), the home stats would suddenly look much better compared to home stats of prior years. Weighing the home and away stats for individual seasons smooths all of that out (in theory). That’s my thought on the matter at least.
You need to compare to the road to establish a baseline. It’s the only way to control for the fact that team composition changes. By looking at one split, you’re capturing ballpark + players. By comparing both splits, the players, in theory, cancel out. Of course, they don’t quite do that so soon in the first season.
Yeah, I guess so. The sample is small, as Jeff already points out, so maybe it does us no good to only look at what, primarily, Raul, Morales, Morse, and Bay are doing to skew the numbers. So perhaps look at all participants? One of the talked-about issues with moving in the fences was that while we might hit more HR at home, we’d also give up more. Shouldn’t that be examined as well to see if the rest of the league is taking advantage of things?
Looks like Jeff included both the Ms and their opponents:
Since it does include the opponents, what I wonder is, how much of the observed difference can be accounted for by Joe Saunders’ home/road splits?
As an additional exercise, since these numbers are meaningless anyway, except they do mean something even if we don’t know what it is, why not compare 2013 with the same time frame for 2008-12? Of course a problem is that Spring in the PNW can be very different from year to year*, but just for the exercise.
*In 1992 I was able to drive to >4000 feet elevation near Mt. Adams on March 13; a couple years ago, it took until June, a 1/4 of a year difference, to get to the same place.
“if, by this point in previous seasons, the Mariners had only hit an average of 40 home runs and 32 of those were road and 8 were at home, whereas this season they’d have hit 80 home runs and 58 of those were on the road and 22 were at home, that could still reflect an overall net increase in home run allowance by new SafeCo, even if the home totals are still the same general percentage, or even a lower percentage, of road numbers.”
Wouldn’t those numbers result in this:
Prev seasons: home/road = 8/32 = 25%
This season: home/road = 22/58 = 38%
The numbers you’re providing would show an increase in Safeco’s home-run friendliness, by pretty much any measure. 22 HRs vs 8 HRs using naive numbers. 38% vs 25% using a better measure. Better still would be to include visiting teams’ stats, as Jeff does. And I presume this is all done per game or better still per inning (or even per plate appearance), rather than looking at raw totals (i.e. what if the Mariners had played 30 games at home in previous seasons, and only 20 this season).
But the bottom line is that simply looking at the home numbers gives a skewed view. It’s like looking at a team’s win total — and ignoring their loss total.
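The arithmetic in this comment can be sketched in a few lines of Python. The home-run totals (8 and 32, 22 and 58) are the hypothetical ones from the quoted comment, and the game counts (30 and 20) come from the commenter's own "what if." None of these are real Mariners numbers. The helper names pct_of_road and per_game are made up for illustration.

```python
def pct_of_road(home_hr: int, road_hr: int) -> int:
    """Home home-run total as a whole-number percentage of the road total."""
    return round(100 * home_hr / road_hr)

def per_game(total: int, games: int) -> float:
    """Convert a raw total into a per-game rate."""
    return total / games

# Hypothetical totals from the quoted comment, not real data.
prev_pct = pct_of_road(8, 32)
curr_pct = pct_of_road(22, 58)

# Hypothetical home game counts from the commenter's "what if."
prev_rate = per_game(8, 30)
curr_rate = per_game(22, 20)

print(f"home/road: {prev_pct}% -> {curr_pct}%")
print(f"home HR per game: {prev_rate:.2f} -> {curr_rate:.2f}")
```

Dividing by games (or innings, or plate appearances) before comparing is what keeps an uneven schedule from masquerading as a park effect.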
I can’t tell, which is why I’m asking: Did you compare the last five years in April and May? You said this year is 26 games in April and May, but as far as I can tell you’re comparing it to the last five years, entirely.
Ah, PackBob was asking the same question an hour ago.