Hitting the reset button on starting pitcher wins

Bradford Doolittle, Nov 17, 2017, 12:00 PM

This week's results from the voting for baseball's major awards have been pretty much free of controversy. Everyone who has been honored seems to have deserved it, and there haven't been any clear-cut snubs for players based on widespread faults in reasoning or antiquated ways of looking at results. It's always tempting to chalk it all up to progress, but chances are, this is simply a year in which most awards have had obvious winners or, in the case of the MVP balloting, several good choices but no real front-runner to get worked up over.

While this is a general assessment of this year's voting, I did mildly disagree with the voting for the American League's Cy Young Award, in which the winner, Cleveland's Corey Kluber, drew 28 of 30 first-place votes. Boston's Chris Sale finished second, getting the other two first-place picks. I did not vote for the award, but if I had, I would have voted for Sale. I'm not sure that would have been the right pick, but I do feel like the race could have been closer than the voting reflected. Part of this has to do with wins, but almost certainly not in the way you think I mean.

There are good arguments for both pitchers, and clearly the voters looked at different arguments than I did. But I'm not surprised by the vote and can see a few good reasons why it went the way that it did.

First, while it's certainly true that Kluber had a stronger finish to the season, you have to assess the entire campaign from stem to stern, meaning that end-of-season numbers carry far more weight than any time-related subset of those numbers. Of course, that notion works in Kluber's favor if you're looking at traditional categories. His overall ERA (2.25) was a couple of ticks better than that of Sale (2.90). Kluber (18-4) also won more games than Sale (17-8) and had a higher WAR at baseball-reference.com (8.0 to 6.0). Case closed, right?

Maybe, and I think that Kluber was a reasonable and easily defensible choice. But there are a couple of reasons why I still prefer Sale. One is that Sale pitched more -- three more starts and 11 more innings. The way baseball has evolved, I look at innings and starts as being highly correlated with pitcher value. Sale's edge isn't huge, but then again, we're only going to find differences between these two at the margins. Pitching more isn't an advantage if you don't pitch well, but that's certainly not the case with Sale.

Next up is the metric that I refer to most on a day-to-day basis during the season. Sale's name was a constant at the top of this leaderboard, so it remains embedded in my brain. That metric is WAR. No, not the version I noted above, the other version. At FanGraphs.com, Sale's 7.7 WAR was slightly better than that of Kluber (7.3). I prefer this version of the metric because it better values the traits over which the pitcher has the most control. Sale, not Kluber, led the AL in fielding independent pitching (FIP), though it was close. FanGraphs uses FIP in its calculation of WAR for pitchers, so Sale comes out ahead. (They also produce an alternate version that works more like the one at Baseball Reference. That version, as you'd expect, preferred Kluber.)
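
For readers who want to see why FIP rewards a pitcher who keeps the ball out of play, here is a minimal sketch of the commonly published FIP formula. It is not FanGraphs' actual code, and the function name and inputs are illustrative assumptions. The league constant changes every season, so it is passed in rather than guessed at.

```python
def fip(hr, bb, hbp, k, ip, league_constant):
    """Fielding independent pitching, using the commonly published formula.

    Only home runs, walks, hit batters and strikeouts count; balls in play
    are ignored entirely. `ip` must be true innings (214 1/3, not the
    box-score shorthand 214.1). `league_constant` is season-specific and is
    an input here, not a value taken from the article.
    """
    if ip <= 0:
        raise ValueError("innings pitched must be positive")
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + league_constant


# Hypothetical stat line, for illustration only (not a real pitcher's season).
example = fip(hr=20, bb=40, hbp=5, k=250, ip=200.0, league_constant=3.1)
```

Strikeouts are the only term that lowers the number, which is why a pitcher who misses more bats can come out ahead here even with a higher ERA.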

The fact of the matter is we're just not there in terms of having a consensus on how to contextualize pitching performance. Baseball Prospectus has a comprehensive metric called Deserved Runs Average, which adjusts a pitcher's numbers for everything under the sun (or the lights). Kluber led all MLB pitchers in that. He also led in win probability added, for those who want to go with a win-expectancy/leverage-index methodology.

Another way to look at it is expected wOBA, based on the tracking data put out by the Statcast system. That metric, in theory, is a direct evaluation of the quality of each pitcher's pitches. Kluber and Sale both ended the season with an expected wOBA of .248, tied for second in the majors (minimum 2,500 pitches thrown), just behind NL Cy Young winner Max Scherzer (.242). This means the quality of their pitches was roughly the same. If that's the case, then Sale's .301 average allowed on balls in play seems unfairly punitive to his ERA. Kluber was at .267.

The disparity in BABIP seems especially worrisome when you factor in the well-hit averages allowed by each pitcher, as tracked by TruMedia: .117 for Kluber; .119 for Sale. They ranked first and second in baseball in that category, but it's a virtual dead heat. That suggests that Sale's disadvantage on balls in play was not directly related to how well the ball was struck against him. The bottom line is that the difference between these two pitchers, according to what they had control over, was razor thin. Because of that, I prefer Sale's superior ability to keep the ball out of play. That's reflected in the FanGraphs version of WAR, and that's why he's my guy, even if I'm apparently on an island in that respect.

Dissecting all of this can be maddening, which is most likely why Kluber's ERA simply carried the day for most. And that's why sometimes wins still carry an outsize importance in the balloting. That wasn't the case this year, but it was in 2016 when Rick Porcello and his 22 wins took home the AL Cy Young. No sane person that I have seen has suggested that Kluber's edge in wins of 18 to 17 this time around means anything, which I suppose is a sign that we're better at these things than we used to be.

There's no reason to relitigate the problems with pitcher wins. The statistic only gets worse with each passing year, as innings expectations and totals for starters continue to decrease. Nevertheless, wins aren't ever going to go away. They are too embedded in baseball's historical record, and we'll keep using them to compare current pitchers with those of the past, even if we use better measures alongside them. Also -- and here's where I part ways with a lot of sabermetricians -- I'm not convinced that wins are useless. Maybe they are within the context of a single season, but if the sample sizes are large enough, such as over the course of an entire career, wins are a decent proxy for things such as durability and consistency. These are aspects of starting pitching that are important to track.

My method for doing this is embedded in my system for placing starting pitchers into performance tiers. The idea is to award wins and losses by simply looking at each game's pitching matchup and assigning the decision based on the game scores for the opposing pitchers. (There's a mechanism for breaking ties that we won't get into here.)
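
The head-to-head idea can be sketched in a few lines. This is only a rough illustration under stated assumptions, not the actual system: the names `Start`, `matchup_decision` and `revised_records` are made up, game scores are taken as given, and because the tie-break mechanism isn't described here, equal game scores simply produce no decision.

```python
from dataclasses import dataclass


@dataclass
class Start:
    pitcher: str
    game_score: int  # taken as given, e.g. from a box score


def matchup_decision(a, b):
    """Return (winner, loser) names, or None when the game scores are equal."""
    if a.game_score == b.game_score:
        return None  # assumption: the real tie-break is not described, so no decision
    if a.game_score > b.game_score:
        return a.pitcher, b.pitcher
    return b.pitcher, a.pitcher


def revised_records(games):
    """Tally a revised (wins, losses) record for every starter in `games`."""
    records = {}
    for a, b in games:
        for start in (a, b):
            records.setdefault(start.pitcher, [0, 0])
        decision = matchup_decision(a, b)
        if decision is None:
            continue
        winner, loser = decision
        records[winner][0] += 1
        records[loser][1] += 1
    return {name: tuple(wl) for name, wl in records.items()}


# Made-up matchups, purely for illustration.
games = [
    (Start("Pitcher A", 70), Start("Pitcher B", 55)),
    (Start("Pitcher A", 40), Start("Pitcher C", 62)),
    (Start("Pitcher B", 50), Start("Pitcher C", 50)),
]
records = revised_records(games)
```

Note that nothing about the final score, the bullpen or run support enters the calculation; only the two starters' lines in the same game do.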

This won-loss counting method is no panacea as a bottom-line decider, but it does have a strong year-to-year correlation and is something I'll throw out at various times, especially when I get into updating starting pitcher tiers. Over the past three years, Max Scherzer is the MLB leader with 73 revised wins, followed by Sale (70) and Kluber (68). The leader by percentage is Clayton Kershaw, who at 64-17 has outpitched his counterparts 79 percent of the time. If not for Kershaw's back injuries the past two seasons, Scherzer probably wouldn't be spending so much time polishing his hardware. Here are the top 10 in mano a mano winning percentage for the past three years (minimum 50 games started):

Comparing the aces by revised wins

Pitcher	3-yr. W-L	3-yr. win %
Clayton Kershaw	64-17	.790
Max Scherzer	73-25	.745
Chris Sale	70-25	.737
Corey Kluber	68-25	.731
Zack Greinke	65-25	.722
David Price	56-22	.718
Stephen Strasburg	53-22	.707
Carlos Carrasco	61-26	.701
Madison Bumgarner	57-25	.693
Jake Arrieta	65-29	.691

While we can't reach a consensus on the best way to evaluate pitchers in 2017, I strongly suspect most would agree that the top four pitchers on that leaderboard are the top four pitchers in baseball at the moment. And the six behind them shouldn't generate a lot of quibbles, either.

In theory, this measurement answers the simple question of which pitcher did his job better on a given day. You have a revamped version of a won-loss record, one that I'd argue makes a heck of a lot more sense than the one we've used for the entirety of baseball history. However, in what was a surprise until I really thought it over, I've found that over a multiyear period, this method of counting wins is highly correlated with the old method. That's why I've started to look at career wins and lifetime winning percentage with a little more respect than I used to when looking at things such as Hall of Fame cases. Admittedly, this is a subjective decision: I've decided that on a game-by-game basis, a key part of my evaluation of a starting pitcher is how often he outperforms his counterpart.

More often than not, this falls right in line with metrics based on pure run prevention, but when there is a divergence, there is usually an interesting story to be found about the pitcher or his team. And, of course, often the divergence is simply that the pitcher is toiling for a team that supports him with an inferior lineup.

You could certainly come up with a different method than game score for determining which pitcher is better on a given day. But it's a simple method that I like a lot, one which you could determine in your head if you were perusing box scores in a newspaper. (Or just looking at ESPN.com, where we list the game scores in the boxes of every game.) Bullpen melts down? Doesn't matter. The system is bullpen agnostic. Ballpark and weather differences? Doesn't matter. Each pitcher is facing the same conditions. Pitching to the score? Even if that's a thing, it doesn't matter because you're just comparing the pitcher to his opponent, and they are competing in the same game.

This boxing-like method of comparing the day's starting pitchers is probably similar to the original thinking behind the decision to keep track of pitching wins and losses in the first place, which was made sometime before the invention of the automobile. Since complete games were the norm back then, the notion actually made a lot of sense. Not so much anymore. In this century, most advanced methods try to contextualize how good a pitcher is at keeping runs off the board, and those are the top-flight methods, even if we don't agree on which one is the best. This counting method augments that on a game-by-game, opponent-by-opponent basis. The aim is to recapture the original spirit of the pitching win. Still, more than anything, this can be looked at as a measure of consistency and durability.

Let's finish off where we started. By this mano-a-mano method of awarding wins and losses, who was better in 2017: Sale or Kluber? And were other elite pitchers even better? On the first question, the answer is: It depends. On the second question, the answer is no.

Sale's revised record this season was 27-5 over his 32 starts. Kluber was 25-4. These were the two highest win totals in baseball, and the winning percentages were also the two best. And you've got to love those records, right? An 18-game winner (Kluber) as a Cy Young? Yawn. But a 27-game winner? Now we're talking. The National League co-leaders were Arizona's Zack Greinke (24-8) and Washington's Gio Gonzalez (24-8). Scherzer went 23-8, the same "record" as the Yankees' Luis Severino over in the AL.

Sale and Kluber were one-two and neck-and-neck in a lot of areas, which is why it's disappointing to me the vote wasn't closer. Kluber's revised winning percentage was a little higher than Sale's, but just barely -- .862 to .844. They were the only two plus-.800 pitchers in MLB, so it's tough to say by this method who was better. Then again, it's a tough call to make by a lot of the methods we've quoted here. Again, I'd give Sale extra credit for pitching more games. Maybe you wouldn't. Maybe you want to start a petition barring me from ever having a Cy Young vote.
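
For anyone checking the arithmetic, the revised winning percentages come straight from the revised records quoted above (27-5 for Sale, 25-4 for Kluber). The helper names in this small sketch are illustrative, not part of any published system.

```python
def win_pct(wins, losses):
    """Winning percentage over decisions only."""
    decisions = wins + losses
    if decisions == 0:
        raise ValueError("no decisions to compute a percentage from")
    return wins / decisions


def baseball_fmt(pct):
    """Format the way baseball tables do: three decimals, no leading zero."""
    return f"{pct:.3f}".lstrip("0")


sale = baseball_fmt(win_pct(27, 5))    # ".844"
kluber = baseball_fmt(win_pct(25, 4))  # ".862"
```

Kluber's edge in percentage comes with three fewer decisions, which is the trade-off between rate and volume at issue here.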

However, this is truly a question that has no wrong answer. Sale and Kluber not only provided more bottom-line value than any other pitchers in baseball in 2017, they also outpitched their starter counterparts more often this season than any other pitchers in baseball. That's good to know, right?