Lack of consistent predictability plagues not just the Cubs, but most teams' bullpens

It was a rocky weekend for Chicago's bullpen, which coughed up late leads on back-to-back days.
Rick Osentoski-Imagn Images

Well, the Chicago Cubs coughed up another one Saturday. Gunnar Henderson’s three-run, eighth-inning home run off Caleb Thielbar turned a 3-1 Cubs lead into a 4-3 defeat. It doesn't help matters that the bullpen almost blew it again on Sunday before Justin Turner's walk-off heroics propelled the club to a series win.

But let's stay focused on Saturday for the time being. Chicago coughing up a lead isn't especially surprising right now. It was the fourth time since the All-Star break that the Cubs lost a lead (Sunday made it five), and the second time late-inning bullpen work was responsible (Sunday made it three). On Saturday afternoon, the victim was Matthew Boyd, lifted after seven shutout innings and 92 pitches.

Pulling your staff ace and turning the outcome over to the pen is one of those managerial moves that grates on a fan base, as it should. The nerds in the front office have all kinds of rationales for moves of this sort, which is why those moves are made not only on the North Side but across the MLB universe.

The problem is that in all their rationales, the one aspect the nerds fail to consider is among the most important: reliability. On Saturday, Craig Counsell cashiered one of his most reliable pitching options in favor of one of his least reliable…and paid the ultimate price.

Having given away a game they should have won, the Cubs enter the week two games behind the Milwaukee Brewers in the NL Central. If they lose the division – or even worse, get supplanted by the Reds in the wild card race – we know why.

I have been cross-examining bullpen strategy since writing “The Book On The Book” in 2004. That book contains data demonstrating that despite a half century of changes in bullpen strategies, the percentage of games actually won by teams holding late-inning leads of one to three runs has barely changed at all…and in some cases it has gotten worse since the days when starters went the distance.

The problem with bullpen-reliant late-game strategies gets back to that word ‘reliability,’ which is undermined by the small number of innings on which assumptions about relievers rest. Unfortunately, since front office nerd strategies are so heavily reliant on probability, it will be necessary to fight math with math. I apologize in advance.

For any pitcher, the best measure of game-to-game performance reliability is standard deviation. Numbers like ERA and WHIP are important, but what isn’t considered is the likelihood that on a given day the pitcher in question will deliver the expected results.
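For readers who want to see the arithmetic, here is a minimal Python sketch of how such a reliability number could be computed, using baserunners allowed per batter faced as the performance measure. The appearance log is invented purely for illustration, the function name `reliability` is hypothetical, and treating each appearance equally with a population standard deviation is an assumption, not necessarily how the figures in this article were produced.

```python
from statistics import mean, pstdev

def reliability(appearances):
    """Return (average rate, standard deviation) across appearances.

    Each appearance is a (baserunners allowed, batters faced) pair.
    Assumption: every appearance counts equally, and the spread is the
    population standard deviation of the per-appearance rates.
    """
    rates = [runners / faced for runners, faced in appearances]
    return mean(rates), pstdev(rates)

# Hypothetical game log, NOT real data for any pitcher.
sample_log = [(6, 25), (5, 24), (7, 26), (4, 23), (8, 27)]

avg, sd = reliability(sample_log)
print(f"average rate:       {avg:.2f}")
print(f"standard deviation: {sd:.2f}")
# One standard deviation either side of the average; the floor is zero
# because a pitcher cannot allow a negative share of baserunners.
print(f"one-SD range:       {max(avg - sd, 0):.2f} to {avg + sd:.2f}")
```

The smaller the standard deviation relative to the average, the more likely a given outing lands near the pitcher's usual line.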

The table below lists 2025 reliability data for 10 Cubs pitchers entering Sunday: the top five starters, four key relievers, plus swingman Colin Rea. The pitchers are listed in order of the predictability of their performances. The standard of measurement is baserunners allowed per batter faced.

Pitcher            Avg. runners/BF   Std. Deviation   1 SD Min.   1 SD Max.
Shota Imanaga      .24               .08              .16         .32
Cade Horton        .31               .09              .22         .40
Jameson Taillon    .28               .10              .18         .38
Matthew Boyd       .26               .11              .15         .37
Ben Brown          .33               .11              .22         .44
Colin Rea          .30               .12              .18         .42
Brad Keller        .18               .16              .02         .34
Caleb Thielbar     .19               .18              .01         .37
Daniel Palencia    .19               .19              .00         .38
Ryan Brasier       .21               .20              .01         .41

This may be a complex chart for non-nerds, but let me try to simplify it by focusing on the Imanaga line. That line says that on average 24 percent of opposing batters reach base against him, but that one standard deviation around that average is a very small .08. Translated, we can expect with about 70 percent certainty that in any given performance Imanaga will allow between 16 percent and 32 percent of opponents to reach base.

Reliability is all about small spreads in standard deviation. In that vein, note that while their average performances may vary (Ben Brown allows too many baserunners generally), all six primary Cub starters produce significantly more predictable results than all four primary relievers. The relievers do better on average, as the first column shows, but day-to-day those performances vary widely.
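That comparison can be checked directly against the table. The short Python sketch below copies the averages and standard deviations from the table and computes each pitcher's one-standard-deviation range; the dictionary layout and the helper name `band` are illustrative choices, not part of the original analysis.

```python
# (average baserunners per batter faced, standard deviation), from the table.
starters = {
    "Imanaga": (0.24, 0.08), "Horton": (0.31, 0.09), "Taillon": (0.28, 0.10),
    "Boyd": (0.26, 0.11), "Brown": (0.33, 0.11), "Rea": (0.30, 0.12),
}
relievers = {
    "Keller": (0.18, 0.16), "Thielbar": (0.19, 0.18),
    "Palencia": (0.19, 0.19), "Brasier": (0.21, 0.20),
}

def band(avg, sd):
    """One standard deviation either side of the average, floored at zero."""
    return (round(max(avg - sd, 0.0), 2), round(avg + sd, 2))

for name, (avg, sd) in {**starters, **relievers}.items():
    low, high = band(avg, sd)
    print(f"{name:<9} {low:.2f} to {high:.2f}")

# The least predictable starter versus the most predictable reliever.
widest_starter_sd = max(sd for _, sd in starters.values())
narrowest_reliever_sd = min(sd for _, sd in relievers.values())
print(widest_starter_sd < narrowest_reliever_sd)
```

The final comparison holds because the widest starter spread in the table (Rea, .12) is still narrower than the tightest reliever spread (Keller, .16).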

Looking back at Saturday’s gut-buster, when Craig Counsell lifted Boyd for Ryan Brasier to start the eighth, he pulled a pitcher with both good performance and good reliability for the one with the weakest average among the four relievers and the least predictable results of any Cub mound regular.

Thielbar, who surrendered the game-turning home run to Henderson, has a better average of baserunners per opponent than any of the starters and is more reliable than Brasier. But his performance level is still less predictable than Boyd's.

Modern pitching strategies are founded on the demonstrably false presumption that a reliever summoned from the pen will produce something close to his statistical average line at that particular moment. The table above establishes why that presumption is so often a delusion, and we saw it play out in frustrating fashion this weekend at Wrigley.