Data in the Dugout: A Q&A with Sports Geek Author Rob Minto

Sports Geek: A Visual Tour of Myths, Debates, and Data is an amazing, quirky, data- and infographic-filled tour of the biggest sports around the world. We caught up with author Rob Minto to chat about his research as Major League Baseball’s 2016 World Series between the Chicago Cubs and the Cleveland Indians heats up.

Neither the Cleveland Indians nor the Chicago Cubs have won a World Series in over 60 years: what can data and statistics tell us to expect in their match-up?

It’s tempting to look at historical parallels, but really, they mean very little. It’s just great narrative for the fans and TV, and there’s nothing wrong with that.

On a very simple metric of runs per game, both scored and allowed, over the regular season, the numbers show that Chicago allow 3.43 runs per game, and score 4.99. Cleveland allow on average more runs – 4.20 per game – and score fewer, at 4.83 per game. And the Cubs won 103 games in the regular season, to the Indians’ 94.

Therefore it’s tempting to say that if both teams play like they did all season, the Cubs win, hands down.

But of course, that is a big ‘if’. And game 1 certainly didn’t go that way, with the Indians winning 6-0.
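For readers who want to check the arithmetic behind that comparison, here is a minimal sketch in Python. It uses only the per-game figures and win totals quoted above; the `teams` dictionary and the `run_margin` helper are illustrative names, not anything from the book.

```python
# Regular-season figures as quoted in the answer above.
teams = {
    "Cubs": {"scored": 4.99, "allowed": 3.43, "wins": 103},
    "Indians": {"scored": 4.83, "allowed": 4.20, "wins": 94},
}


def run_margin(team):
    """Average runs scored minus runs allowed per game, to 2 decimal places."""
    return round(team["scored"] - team["allowed"], 2)


# Hypothetical helper output: one margin per team.
margins = {name: run_margin(stats) for name, stats in teams.items()}

for name, margin in margins.items():
    print(f"{name}: {margin:+.2f} runs per game, {teams[name]['wins']} wins")
# Cubs: +1.56 runs per game, 103 wins
# Indians: +0.63 runs per game, 94 wins
```

A per-game margin like this only summarizes the season; as the answer goes on to say, it is a weak guide to a short series.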

What is worth remembering is that baseball is quite a random game, which is why it needs a long season to work out which teams deserve to be on top. In comparison, football requires fewer games to determine the better teams, given the more physical nature of the sport.

In fact, the World Series at (best-of) seven games is still the equivalent of just 4.32 per cent of an MLB season; the Super Bowl, just the single game, represents 6.25 per cent of the regular season. In other words, looking at season trends will only tell you so much. Form and luck will play bigger parts.
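The two percentages above can be reproduced with a short calculation. This sketch assumes a 162-game MLB regular season and a 16-game NFL regular season; those lengths are not stated in the interview, but they are the season lengths that yield the quoted figures. The function name `share_of_season` is illustrative.

```python
# Assumed regular-season lengths (not stated in the interview).
MLB_REGULAR_SEASON_GAMES = 162
NFL_REGULAR_SEASON_GAMES = 16


def share_of_season(final_games, season_games):
    """Championship games as a percentage of the regular season, to 2 places."""
    return round(100 * final_games / season_games, 2)


# A full seven-game World Series versus the single-game Super Bowl.
world_series_share = share_of_season(7, MLB_REGULAR_SEASON_GAMES)
super_bowl_share = share_of_season(1, NFL_REGULAR_SEASON_GAMES)

print(f"World Series: {world_series_share} per cent of the season")
print(f"Super Bowl: {super_bowl_share} per cent of the season")
# World Series: 4.32 per cent of the season
# Super Bowl: 6.25 per cent of the season
```

Note that seven games is the maximum: a series that ends in four games would be an even smaller slice of the season.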

You mention the Moneyball effect in your book: does that trend suggest that data is really everything in baseball?

Data isn’t everything, despite the Moneyball revolution. Data is very good at telling you what the eye can’t see. But statistics, in their cruder form, often don’t tell you about risk, or situations. A player who tries something risky because the game situation demands it might end up with a worse average. That doesn’t accurately reflect ability.

In fact, teams are now starting to recruit using ‘soft’ measures that data can’t capture, such as temperament, sociability and other factors, in order to build a team.

It’s worth remembering that sports teams look to find comparative advantage. Data is just one way of doing that. Valuing data for its own sake leads to missing other things that could be just as valuable. That being said, gotta get into the data.

What impact are the doping scandals of the early 2000s having on the game today?

The doping scandals forced players off steroids, and that de-powered the game. Less power meant fewer home runs.

That put a premium on stealing bases for a while, but that has since been shown to be a bit too risky a strategy.

So what now? Home runs are on the up again – which might be due to stronger players, or weaker pitching, or both, or something else altogether – we can’t be sure.

What do TV ratings for the World Series suggest about the trajectory of the sport?

TV ratings are a big deal: they are the best barometer we have for interest and health in the game other than attendance. And the trajectory is not good. Nationally, attendance and TV viewing figures are both falling or flatlining.

Of course, ratings are not generated in isolation. There are trends for watching TV in general that should be looked at, plus other distractions such as elections or other news events. Importantly, are other sports seeing the same pattern?

To a degree, yes: this is a problem repeated elsewhere, including the NFL. The question is how sports administrators respond. They can do nothing and hope that the cycle turns in their favor. Or they can make changes. Change is, of course, difficult. If there are too many games, or not enough meaningful games, the answer is to cut the number of teams, or alter the structure of the league. But try selling either of those ideas to team owners.

How do you expect the shift in baseball ratings to play out in TV contracts in 2020?

In a word? Badly.

TV companies pay out big contracts in the expectation of viewers and advertising. If ad spend in the next few years is weak – which it probably will be, given the current ratings – then the networks will not want to get burnt again.

It depends on the auction structure of the deal, but it would seem odd for a declining market to attract a higher premium next time round. In which case, less money in the league would mean less money for players. Luckily for the clubs, there aren’t many other leagues for players to move to that command similar salaries. The result will, in all likelihood, be a stagnation in player wages.

Just for fun…Cubs vs. Indians: who wins?

Cubs. For Ferris.


Want to find out more about how sports and data collide? Sports Geek by Rob Minto is now available worldwide, and dives into an infographic tour of more than twenty sports. Discover the importance of data on the field here.


You can follow Sports Geek author Rob Minto on Twitter at @robminto.
