Why mathematicians sometimes get Covid projections wrong

This post is adapted from my Guardian article of the same title, originally published on 26/01/22.

Modelling may not be a crystal ball, but it remains the best tool we have to predict the future

Official modelling efforts have been subjected to barrages of criticism throughout the pandemic, from across the political spectrum. No doubt some of that criticism has appeared justified – the result of highly publicised projections that never came to pass. In July 2021, for instance, the newly installed health secretary, Sajid Javid, warned that cases could soon rise above 100,000 a day. His figure was based on modelling from the Scientific Pandemic Influenza Group on Modelling, known as SPI-M.

One influential SPI-M member, Prof Neil Ferguson, went further and suggested that, following the “freedom day” relaxation of restrictions on 19 July, the 100,000 figure was “almost inevitable” and that 200,000 cases a day was possible. In fact, cases topped out at an average of about 50,000 a day just before “freedom day”, then fell and plateaued between 25,000 and 45,000 for the next four months.

It is incredibly easy to criticise a projection that didn’t come true. It’s harder, however, to find out which were the assumptions that made the projection wrong. And, of course, it’s harder still to do that before the projection has been shown to be incorrect. But that is what we ask our modellers to do, and we are quick to complain when their projections do not match reality. Much of the criticism they have received, however, has been misplaced, born out of fundamental misunderstandings of the purpose of mathematical modelling, what it is capable of – and how its results should be interpreted.

Mathematical models are predicated on assumptions – how effective the vaccine is, how severe a variant is, what the impact of imposing or lifting whole rafts of mitigations will be. In trying to put a number on even these few unknowns, let alone the tens or even hundreds of others needed to represent reality, modellers are often searching in the dark with weak torches. That is why broad ranges of scenarios are modelled, and why strict caveats about the uncertainty in the potential outcomes typically accompany modelling reports.
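To make the link between assumptions and projections concrete, here is a minimal sketch of a textbook SIR (susceptible-infected-recovered) epidemic model in Python. It is emphatically not the SPI-M or Imperial model: the function name, the population size, the five-day infectious period and the three reproduction numbers are all illustrative assumptions of mine, chosen only to show how much the projected peak moves when a single uncertain input changes.

```python
def project_peak_daily_cases(r0, infectious_days=5.0, population=67_000_000,
                             initial_infected=10_000, days=365):
    """Toy discrete-time SIR model with one-day steps.

    Returns the highest number of new infections projected on any single day.
    Every default here is an illustrative assumption, not a fitted value.
    """
    beta = r0 / infectious_days      # transmission rate per day
    gamma = 1.0 / infectious_days    # recovery rate per day
    infected = float(initial_infected)
    susceptible = population - infected
    peak = 0.0
    for _ in range(days):
        new_infections = beta * susceptible * infected / population
        new_infections = min(new_infections, susceptible)  # cannot infect more than remain
        recoveries = gamma * infected
        susceptible -= new_infections
        infected += new_infections - recoveries
        peak = max(peak, new_infections)
    return peak


# Three hypothetical scenarios that differ only in the assumed reproduction number.
scenarios = {"optimistic": 1.2, "central": 1.5, "pessimistic": 2.0}
projections = {name: project_peak_daily_cases(r0) for name, r0 in scenarios.items()}

for name, peak in projections.items():
    print(f"{name}: peak of roughly {peak:,.0f} new infections a day")
```

Even in this toy version, modest changes to one assumption shift the projected peak severalfold, which is why real modelling reports present a range of scenarios rather than a single number.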

Mathematicians will be the first to tell you that the outputs of their models are “projections” predicated on their assumptions, not “predictions” to be viewed with certainty. To be fair to him, when Ferguson suggested the figure of 200,000 cases a day, he placed it in the context of the substantial uncertainty surrounding the projection. “And that’s where the crystal ball starts to fail,” he said, “… it’s much less certain.”

Unfortunately, such caveats often get lost when modelling is simplified and turned into attention-grabbing headlines. One accusation levelled at UK modelling is that projections are often presented in the media with insufficient accompanying context. While it isn’t always reasonable to expect modellers who are working flat-out to find time to do media rounds, the resulting communication vacuum can leave results open to misinterpretation or exploitation by bad-faith actors.

Critics of modelling also fail to acknowledge that highly publicised projections can become self-defeating prophecies. Top of the list of the Spectator’s “The ten worst Covid data failures” in the autumn of 2020 was “Overstating of the number of people who are going to die”. The article referred to the fact that Imperial College modellers’ infamous projection – that the UK would see 250,000 deaths in the absence of tighter measures – never came to pass. The Imperial model is widely credited with causing people to change their behaviour and with eventually ushering in the first UK lockdown a week later, thus averting its own alarming projections. Given that the UK has already passed 175,000 Covid deaths, it isn’t hard to imagine that upwards of 250,000 could have died as a result of an unmitigated epidemic wave.

There have been scenarios in which modellers have taken missteps. Modellers often attempt to answer questions about subjects on which they are not experts. They need to collaborate closely with individuals and organisations who have relevant expertise. When considering care homes in the first wave of the pandemic, for instance, a number of salient risk factors – including the role of agency staff covering multiple care homes – were known to industry practitioners but were not anticipated by the mathematicians. These considerations meant that recommendations based on the modelling may have been unsound. There were more than 27,000 excess deaths in care homes during the first wave of the pandemic in England and Wales.

Data sharing between modelling groups has also been identified as an area that needs improvement. Early on in the pandemic, unequal access to data and poor communication were implicated in modelling results that suggested the UK’s epidemic trajectory was further behind Italy than it was, possibly contributing to a delay in our first lockdown. In these respects the pandemic has been a very public learning process for mathematicians.

Every time someone interprets data – from professional mathematicians and politicians to the general public – they are using a model, whether they acknowledge it or not. The difference is that good modellers are upfront about the assumptions that influence their outcomes. If you don’t agree with the underlying assumptions then you should feel free to take issue with the projections – but dismissing conclusions because they don’t fit a worldview is naive, at best.

Despite these reservations, modelling remains the best tool we have to predict the future. It provides a framework to formalise our assumptions about the scenarios we are trying to represent and to suggest what might happen under different policy options. It is always a better option than relying on the gut feelings, “common sense” or plain old wishful thinking that would replace it.