If You Think Modelling the Future is Dangerous, Try Modelling the Past

When bovine spongiform encephalopathy (BSE) hit the U.K. in the 1980s, computer modelling predicted that without intervention new variant Creutzfeldt-Jakob disease (vCJD) would kill tens of thousands of people. The U.K.’s response to BSE was to slaughter millions of cows as a way of halting the potential spread of the disease into the human population. In the end, about 3,000 people have died of CJD within the last 30 years and of these only 178 can be attributed to vCJD.

Was the modelling useful?

Surely yes! The modelling saved thousands of lives due to its timely prediction of the impact of BSE on the U.K.’s population if we didn’t do something. And luckily, we did do something, and because we did something there were only a couple of hundred deaths from vCJD… although the cows weren’t so lucky.

Welcome to the world of the counterfactual model and predicting what might happen if we ‘do nothing’.

Fast forward to COVID-19 and we found ourselves yet again in a situation where computer modelling was predicting thousands of deaths from a new disease unless we did something. And yet again, these predictions became part of the rationale for a series of major national interventions to curb the threat from this new disease, only this time the cows got off lightly and it was the humans who suffered.

Of course, it’s much more likely that the modelling predictions were wildly inaccurate, but it’s easy to see why they carried such sway in times of uncertainty. Their apparent sophistication, and general impenetrability to all but a select few, together with the fact that their outputs were promoted by learned individuals with strings of letters after their names, created the illusion of providing powerful future knowledge… an objective crystal ball that should underpin difficult policy decisions. The problem is that, parking the fact that some models have major technical problems, they suffer from two much more fundamental problems that should limit their influence.

The first is the obvious fact that our knowledge of biology is incomplete, especially that of a novel disease, and so all models of such complex biology will also be incomplete. We may be aware of the gaps, and so make educated guesses in an effort to plug them, but it’s not the case that we know where all these gaps are or even that there are gaps at all. We may be completely ignorant of key relationships or important variables in the system. There may well be important ‘unknown unknowns’ in the biology of which we are blissfully ignorant and so, by definition, things we cannot model. It therefore falls to the modeller to fill the gaps (that they are aware of) and to be the arbiter of ‘truth’ in the disease biology. As such, these models are built upon the assumptions of the modeller and although many assumptions may be non-contentious and generally agreed on, this is not true for all assumptions, especially when it comes to modelling complex biological systems where knowledge and data may be sparse, inconclusive, or even contradictory. It will be up to the judgement of the modeller to decide what to include in their model and this judgement will reflect the modeller’s own opinions and biases. This in turn will affect the predictions of the model itself.

So, far from being objective ‘crystal balls’, models reflect the prejudices, knowledge, ignorance, and preconceptions of the modeller. Meaning that regardless of who builds them, and regardless of how many wonderful graphs and pictures they spit out, unless their predictions are actually tested and found to have some element of truth, such computer models are effectively the codified opinions of the modeller. And opinion is the weakest form of clinical evidence.

The second important point is that counterfactual models don’t tell us what we should do, only what might happen if we do nothing.

Something that was striking about the use of models in the COVID-19 briefings was the fact that they were not used to make predictions of what would happen but to describe what wouldn’t. This was because, just like in vCJD, models were used to paint a picture of a future, caused by ‘doing nothing’, which we were going to avoid by ‘doing something’ and as such they were not predictions of the outcomes of our actions, but our inaction. And just like when we killed all the cows, we certainly did something to avoid the futures predicted by these pieces of computer code: lockdowns, masks, screens, school and business closures, social distancing… the whole COVID-19 dance. Having ‘done something’ we then saw that none of the modelling predictions came true and so, using the argument I used above for vCJD, concluded that by ‘doing something’ we successfully avoided the thousands of deaths that would have resulted from ‘doing nothing’. Time for drinks at Number 10!

Such fallacious circular reasoning seems to abound when it comes to the use of these kinds of models. First, we start by assuming that such counterfactual models have some level of truth in prediction, even though the models and their predictions are untested and unvalidated. Second, we then assume that in ‘doing something’ we are actually avoiding the outcome predicted in the ‘doing nothing’ scenarios, even if there is no relationship between the proposed ‘doing something’ and what is codified in the model. Finally, if the ‘doing something’ results in real world outcomes that are better than those predicted by the ‘doing nothing’ modelling we then take this as evidence to implicitly validate the modelling predictions we used to justify ‘doing something’. Counterfactual modelling would therefore appear to be unique amongst scientific disciplines because it creates unproven hypotheses that we do not want to explore, encourages the use of interventions that it does not explicitly predict will be effective, and makes predictions that are confirmed by not testing them. This is called ‘following The Science’ by policy makers.

The fact is that counterfactual models of COVID-19 are not a rationale for lockdown or any other intervention because they do not predict the impact of these interventions. They model NO intervention; it is policy makers and bureaucrats who decide what should be done. It is the very fact that such models make such dire predictions which provides the strong incentive to not test them by doing nothing. In fact, if one thinks about it, there is a perverse incentive for counterfactual models and modellers to paint the worst possible picture. After all, nothing substantial happening is no justification for action whereas the more dire the predicted outcomes, the more likely we are to do something to avoid them and the greater the claim we can make of lives ‘saved’ as a result. Indeed, it turns out that’s why SAGE didn’t bother modelling anything other than the reasonable worst case.

That said, counterfactual models and modellers are still making predictions and so all we need to do to test them is to do nothing and wait and see what happens; we turn the counterfactual into the factual. Which is essentially what happened after ‘Freedom Day’ and then during Christmas 2021 when the dire predictions of the modellers were ignored and in both cases the scenarios of thousands of deaths and hospitals overwhelmed failed to come to pass. The COVID-19 models were proven to be wrong, and if they were wrong then, then they were always wrong.

Predicting the Past

There is an old saw that says “prediction, especially about the future, is hazardous”, and it is certainly the case that the modelling predictions of COVID-19 futures have been truly hazardous for us all. But as we try to put COVID-19 behind us, a new form of counterfactual prediction is emerging which might be even more hazardous, and that is using models to predict what might have happened in the past.

When Professor Neil Ferguson stated last year that if the national lockdown had been instituted even a week earlier “we would have reduced the final death toll by at least a half”, one assumes he was making this statement based on the output of a model. In a similar vein, modellers at Imperial have also argued that Sweden should have adopted lockdowns to save lives and that the COVID-19 vaccines have saved 20 million people from an untimely death. We have also seen modelling papers being published that support use of lockdowns over focused shielding efforts by predicting what would have happened if we had made these more modest interventions. We can expect more and more of the same.

Unlike the ‘do nothing’ models of counterfactual predictions, such re-imaginings of the past aim to produce new counterfactuals based on different assumptions about what could have been done. Obviously, such models also suffer from the problem of the incompleteness of biological knowledge, but they have an even more fundamental scientific issue and that is that the ‘predictions’ they make are intrinsically untestable. After all, we cannot go back in time and see if the modellers were right about what would have happened if we had done something differently. So, any claim that these models are uncovering a scientific truth or proving (or disproving) anything is false. Because what defines science as a practice is the very fact that hypotheses can be tested, and their validity or invalidity determined… beautiful theories can be slain by ugly facts. It doesn’t matter how much science goes into a model, if the predictions and simulations it produces are untestable then they are, and will always be, just an opinion, a point of philosophy, an article of faith. They don’t ‘prove’ anything.

There is also huge potential for circularity and confirmation bias in developing models of the past. Imagine that you wished to model the impact lockdown had on SARS-CoV-2 spread in the pandemic. How would you do this? One way could be to assume (because of less social mixing) that there was less transmission in lockdown, and as a result adjust the ‘R’ number so that it was lower in lockdown than without. Lo and behold, now when we model lockdown vs. no lockdown, lockdown produces a better result due to a more rapid decline in infections. But this is completely circular: we assumed lockdown reduced transmission and our model then shows less transmission. Likewise for any other intervention (pharmaceutical or not), the temptation is to assume its level of effectiveness… and then model its effectiveness. Similarly, for approaches that the modeller does not like (for example, focused protection), assumptions about lack of effectiveness are also made… neatly demonstrating that more modest interventions would not have been as effective. Such circular arguments in these modelling efforts may not be as obvious as my example here, but it’s easy to see how the assumptions and biases (whether explicit or implicit) of the modeller can be baked into the model well before they even think of hitting the ‘run’ button.
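
To make the circularity concrete, here is a minimal sketch in Python of a toy SIR-style epidemic model. It is not a reconstruction of any published COVID-19 model. The function name simulate_sir, the population size, the five-day infectious period and, above all, the two ‘R’ values are hypothetical numbers chosen purely for illustration. The point is only that the ‘lockdown works’ conclusion is typed in as an input before the model is ever run.

```python
def simulate_sir(r_number, days=120, population=1_000_000,
                 initial_infected=100, infectious_days=5.0):
    """Toy discrete-time SIR model (illustrative assumption, not a real
    COVID-19 model). Returns the cumulative number of people infected."""
    gamma = 1.0 / infectious_days      # daily recovery rate
    beta = r_number * gamma            # daily transmission rate implied by R
    susceptible = float(population - initial_infected)
    infected = float(initial_infected)
    cumulative = float(initial_infected)
    for _ in range(days):
        new_infections = min(beta * susceptible * infected / population,
                             susceptible)
        recoveries = gamma * infected
        susceptible -= new_infections
        infected += new_infections - recoveries
        cumulative += new_infections
    return cumulative


# The modeller's assumptions: both values are hypothetical.
# Choosing a lower R for lockdown IS the conclusion, entered as an input.
ASSUMED_R_NO_LOCKDOWN = 2.5
ASSUMED_R_LOCKDOWN = 0.8

no_lockdown = simulate_sir(ASSUMED_R_NO_LOCKDOWN)
lockdown = simulate_sir(ASSUMED_R_LOCKDOWN)

print(f"Cumulative infections, no lockdown: {no_lockdown:,.0f}")
print(f"Cumulative infections, lockdown:    {lockdown:,.0f}")
print("Model 'shows' lockdown reduced infections:", lockdown < no_lockdown)
```

Whatever figures are printed, the ranking of the two scenarios was fixed the moment the two ‘R’ values were chosen. Swap the two constants and the same code will ‘show’ the opposite with equal confidence.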

As academic exercises, the views of modellers about what could have happened – both about the future and the past – are perhaps interesting as intellectual endeavours, but the danger lies in the way they’re used by policy makers. Just as counterfactual predictions were used to justify ‘doing something’ during the pandemic, so these retrospective re-imaginings of the past are used to validate that the ‘something done’ was the right course of action. However, unlike modelling the future, which is hard enough to test, there is absolutely no way to demonstrate that such modelling output is either right or wrong because in the absence of a time machine we cannot go back and test their predictions. There are no ‘Freedom Days’ – which provide us with a crude way of testing the dire predictions of future modellers – just an endless series of imagined what-ifs. The reality is that such models are more akin to the computer renditions of historical places, or the fantastical simulations of distant planets found in films and games which exist as binary code but are not anywhere we can visit and explore… they look solid on the screen, but their foundations are not real and their walls are pixel thin.

The trouble is that when it comes to COVID-19 the outputs from such modelling efforts can be pounced upon and reported as ‘Scientific Truth’, especially if they support the received wisdom or accepted narrative. So, we intend to spend many hours, and lots of money, trying to understand our responses to the pandemic, but when interrogating the role of modelling and modellers we allow the predictions of the models themselves to become the ‘evidence’ of their validity. Evidence of alternative histories avoided, whether old predictions of a future that did not happen or new predictions of a different past. Evidence that we did the right thing or should have done it harder or faster or longer. Evidence that other approaches would have been much worse. Evidence that costly, unsafe, intrusive, and ineffective interventions worked for COVID-19 and so should be used again in the future.

Predicting the future is indeed hazardous, but it might turn out that when it comes to COVID-19 predicting the past is far, far more dangerous.

George Santayana is the pseudonym of an executive working in the pharmaceutical industry. Thanks to Mildred for critical reading and comments on this article.