by Hector Drummond
Imperial College’s Professor Neil Ferguson has drawn a lot of criticism recently for the poor state of the code in his COVID-19 model:
- “Code Review of Ferguson’s Model”
- “Second Analysis of Ferguson’s Model”
- “Coding that led to lockdown was ‘totally unreliable’ and a ‘buggy mess’, say experts”
- “Neil Ferguson’s Imperial model could be the most devastating software mistake of all time”
- “A series of tubes”
This criticism, it should be noted, is not even directed at his original code – which he still refuses to release, so we can guess how bad that is. The criticism concerns the completely rewritten code that has been worked on by teams from Microsoft and GitHub.
However, Simon Anthony, sometime contributor to Hector Drummond Magazine, has recently written in the Critic magazine that the poor quality of Ferguson’s code is beside the point.
I actually agree with this claim. Of course, it is quite right that the poor quality of Ferguson’s code should have been drawn out into the open, and I think his critics should be congratulated for doing this. I also think it is revealing of the poor standards at work in general with Ferguson’s team. But, in the end, Anthony is right that the sort of modelling Ferguson is doing is not discredited by the fact that his own effort was so shonky, because any number of epidemiological modellers could have come up with similar analyses using impeccable code. If we make this the main point of criticism of Ferguson’s predictions then we risk being undone when another group backs his analysis up with a top-notch piece of coding.
Reliability
I want to focus on what I consider to be the real failure not just with Ferguson’s model, but most epidemiological modelling of this sort: the lack of proven reliability of these models. It doesn’t matter whether your code is the equivalent of a brand-new shiny Rolls-Royce or a beat-up old Ford Transit van that’s been up and down the motorway way too often with rock bands. What matters is that your model makes predictions that we can have reason to trust. Can we trust Ferguson’s predictions? I have no reason to think we can.
For one thing, we have heard many reports of his extreme predictions in the past which have failed to come true. To be fair to Ferguson, it may be that he has learned from all these past failures, and has recently perfected his model by testing it repeatedly against reality, and it is now constantly providing accurate predictions. But I have no evidence that this has happened. Months after Ferguson came to public attention – months in which he has received large amounts of criticism for just this point – he has declined to point anyone in the direction of any reliability tests for his models. So I have no reason to put any store in them.
Models are “speeded-up theories”
It is sometimes said that a computer model is just a “speeded-up theory”, and I think this is true. A computer model is not just a neutral bit of maths that we can nod through. Assumptions are made about how this part of reality works, and about how this part of reality can be imitated by a vastly simplifying piece of computer code and some maths. So a computer model embodies a theory about how this part of the world works.
This theory has no special status in the pantheon of theories. It’s like any other theory: it has to be tested against the world to see if it stands up – and not just tested in easy, artificial situations where it’s not hard to produce the right answers. Nor can it just be tested post hoc – that is, tested against scenarios which have already happened. It’s not hard to adjust a model so that it outputs the correct predictions when you already know what those predictions have to be. Post hoc analysis and adjustment is important, of course, but it alone doesn’t count as testing. Just like any other theory, the theory embodied by the model has to be shown to produce accurate results in advance. And in this particular field, you’d want to see it do so in a wide range of real-world situations. This is not the sort of theory where one decisive experiment can settle things one way or the other.
The demand for testing and reliability is not in any way controversial. Having worked in academia for many years, I saw scientist friends in various fields who would work with models, and their concern was always with the reliability of the models. You couldn’t just use any old model that suited you and which gave you the results you wanted. You had to use something with a track record, and which other people in your field trusted because it had independently proven its worth.
Models in other fields
Many models have proven their worth over the years in many fields. Engineers, for example, have long used computer models to design bridges and other structures. These models can be very sophisticated and complex, and usually involve advanced mathematics (the engineering commentators at my magazine and my Twitter page like to boast of their mathematical superiority over the likes of economists and epidemiologists). Despite that, these models generally work. They pass the reliability test. They have to work, of course: you can’t have your engineering firm build a bridge that falls down at the opening ceremony and kills everyone. So reliability testing is incredibly important in fields such as engineering, but even then they don’t always get it right, because these things can get extremely complex and difficult and mistakes happen.
Grant-funding incentives
There isn’t the same imperative to get it right in academic epidemiology. Neil Ferguson can keep making very high overestimates of the number of people who will be killed by the latest disease, and he keeps his jobs and government appointments and still gets massive amounts of funding from bodies like the Bill and Melinda Gates Foundation. He won’t be sued and made bankrupt if he gets it wrong. The government didn’t require him to publish reliability tests for his models in order for him to be part of the SAGE and NERVTAG committees.
Epidemiology seems to be one of those areas, like climate change, where model reliability matters far less than it should. This can happen to areas that become politicised and where the journals are controlled by strong-armed cliques. It can also be a consequence of modern academia, where the emphasis has shifted almost totally to funding success. Funding success in areas like epidemiology can depend on exaggeration to impress people with agendas and money to burn, like Bill Gates. In an objective field you would expect, after all, underestimates to be as prevalent as overestimates. Yet in this field, overestimates are rife. And the reason for this is the same as the reason why alarmism thrives in climate “science”: it’s because all the research money goes to those who sound the alarm bells.
Creutzfeldt–Jakob disease
The case of variant Creutzfeldt–Jakob disease (vCJD), which can be caught by eating meat from animals that had BSE, or “mad cow disease”, provides a telling example. In August 2000, Ferguson’s team predicted that there could be up to 136,000 cases of this disease in the UK (and, disturbingly, this article mentions that Ferguson and his team had previously predicted 500,000 cases).
A rival team at the London School of Hygiene and Tropical Medicine developed their own model, which predicted there would be up to 10,000 cases, with a “few thousand” being the best-case scenario. Ferguson pooh-poohed the work of this rival team, saying it was “unjustifiably optimistic”.
I should note that Ferguson had made some lower predictions as well – in fact he made a wide range of predictions based on whether various factors, such as incubation periods, applied. But the fact that he laid into the rival team in this way tells us that he thought we were looking at the high end of the range.
Seeing as pretty much everyone who gets vCJD dies from it, this was serious.
So how many people died from vCJD in the UK in the two decades since? 178.
My point here isn’t just that Ferguson’s model was stupendously wrong (or, if you want to emphasise the very large range of predictions he made, useless for most practical purposes). The point is that even the team that performed better still greatly overestimated the number of deaths. Their model only looked good compared to Ferguson’s – his model wasn’t even in the right universe – but it was itself highly inaccurate and misleading, and not at all up to the job we required to be done.
The other advantages of bridge-building models over epidemiological models
Bridge-building models also have other advantages over epidemiological models. The principles of physics and chemistry that are involved are very well established, and have been worked on for a very long time by very many, and many great, scientists. Also, the basic principles of physics and chemistry they deal with don’t change. Some things in the field do change, of course – for example, new materials are constantly developed, and one must take account of construction techniques varying from place to place, and that mixtures of materials are not always quite right, and so on. But there is a great advantage in the fact that, for example, the laws of gravity don’t change.
Epidemiology, on the other hand, is dealing with things that are, in general, far less locked down, and which can change from decade to decade. Diseases have more-or-less different structures from one another. They don’t all behave alike. Countries vary from one another in various relevant respects (temperature, sanitary conditions, crowd behaviour, and so on). Medicine improves, but it’s not always well known how a modern medicine interacts with a certain disease. There is little that is fixed in a physics-like way with disease, and even for those things that are somewhat fixed our knowledge of some of the important detail is lacking.
The basics behind bridge-building models are not completely set in stone, of course, but they are much more settled than epidemiological models, which are trying to model far messier situations, with many more unknown parameters and influences.
Adding detail to make the models more realistic
The other problem with this messiness is that the models need to be made more and more complex to try to deal with all these extra factors. It appears that Ferguson has attempted to do this to some degree, which explains the length of his code. It also appears that the rewritten version of his model has added in even more of this sort of thing: for example, the GitHub page for it originally noted that it had now added in “population saturation effects”.
While I regard the attempt to capture the complexity of the real world as admirable, there can be no end to it when modelling some of the messier parts of reality, and you can end up needing a model as large as reality itself before your model starts to give you any reliable predictions. Models, remember, are attempts at drastic simplifications of reality, embodying various theories and assumptions, and there is no guarantee that any such simplification will work in any particular area. Sometimes they do, sometimes they don’t. The only criterion for deciding this is reliability testing. It’s not enough just to say, “Well, this should work, we’ve taken everything that seems to be influencing the results into account”. If your model still isn’t reliable, then you haven’t accounted for all the complexity – or else you’ve just gone wrong somewhere. Or both. It can be extremely hard to know where the fault lies.
In fact, given the repeated failure of epidemiological models, it seems most likely that a lot of relevant and important factors are being left out. Threatening diseases often just “burn out” quicker than epidemiological modellers expect, so it’s likely that their models have failed to account for some of these other factors.
For instance, recent research has claimed that many people may have partial or full immunity to COVID-19 due to past encounters with other coronaviruses.
This no doubt applies to a great many new diseases: some proportion of people will have full or partial immunity already. But this is not something that can be easily modelled – at least not without a great deal more information than we currently have.
Epidemiological modellers also try to incorporate knowledge about genetics into their models, but our knowledge of how diseases interact with our differing genetics is still fairly rudimentary. We also know little about why some diseases affect children more than others. Furthermore, our knowledge of how viruses survive in various different environments and temperatures is very incomplete. Without better information about such factors being incorporated into the models, why should we continue to trust them after their many failures?
In the absence of such specific information, one thing modellers could do is to feed the information about their past failures back into their models. If your model consistently overestimates death numbers by, say, 100%, then at the very least you should be changing your model so that it adjusts its predictions 50% downwards every time. Of course, it would be extremely embarrassing for any modeller to add a “repeated past failure” module to their model, and most likely there is no consistent mathematical pattern to their failures anyway.
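To be clear about what such a correction would amount to, here is a minimal sketch – with entirely invented numbers, since no modeller publishes such a module – of the sort of adjustment I mean:

```python
# Hypothetical sketch of a "repeated past failure" correction; the (predicted, actual)
# death-toll pairs below are invented for illustration and come from no real model.
past_runs = [(60_000, 28_000), (150_000, 80_000), (40_000, 21_000)]

# Geometric mean of actual/predicted ratios gives an average over-prediction factor.
product = 1.0
for predicted, actual in past_runs:
    product *= actual / predicted
correction = product ** (1 / len(past_runs))

raw_prediction = 200_000                                   # hypothetical new model output
print(f"correction factor ≈ {correction:.2f}")             # ≈ 0.51, i.e. roughly halve it
print(f"adjusted prediction ≈ {raw_prediction * correction:,.0f}")
```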
Sensitivity to inputs
Yet another difference between bridge-building and epidemiological models concerns the issue of inputs. With an epidemiological model, small changes in input data can produce output results which are, for our purposes, vastly different, as Simon Anthony demonstrated recently in an article at Hector Drummond Magazine. The difference between a prediction of a disastrous epidemic or a normal winter virus can depend on small differences in the data that goes into the model. (This may recall to mind all those books and articles on chaos theory from the 1980s, where massive differences can result from small changes in the initial circumstances.)
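To make the point concrete, here is a minimal, purely illustrative calculation – not taken from any particular model – showing how a small change in one assumed input behaves under exponential growth:

```python
# Illustration only: a naive exponential projection of daily deaths, showing how
# modest differences in an assumed doubling time swamp everything else.
def projected_daily_deaths(deaths_now: float, doubling_time_days: float, horizon_days: int) -> float:
    """Project daily deaths forward assuming unchanged exponential growth."""
    return deaths_now * 2 ** (horizon_days / doubling_time_days)

for doubling_time in (3.0, 4.0, 5.0):   # all plausible-looking inputs early in an epidemic
    print(f"doubling every {doubling_time} days -> "
          f"{projected_daily_deaths(10, doubling_time, 60):,.0f} deaths/day at day 60")
# The output runs from roughly 41,000 to roughly 10,500,000 deaths per day: orders of
# magnitude of difference from input changes well inside the early uncertainty.
```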
This issue does not affect bridge-building models to the same extent. Bridges can now be made stable and strong (which is what we want) under a usefully large range of circumstances. This is partly because a bridge is something we are designing and building ourselves to exhibit behaviour we desire and using well-understood materials, whereas an epidemic is a part of nature that we do not have much control over, and also because we still do not understand disease spread that well – other than in the most basic ways.
That is not to deny that there will still be some scenarios – such as those involving extreme weather – where the bridge model will exhibit non-linearities; but in general, the outputs of standard epidemiological models are vastly more sensitive to their inputs than engineering models.
Wrong input data
This brings me to my final point of difference between bridge-building models and our COVID-19 models, and that is the issue of incorrect data being put into the model. This is an enormous problem with epidemiological models, whereas it isn’t a problem to anywhere near the same extent with bridge-building models (it does sometimes happen there, but not as a matter of course).
(Strictly speaking, this is not a fault of the model itself, but since it concerns the overall attempt to model the future of diseases such as COVID-19, I am going to include it here.)
It doesn’t matter how good a model is: if you are putting incorrect data into it, you’re going to be producing incorrect results. You can’t just assume that whatever results come out are going to be good enough for now, because as we have seen with epidemiological models, different data can produce vastly different results.
I won’t labour this point – it is one that has been made many times – except to note that despite about six months having passed since the first COVID-19 case was identified in China, there is still great disagreement over both the disease’s infection fatality rate (IFR) and its transmission rate (R0) in various scenarios, and these are the critical numbers an epidemiological model requires to have any chance of being accurate. Ferguson could have created (despite appearances) an absolutely perfect model for COVID-19, but that would be of no use to anyone if the wrong numbers are going into it, as appears to be the case.
In fact, this problem cuts even deeper than many appreciate. With many diseases we never get a proper handle on what the real death rate and the transmission rates are. For instance, we still don’t really know much about the Spanish flu. We don’t know that much about how it spread and how many people were really infected. We don’t even know whether the different waves of it were caused by the same virus. And if you don’t trust my word on this, take Anthony Fauci’s word for it.
Even with modern influenza we have to rely on very crude estimates of how many people died with it. Even when we have the leisure to take a close look at some recent epidemic in order to improve a model, we often can’t put any solid, definitive numbers into it, because they just don’t exist. In fact, some of the time we’re using the models themselves to estimate how widely a disease spread, and what the transmission and fatality rates were. It’s not surprising, then, that it’s very difficult even now to create epidemiological models that work well enough to be trusted in difficult situations like the one we face with COVID-19.
Do all the top experts who you’d think love modelling really love modelling?
I leave you with the words of two of the world’s most high-profile epidemic scaremongers, who have both been at it since the days of the hysteria over AIDS. It seems that these two have finally started, after many decades, to get an inkling that epidemiological models are more dangerous than useful. (I owe these spots to Michael Fumento.)
The first is the Director of the USA’s Centers for Disease Control and Prevention (and Administrator of the Agency for Toxic Substances and Disease Registry), Dr Robert Redfield, who said on April 1st that COVID-19 “is the greatest public health crisis that has hit this nation in more than 100 years”.
A week later, though – as it started to become clear that the models had once again oversold a disease threat – he said, “Models are only as good as their assumptions, obviously there are a lot of unknowns about the virus. A model should never be used to assume that we have a number.”
Consider also Dr Anthony Fauci, the long-term director of the National Institute of Allergy and Infectious Diseases, and the very man who has been telling Donald Trump that the USA has a disaster on its hands. He said, for example, during a hearing of the House Oversight Committee on March 12th, that COVID-19 “is ten times more lethal than the seasonal flu”.
But the fact that Fauci is very worried about COVID-19 does not mean that he thinks the models are gospel. A few weeks later he was reported as saying, “I’ve looked at all the models. I’ve spent a lot of time on the models. They don’t tell you anything. You can’t really rely upon models.”
If Robert Redfield and Anthony Fauci are not great believers in epidemiological models, I don’t see why the rest of us should be either.
Hector Drummond is a novelist and the author of Days of Wine and Cheese, the first novel in his comic campus series The Biscuit Factory. He is a former academic and the editor of Hector Drummond Magazine. He tweets at hector_drummond.
Engineering models will normally take a “safety factor” of ten-fold to ensure that even if their models are off, and systems are stressed, the structure can cope. They also predict in a linear space rather than an exponential one. Taking that ten-fold safety factor on an exponential process, the upper bound of 500k deaths does not look so far from the truth of 50k excess deaths so far.
Really, 10-fold? Source? Modeling for United Launch Alliance rockets has a 1.1 or 1.2 safety factor at most. Costs don’t allow for a 10x safety factor.
Clearly engineering models build in safety factors. How much will vary from field to field, depending on such things as the expense involved, and the perceived need for safety based on track records. It’s obviously not that expensive (relatively speaking) to have a safety factor of ten-fold with a bridge, but with a rocket (as commentator Engineer says) it’s very expensive, and risks making the whole thing impossible. In addition it can be noted that astronauts accept that rockets aren’t ultra-safe, but commuters driving their cars over a bridge every day wouldn’t be happy unless the bridge is almost 100% guaranteed not to collapse.
Anyway, applying this idea to epidemiological models is insane. Building a ten-times ‘safety factor’ into a bridge-building model requires us to spend a bit more money to make the bridge stronger, but not a ruinous amount, whereas building a ten-times ‘safety factor’ into an epidemiological model can be the difference between a mild flu and the black death. And it’s hardly a ‘safety factor’ if the cost is the difference between modern prosperity and medieval squalor and starvation. As I have been pointing out at Hector Drummond Magazine for months, ‘better safe than sorry’ applies to the economy as much as it does to disease. This would be like selling your family into slavery on a farm just to avoid the risk of traffic in your city suburb.
The “safety factor” is the range of the prediction interval in an exponential prediction process. Few engineering models predict in the exponential space. Your comments apply to ALL epidemic models, not just micro-sim ones. Such general models (e.g. SIR) have a 90-year track record in prediction. They have relatively few parameters and simplify to simple exponential growth during the epidemic phase, but they are sensitive to initial conditions (which manifest as a time lag to a given number of cases). Even the simplest models showed that in early March global cases and deaths, and individual country cases and deaths, were doubling every 3 days (deaths in the U.K. were actually doubling every 2 days). Hence the prediction of time to healthcare collapse: T = 3 days × log(Nbeds/N0)/log(2).
Robust decisions are insensitive to precise model assumptions and parameters. This was a robust prediction: that we would meet the threshold relatively quickly (as other countries with better healthcare had). Whether the cumulative total is 50k or 500k is actually a distraction from whether we would reach the Nbeds threshold in three weeks or four. Without intervention, we would reach that threshold. That was the prediction.
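As a concrete illustration of that back-of-the-envelope formula (with hypothetical bed and case numbers, not the figures under discussion):

```python
# Worked example of the time-to-capacity formula quoted above,
# T = doubling_time * log(N_beds / N_0) / log(2), with hypothetical numbers.
import math

doubling_time_days = 3.0     # assumed case doubling time
n_beds = 4_000               # hypothetical critical-care capacity
n_0 = 50                     # hypothetical current critical-care cases

days_to_capacity = doubling_time_days * math.log(n_beds / n_0) / math.log(2)
print(f"days until capacity is reached ≈ {days_to_capacity:.0f}")   # ≈ 19
```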
If gravity worked as an exponential rather than inverse square law, rockets would be using a 10-fold safety factor.
A bizarre response. The models predicting healthcare collapse were wrong. Even in Sweden the models were out by miles. It makes a huge difference whether deaths are doubling every, say, 4 days or every 8 days.
See this article for discussion on this matter in relation to Sweden: https://hectordrummond.com/2020/05/14/simon-anthony-why-were-the-predictions-of-the-swedish-model-so-wrong/
The model discussed in this article thought that deaths in Sweden were doubling every 3 to 5 days. They weren’t. It looked more like every 7.5 days. This resulted in an enormous difference between predicted and actual deaths (even taking into account the Swedish government’s policies).
Also, the deaths in the UK were not doubling every two days (or every three days, for that matter). You can see that on Worldometers. A doubling every two days would see deaths going from 1 a day to 32,000 a day within a month, and a quarter of a million a day a week later. Epidemiological models often make ridiculous predictions like this because they don’t take sufficient note of the fact that real epidemic curves never have this sort of exponential growth for as long as models think.
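The arithmetic is easy to reproduce – start from one death a day and double every two days:

```python
# Reproducing the arithmetic above: deaths doubling every 2 days, starting from 1 a day.
doubling_time = 2
print(f"{2 ** (30 // doubling_time):,} a day after a month")          # 32,768
print(f"{2 ** (36 // doubling_time):,} a day roughly a week later")   # 262,144, about a quarter of a million
```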
https://www.medrxiv.org/content/10.1101/2020.04.09.20059402v1.full.pdf+html Figure 1. Cases were doubling every three days. Deaths every two days. Tested against all other countries rebased to the same point in the epidemic. Currently undergoing peer review.
UK data for COVID19 incidence of new cases and deaths 9 days apart at lockdown:
14 March 149 cases, 15 deaths
23 March 967 cases, 264 deaths
Cases doubling time = 3.3 days, from 9 (days) × log(2)/log(967/149)
Deaths doubling time = 2.2 days.
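Those doubling times follow directly from the two data points above; a couple of lines reproduce them:

```python
# Reproducing the doubling-time arithmetic from the two quoted data points.
import math

def doubling_time(start: float, end: float, days_apart: float) -> float:
    """Doubling time implied by counts 'start' and 'end' recorded days_apart days apart."""
    return days_apart * math.log(2) / math.log(end / start)

print(f"cases:  {doubling_time(149, 967, 9):.1f} days")   # ~3.3
print(f"deaths: {doubling_time(15, 264, 9):.1f} days")    # ~2.2
```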
At lockdown, cases were doubling every three days and deaths were doubling every two days. One does not need a sophisticated projection model to make further inferences. I have some agreement with the basic premise of the article, that some of the models are sensitive to parameters and projections may therefore be unreliable. The gravity of the situation may have been masked by the “headline” death count. But even the simplest projection above, using data-driven methods (i.e., no assumptions or parameters), gave equally dire near-term projections.
The reason “real” epidemic curves never have this sort of exponential growth is because the SARS-CoV-2 situation is unprecedented. Other infections have either pre-existing population immunity (e.g., influenza, through past infection or annual vaccination) or effective treatment/prophylaxis to reduce generation time. Without these measures, other actions are necessary. It is why there was so little variation between countries in the initial doubling time.
You’re assuming that we know how long it took cases to double. We didn’t. The number of cases was also linked to the number of tests carried out – it’s as mundane as the availability of test kits. This is a prime example of why models based on this kind of information are wrong, wrong, wrong.
The extreme, messy distortion of the data, changing over time in arbitrary ways is completely ignored by people like yourself who think that they’ve got some ‘data’ to play with. And indeed the table of figures looks beautiful, and maybe the curves look kind of convincing. But you should always be thinking “This data was compiled by nurses in hospitals, and overworked doctors who didn’t have enough test kits to work with. And then they were told to change the way they recorded deaths after a couple of weeks. And then they swapped to a more reliable brand of test kit. And then they were given training on how to use them. And then they started testing people with less severe symptoms.” etc. etc.
Deaths reported to the ECDC follow a standard protocol for all countries. That data unequivocally showed that the doubling time in China predicted the doubling time in Italy, which predicted the doubling time in Spain. On log scales, the slope is what matters (and this corrects for case-acquisition differences and fatality rates). The data was unequivocal: at the same point in every epidemic (say 64 cases or 4 deaths), all countries had the same doubling times for cases and deaths. Deaths in the U.K., however, were growing faster than elsewhere and faster than cases. Why might that be? Those are facts, not based on models. That makes for robust decisions insensitive to assumptions.
Don’t you remember this sort of thing in March?
https://www.businessinsider.com/coronavirus-uk-estimates-up-to-10000-british-people-already-infected-2020-3?r=US&IR=T
“UK officials believe up to 10,000 people are already infected with the coronavirus in Britain – 20 times higher than the officially confirmed number.
The number of confirmed cases is 590, but many people carrying the virus are likely not to have been tested.
‘It’s much more likely that we’ve got somewhere between 5,000 and 10,000 people infected at the moment,’ the UK’s chief scientific advisor Patrick Vallance told a press conference on Thursday.”
So where’s your beautiful theory in the middle of that?
I did not need to propose a theory. I stated that deaths, recorded by standard means in hospital, were doubling every two days. Detected cases were doubling every three days (universally across all countries – including Germany, btw). Even missing 19/20 cases does not refute that premise in a 66M population. The law of large numbers and modest healthcare resources mean it is simple to extrapolate. Doubling from 10,000 true cases, 500 observed and say 50 deaths has the same effect. It would be 13 doubling times without immunity; this would of course slow down as infected people become harder to find. But the eventual fatality sum would still be 330k deaths on those simple ratios.
You don’t need sophisticated models, just an understanding of logarithms and of diminishing returns with limited resources (people to infect). Sadly these sorts of sums were not communicated at the time – just “our models predict…”. Robust decisions are insensitive to assumptions.
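The extrapolation being described can be written out in full (all figures as quoted above; the deaths-to-true-cases ratio is a working assumption from the comment, not an established fatality rate):

```python
# The simple extrapolation described above, using the figures quoted in the comment.
import math

population = 66_000_000
true_cases_now = 10_000    # assumed true infections at that point
deaths_now = 50            # assumed deaths at that point
deaths_per_true_case = deaths_now / true_cases_now   # 0.5% - a working ratio, not a measured IFR

print(f"doublings to saturation ≈ {math.log2(population / true_cases_now):.0f}")      # ~13
print(f"eventual deaths on those ratios ≈ {population * deaths_per_true_case:,.0f}")  # ~330,000
```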
If the testing was random and maintained that way continuously then that would be one thing, but in the UK certainly it wasn’t. Same thing for the deaths – they changed how they were recorded as time went on.
And both deaths and ‘cases’ were biased towards the spread of the disease in hospitals – an artificial situation. Early on they were putting people on ventilators and killing them that way, and then they realised that it was better just to give people oxygen. Etc. Etc.
But a table of numbers doesn’t convey all this. And then from this table of numbers is derived ‘R0’ and ‘IFR’ etc. All misleading; all wrong.
I don’t disagree with any of those points. Nor have I mentioned R0 or IFR. The facts are that cases were doubling every three days. Now some proportion of those cases will have required hospitalisation. And some of those will die. In the U.K., deaths were accelerating faster than cases (in contrast to all other countries). What I contend is that healthcare would be swamped, as it was in countries with better healthcare. Where we are now is down to early reticence to take the matter seriously. Germany and Greece are examples where early acceptance of the severity of the situation by the population led to behavioural changes (before lockdown and before testing). The doubling time on the way up is three days, but the halving time on the way down is 10 days. So a day’s delay means a week longer control. Once daily incidence is down, Sweden shows what steady-state looks like (they have low incidence and R=1).
BTW deaths reported to ECDC by the U.K. and all governments are standardised. They may have changed how they reported to the public (including nursing homes etc) but the ECDC data that you will see on worldometer is consistent.
The problem here is a misunderstanding of chaos theory. You state “But even the simplest projection above using data-driven methods (i.e., NO assumptions nor parameters), gave equally dire near-term projections”
But you are using cases and deaths. Are those figures correct especially at a time of widescale panic when all the usual biases and foibles of people are magnified? Never mind the debates about testing accuracy or deaths “of” or “with”.
At any given time you do not know whether any of your data is “the real number”: how do you know either the cases or the deaths figure is the “real figure”? You do not. You assume they are correct, apply a cute mathematical formula, and then extrapolate from there.
In my opinion this is the real issue with modelling and why it is so inaccurate. The world is extremely complex, and dealing with more than three parameters gives bizarre results (chaos theory). It is not that Ferguson missed a couple of things that would have affected his outcomes. It is that he will have missed thousands or millions, some large, some small, each echoing out into the future and shaping it in completely unpredicted ways.
With all due respect, the engineers don’t necessarily do better. How often do engineering projects come in on budget? The budget is a function of their modelling. The builders start digging and find the ground in one area is not the same as the samples they took to feed their models. So they have to add support or use a different material. No drama, but still not predicted. The larger the project, the more budgets blow out – look at HS2 or any Olympic project. Hardly an advert for accurate engineering modelling, are they?
Epidemiology suffers from the same problem as economics. It wants to be a science but it plainly is not, as many of its results are not repeatable, nor do they make useful real-world predictions. This is combined with a hubristic view of mathematics. It may well be the language of the universe, but it does not account for the human condition, nor is there a mathematical formula to tell you what you didn’t think of.
There are engineering models that deal with non-linear problems, and even exponential ones. Control systems in chemical plants are an example of situations where behaviour can become exponential. Predicting that and preventing it is critical to avoid disaster. And in the engineering spaces I live in, the largest safety factor is about 3.5, while most are significantly smaller (1.1 and 1.2 are common).
Designing with excess safety factors is not economically viable.
I would view bounds of 0.33-3x as a reasonable prediction bound for a highly nonlinear process. In my world, 10x safety cover is the norm to account for individual rather than population behaviour. Sometimes it may be higher if the concern is high and the event unmonitorable. That certainly can prove very restrictive.
I’m a Mathematician by training but my cousin is a Nuclear Physicist who specialises in Fission Reactors. We’ve had many interesting discussions about the models he uses when modelling the radioactive isotopes under changing power loads and their interactions within a reactor core, many of which grow and decline exponentially, and despite the interactive complexity can be modelled highly accurately.
I think what DJ Austin is missing is that while my cousin could design reactor tests that require less down time and have fuel rods on the very edge of melting (1.1x margin of safety) he instead designs them with truly massive margins of safety given how staggeringly one sided the risk / reward is.
In this case the risk/reward was staggeringly in favour of keeping the economy going no matter what; this was intuitively obvious to anyone who had any prior understanding of the devastating effects economic downturns have on physical and mental health.
I’m also a mathematician btw (Theoretical Physicist originally).
One can always argue the risk/reward part, but my contention stands – even simple data extrapolation (not sophisticated models) predicts a poor outcome for the health service. If protection of that resource is the aim, then early intervention would be required. The absolute number of deaths is inestimable early on (hence a 0.3–3x likely range would be as good as it gets). I would not have proposed such a figure at that stage, not until some inflection in case and death incidence (there is just no information for parameter estimation).
Non-linear stochastic biological processes are less predictable than fission or chemical reactions in the early stages of an epidemic (there are many, many orders of magnitude more atoms/molecules than humans). But once the exponential process has started, mass action is a pretty good description, and epidemics become predictable (just like chemical reactions).
Except when they don’t. Because s**t in is always equal to s**t out.
Yes, well, protecting the health system at the expense of everything else was always ridiculous. I don’t know if you do, but I guess if you think this was / is a good idea we will always be talking past one another.
I remember thinking, within a minute or two of realising that this pandemic was going global, that it was going to be terrible watching the health system here in New Zealand collapse and watching people dying in the inevitable field hospitals (and hoping no one in my family or I ended up in one), but I consoled myself that this was the worst the virus could do: we would mourn the dead and the health system would recover in a few years.
I never for a second thought our government would lockdown a healthy population and close our borders indefinitely and risk collapsing our economy – because it is so fundamentally irrational on a risk / reward basis. Even more so as we learn just how benign this virus is compared to the early fears.
This is why I thought from the outset that the models being presented should not include 10x scenarios, and I publicly called for the resignation of the modelling team in New Zealand that modelled an 80,000-death scenario, when we already had good data sets to give very high confidence of an upper bound of 0.4% IFR and susceptibility in the population of 60%. This was 2 months ago, and IMO it wasn’t science, but fear-mongering and political pandering of the worst kind that is likely going to cause thousands of excess deaths of relatively young people here in NZ and corrode faith in science. The maths for these upper bounds with high degrees of confidence is so easy that middle schoolers could do it (with proper mathematicians to provide the statistical rigour), yet our top scientists failed to apply ANY proper vetting of inputs and our politicians failed to do these most basic of sums.
I think a factor of two would be considered very generous in such engineering models, which in any case are completely deterministic without any random element – unlike epidemiological modelling.
10? Nonsense. And not linear, either. I wouldn’t even suspect a majority of engineering considerations were linear. And frankly, if they aren’t linear I would take exponential or logarithmic as a good second. Rather that than the many real-world engineering problems expressed by complex polynomials, with several points of inflection across a range and apparently unpredictable outcomes.
“Engineering models will normally take a “safety factor” of ten fold to ensure that even if their models are off, and systems stressed, the structure can cope.”
Not exactly; first, engineering models are validated against reality, not against other models. If your model isn’t validated, it won’t be used. Second, the safety factor used depends on the engineering discipline and situation. Ten-fold would be unusual.
How do I know this? I am both a practicing engineer and professor, and have been such for a quarter of a century.
I worked in HM Treasury 1980 to 1984, spending much of that time ‘servicing’ part of the Treasury Model of the UK economy. Memory naturally fades a bit after all this time, but Hector’s piece jogged the following:
– there was a ‘small version’ of the model (reduced form as opposed to structural form in the big Kahuna) and I think it performed more than passably. Ie there were diminishing returns to adding layers/equations simulating ‘reality’.
– most meetings of ‘service engineers’ involved discussing ‘the residual’ – equations kept breaking down all over, requiring what we now term ‘patches’, much as a car mechanic will fix an oil leak; by ‘setting the residual’ a temporary fix was applied that got the model through the current ‘forecasting round’. Of course, these fixes were rarely temporary and often became very complex (feeding off other equations bolted on at the last minute; ‘type 2’ fixes).
– reliability testing, for all the equations in the model, involved projecting beyond the estimation period. This was especially tough when an equation estimated in ‘normal’, calmer times was run on to a crisis period.
Morals for epidemiological models?
– when the data is mushy, simpler models are best? Ie don’t bring a Mission Control edifice to fly a dodgy glider.
– admit the patchwork openly. Educate the public, never mind the politicians, about the confidence to be put in this Airfix-ery.
– the ‘out of estimation period’ reliability tests of most epidemiological models appear to suck.
Enough to make even a supposed ‘vulnerable’ citizen want to end the lockdown, now.
“While I regard the attempt to capture the complexity of the real world as admirable, there can be no end to it when modelling some of the messier parts of reality, and you can end up needing a model as large as reality itself before your model starts to give you any reliable predictions.”
Apparently the model has around 450 parameters, which means 450 possibilities for GIGO.
Yes, this is one of the better articles certainly, but there is still much more to be said about the uncertain meaning of modelling and fitting (let alone predicting) when one gets beyond 4-5 parameters, let alone up to 50 or 450.
To name just one implication: With 400+ parameters to work with, any single parameter might be constrained to some ridiculous value without much affecting goodness-of-(back)fit, but that one ridiculous parameter value could greatly distort results, say, of taking/fixing all those parameters and then estimating marginal sensitivities (e.g., to model-predicted implications of specific non-pharmaceutical interventions). … Unless you sample for many possible parameter choices for goodness-of-fit, you cannot sample over that sample to get a range of estimates for effects-at-the-margin, which was the purported use of the model in question.
For a non-numeric analogy, imagine being in a company of 450 people with no written policies, and then trying to predict the effect of a memo purporting to clarify one policy. … Yes, this gets back to needing ‘a model as large as reality.’ (Or: the fact that one is in a going concern leaves it under-constrained what all the individual roles might be.)
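A toy example makes the identifiability worry concrete – two parameters rather than 450, and nothing to do with the Imperial code: parameter pairs whose back-fit errors are of a similar order can still disagree severalfold about the projection.

```python
# Toy illustration: different (i0, r) pairs fit ten days of noisy exponential data
# with errors of a similar order, yet project very different values a month later.
import math

# Synthetic "observed" counts: exponential growth with a little alternating noise.
observed = [(t, 100 * math.exp(0.23 * t) * (1 + 0.05 * (-1) ** t)) for t in range(10)]

def backfit_error(i0: float, r: float) -> float:
    """Sum of squared errors of the model i0 * exp(r * t) against the synthetic data."""
    return sum((i0 * math.exp(r * t) - y) ** 2 for t, y in observed)

for i0, r in [(100, 0.230), (120, 0.210), (80, 0.255)]:   # all plausible-looking parameter pairs
    print(f"i0={i0:3d}  r={r:.3f}  back-fit error={backfit_error(i0, r):8.0f}  "
          f"day-40 projection={i0 * math.exp(r * 40):12,.0f}")
```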
This is one of the best critiques of Ferguson’s models I have seen. By all accounts the coding was shoddy but the real crime was that there were no design documents that you could use to evaluate whether the model was fit for purpose. In all my years in the IT industry (Since 1969) we never built anything without design documentation to ensure that we were doing the right thing and to allow others (non IT technical) to comment on what was proposed. As the article shows Ferguson’s model was a bit of the wild west that should have had no part in a calamitous decision to destroy our economy.
Drummond is making a serious mistake in conflating C19 epidemiological models with those developed to predict the long term effects of anthropogenic climate change. The C19 model apparently followed by the government was developed on the basis of a flu pandemic using worst case variables plucked from the air. This is hardly surprising, since C19 is a new disease and you have to start somewhere. But the model hasn’t been updated since its inception despite the clear indications that the age profile of C19 infection nowhere near resembles that of flu, the incubation time is different and the death rate isn’t as bad as first assumed.
Climate models have been around for more than 30 years, and there isn’t a single model but several. In contrast to C19, data and variables are updated regularly and the consensus is that, if anything, the models err on the side of caution as temperatures seem to be rising faster and with greater impact than the models suggest.
Just because the basis of one model is shoddy, don’t assume all models are equally poor, especially in completely different fields of research.
“ and the consensus is that, if anything, the models err on the side of caution as temperatures seem to be rising faster and with greater impact than the models suggest.”
What utter rubbish. All the models run hotter than real unadjusted Temps. https://news.ucar.edu/132678/ncars-new-climate-model-running-hot
I’m afraid you have to read the articles you post and, if possible, understand them.
First, the article refers to one model, so your comment that “ALL the models run hotter than real unadjusted Temps” is misplaced.
Second, the article specifically refers to the effect of a doubling of CO2 on the temperature. We haven’t got there yet, though we’re working on it. Don’t assume that there is a linear relationship between CO2 concentration and surface temp.
Third, and most telling, the article states:
“The scientists who developed CESM2 tested the model repeatedly during model development to make sure that it matched the observed climate of the last century and a half. However, when they switched from an older emissions dataset to the new one, the resulting simulations did not warm the climate in the 20th century as much as observed.”
That is, new emissions data caused the models to underperform: the observed warming was GREATER than the SIMULATIONS.
You are wrong about the Imperial model not taking into account the characteristics of C19 including age profile, incubation period, fatality rate and so on. It incorporates all of those, and these were adjusted to reflect C19 to the best of their ability using WHO figures, etc.
The fundamental problem is that the Imperial model is, at heart, a SIR model and so does not allow for the possibility that people may have innate resistance, or that viral load may influence the severity of the illness and its infectiousness. This is probably the explanation for why these models never seem to reflect reality! But they still keep using them.
One aspect is that if you are going to quote an R0, you are talking about a SIR model. If you were to develop a more subtle, realistic model, there could be no such thing as R0. It seems that the need for an R0 figure to quote means that these people are stuck in Groundhog Day, forever doomed to keep making the same mistake over and over again.
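For readers who have not met the term, a bare-bones SIR model of the kind being referred to looks something like the sketch below – a textbook structure with arbitrary illustrative parameters, not the Imperial model. Everything the comment objects to (pre-existing resistance, viral-load effects, behaviour change) has to be bolted onto this structure or left out.

```python
# Bare-bones discrete-time SIR model, purely for illustration; the parameters are
# arbitrary and are not intended to describe COVID-19.
def run_sir(population, beta, gamma, days, initial_infected=10):
    """Daily-step SIR: beta = transmission rate, gamma = recovery rate, R0 = beta / gamma."""
    s, i, r = population - initial_infected, float(initial_infected), 0.0
    peak_i, peak_day = i, 0
    for day in range(1, days + 1):
        new_infections = beta * s * i / population
        new_recoveries = gamma * i
        s, i, r = s - new_infections, i + new_infections - new_recoveries, r + new_recoveries
        if i > peak_i:
            peak_i, peak_day = i, day
    return peak_day, peak_i, population - s

peak_day, peak_i, ever_infected = run_sir(population=66_000_000, beta=0.3, gamma=0.1, days=365)
print(f"peak on day {peak_day}: ~{peak_i:,.0f} infected; ~{ever_infected:,.0f} ever infected")
```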
The section on grant-funding incentives reminded me of this piece on ‘grimpact’ from last year: https://blogs.lse.ac.uk/impactofsocialsciences/2019/05/28/grimpact-time-to-acknowledge-the-dark-side-of-the-impact-agenda/
The politicians who oversaw the introduction of the ‘impact agenda’ presumably won’t be called to answer for its weaknesses either.
“…… the sort of modelling Ferguson is doing is not discredited by the fact that his own effort was so shonky, because any number of epidemiological modellers could have come up with similar analyses using impeccable code. ”
Shonky…. Love it!
But no, the Ferguson model is discredited because other competent computer specialists have been unable to reproduce Ferguson’s results using his own (cleaned up) code! The question is also raised about the quality of the code of other academic epidemiologists.
Ferguson’s code is not “maths speeded-up”. It is a simulation which tracks the progress of a disease through a virtual population. It turns out that even the random number generator used to decide if some virtual person caught the disease or not was not actually random, so skewed the results in some unknown way.
The real point is that for codes like this, unless they are rigorously tested using known methods in computer modelling, you cannot give any credibility to the results. It is quite evident that Ferguson’s model was not so tested and so why was such credibility given to the results…?
“…(and disturbingly, this article mentions that Ferguson and his team had previously predicted 500,000 cases)….”
I think I can identify one of the sub-routines in his code:
IF:
Issue is highly topical OR IF politicians are interested
AND money is available to fund research
AND high predictions will be beneficial to my epidemiological work
THEN:
predict 1/2m deaths.
He got it wrong with the mad cow virus, needlessly slaughtering thousands of animals. He always gets it wrong. It’s time we got rid of him.
One comment about epidemiological models. The most recent economic literature embodies epidemiological parts, namely the so-called SIR approach, in economic models. These models are very different from normal epidemiological models in that they endogenously account for behavioural changes. Sérgio Rebelo, the prominent Northwestern University economist, readily acknowledges that epidemiological models always overestimate the evolution of any epidemic due to this fact. And indeed there is one thing about economic models that fundamentally distinguishes them from, say, civil engineering models: part of the optimisation of economic agents is about the future, whereas in physics it’s all about initial conditions and control. This does not seem much, but it has profound implications and, in my view, makes epidemiological models basically useless. The intuition is that, if people are rational, they’ll optimise behaviour so as to balance the benefits of future economic activity with the possibility of future contagion. This is much different from the assumptions of epidemiological models, where behaviour is always the same. To try to remedy this, epidemiologists often rely on a changing R, but this is little more than predicting the past. So I think one should embody behavioural elements in a non-ad hoc way in epidemiological models and use the rationality of individuals to get the solution. This makes solving these models much more difficult, but at least less arbitrary.
One thing I’ve noticed, both here in Australia and in the UK, is the number of epidemiologists who have crawled out of the woodwork in the last 6 months.
Where have all these people been and what have they been doing before COVID? And how much notice should we really take of most (but not all) of them?
Talk about being unaccountable – nearly as bad as being a public health official (are they mostly doctors who couldn’t cope with the commercial world?).