Second Analysis of Ferguson’s Model

by Sue Denim
9 May 2020 10:04 AM

I’d like to provide a follow-up to my first analysis: firstly because new information has come to light, and secondly to address a few points of disagreement I noticed in a minority of responses.

The hidden history. Someone realised they could unexpectedly recover parts of the deleted history from GitHub, meaning we now have an audit log of changes dating back to April 1st. This is still not exactly the original code Ferguson ran, but it’s significantly closer.

Sadly it shows that Imperial have been making some false statements.

  • ICL staff claimed the released and original code are “essentially the same functionally”, which is why they “do not think it would be particularly helpful to release a second codebase which is functionally the same”.

    In fact the second change in the restored history is a fix for a critical error in the random number generator. Other changes fix data corruption bugs (another one) and algorithmic errors, and correct the fact that someone on the team can’t spell household; whilst this was taking place, other Imperial academics continued to add new features related to contact tracing apps. (The sketch after this list illustrates why an error in a random number generator’s constants is so serious.)

    The released code at the end of this process was not merely reorganised but contained fixes for severe bugs that would corrupt the internal state of the calculations. That is very different from “essentially the same functionally”.

  • The stated justification for deleting the history was to make “the repository rather easier to download” because “the history squash (erase) merged a number of changes we were making with large data files”. “We do not think there is much benefit in trawling through our internal commit histories”.

    The entire repository is less than 100 megabytes. Given they recommend a computer with 20 gigabytes of memory to run the simulation for the UK, the cost of downloading the data files is immaterial. Fetching the additional history only took a few seconds on my home WiFi.

    Even if the files had been large, the tools make it easy to skip downloading the history if you don’t want it; they solve this exact problem.
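
To see why an error in a random number generator’s constants is a critical bug rather than a cosmetic one, consider the classic cautionary tale of RANDU. The Python sketch below is purely illustrative: it is not ICL’s generator and nothing in it is taken from their codebase.

```python
# A minimal linear congruential generator. Its quality depends entirely on the
# choice of constants: RANDU's multiplier (65539, with c = 0 and m = 2^31) makes
# every consecutive triple of outputs satisfy an exact linear relation, so the
# "random" points collapse onto a handful of planes in 3D.

def lcg(seed, a, c=0, m=2**31):
    x = seed
    while True:
        x = (a * x + c) % m
        yield x

randu = lcg(seed=1, a=65539)                      # RANDU's infamous constants
xs = [next(randu) for _ in range(1000)]

# The defect: x[k+2] == 6*x[k+1] - 9*x[k] (mod 2^31) for every k.
for k in range(len(xs) - 2):
    assert (xs[k + 2] - 6 * xs[k + 1] + 9 * xs[k]) % 2**31 == 0
print("every consecutive triple satisfies the same linear relation")
```

The generator still runs and still produces numbers; the defect only shows up in the statistical structure of the output, which is exactly why a broken RNG can sit unnoticed inside a simulation.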

I don’t quite know what to make of Imperial’s statements. Originally I thought they were the result of the academics not understanding the tools they’re working with, but the Microsoft employees helping them are actually employees of a recently acquired company: GitHub, which is the service they’re using to distribute the source code and files. To defend this I’d have to argue that GitHub employees don’t understand how to use GitHub, which is implausible.

I don’t think anyone involved here has any ill intent, but it seems via a chain of innocent yet compounding errors – likely trying to avoid exactly the kind of peer review they’re now getting – they have ended up making false claims in public about their work.

Effect of the bug fixes. I was curious what effect the hidden bug fixes had on the model output, especially after seeing the change to the pseudo-random number generator constants (which means the prior RNG didn’t work). I ran the latest code in single-threaded mode for the baseline scenario a couple of times to establish that it was producing the same results (on my machine only), which it did. Then I ran the version from the initial import against the latest data, to control for data changes.

The resulting output tables were so radically different that they appear incomparable; for example, the older code outputs data for negative days and a different set of columns. Comparing the row for day 128 (7th May) gave 57,145,154 infected-but-recovered people for the initial code but only 42,436,996 for the latest code, a difference of about 34%.

I wondered if the format of the data files had changed without the program being able to detect that, so then I reran the initial import code with the initial data. This yielded 49,445,121 recoveries – yet another completely different number.
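
For anyone who wants to repeat this kind of check, the comparison itself is trivial to script. The sketch below is illustrative only: the file names and column names are placeholders, not the model’s actual output schema.

```python
import csv

def value_for_day(path, day, column):
    """Return `column` from the row for a given day in a tab-separated output table."""
    with open(path, newline="") as f:
        for row in csv.DictReader(f, delimiter="\t"):
            if int(float(row["t"])) == day:
                return float(row[column])
    raise KeyError(f"day {day} not found in {path}")

# Placeholder file and column names - substitute whatever your runs actually produced.
old = value_for_day("run_initial_import.tsv", 128, "cumRecovered")
new = value_for_day("run_latest_code.tsv", 128, "cumRecovered")

print(f"initial import: {old:,.0f}")
print(f"latest code:    {new:,.0f}")
print(f"difference:     {abs(old - new) / new:.1%} relative to the latest code")
```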

It’s clear that the changes made over the past month and a half have radically altered the predictions of the model. It will probably never be possible to replicate the numbers in Report 9.

Political attention. I was glad to see the analysis was read by members of Parliament. In particular, via David Davis MP the work was seen by Steve Baker – one of the few British MPs who has been a working software engineer. Baker’s assessment was similar to that of most programmers: “David Davis is right. As a software engineer, I am appalled. Read this now”. Hopefully at some point the right questions will be asked in Parliament. They should focus on reforming how code is used in academia in general, as the issue is structural incentives rather than a single team. The next paragraph will demonstrate that.

Do the bugs matter? Some people don’t seem to understand why these bugs are important (e.g. this computational biology student, or this cosmology lecturer at Queen Mary). A few people have claimed I don’t understand models, as if Google has no experience with them.

Imagine you want to explore the effects of some policy, like compulsory mask wearing. You change the code and rerun the model with the same seed as before. The number of projected deaths goes up rather than down. Is that because:

  • The simulation is telling you something important?
  • You made a coding error?
  • The operating system decided to check for updates at some critical moment, changing the thread scheduling and hence the ordering of floating point additions, and thus changing the results?

You have absolutely no idea what happened. 

In a correctly written model this situation can’t occur. A change in the outputs means something real and can be investigated. It’s either intentional or a bug. Once you’re satisfied you can explain the changes, you can then run the simulation more times with new seeds to estimate some uncertainty intervals.
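
As a concrete illustration of that workflow, here is a hedged sketch in which run_model is a stand-in for the real simulation, not ICL’s code; the point is the discipline, not the numbers.

```python
import random
import statistics

def run_model(seed, mask_policy=False):
    """Stand-in for a deterministic simulation: identical seed and inputs give identical output."""
    rng = random.Random(seed)            # every source of randomness flows from the seed
    baseline_deaths = 100_000 + rng.randint(-5_000, 5_000)
    return int(baseline_deaths * (0.8 if mask_policy else 1.0))

# Step 1: repeatability. The same seed must give exactly the same answer,
# so any change in output is either intentional or a bug.
assert run_model(seed=42, mask_policy=True) == run_model(seed=42, mask_policy=True)

# Step 2: only then does it make sense to vary the seed and report a range.
runs = [run_model(seed=s, mask_policy=True) for s in range(30)]
print(f"mean {statistics.mean(runs):,.0f}, range {min(runs):,} to {max(runs):,}")
```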

In an uncontrollable model like ICL’s you can’t get repeatable results, and if the expected size of the change is less than the arbitrary variations, you can’t conclude anything from the model. And precisely because the variations are arbitrary, you don’t actually know how large they can get, which means there’s no way to conclude anything at all.

I ran the simulation three times with the code as of commit 030c350, with the default parameters, fixed seeds and configuration. A correct program would have yielded three identical outputs. For May 7th the maximum difference between the three runs was 46,266 deaths, or around 1.5 times the actual UK total so far. This level of variance may look “small” compared to the enormous overall projections (which it seems are incorrect), but imagine trying to use these values for policymaking. The Nightingale hospitals added on the order of 10,000-15,000 places, so the uncontrolled differences due to bugs are larger than the NHS’s entire crash expansion programme. How can any government use this to test policy?

An average of wrong is wrong.  There appears to be a seriously concerning issue with how British universities are teaching programming to scientists. Some of them seem to think hardware-triggered variations don’t matter if you average the outputs (they apparently call this an “ensemble model”).

Averaging samples to eliminate random noise works only if the noise is actually random. The mishmash of iteratively accumulated floating point uncertainty, uninitialised reads, broken shuffles, broken random number generators and other issues in this model may yield unexpected output changes, but they are not truly random deviations, so they can’t just be averaged out. Taking the average of a lot of faulty measurements doesn’t give a correct measurement. And though it would be convenient for the computer industry if it were true, you can’t fix data corruption by averaging.
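
Here is a toy demonstration of the point, under obviously artificial assumptions: zero-mean noise averages away, but a systematic fault (standing in for something like an uninitialised read that skews every run the same way) does not.

```python
import random
import statistics

TRUE_VALUE = 100.0
rng = random.Random(0)

# Runs perturbed only by zero-mean noise: the average converges to the truth.
noisy_runs = [TRUE_VALUE + rng.gauss(0, 5) for _ in range(10_000)]

# Runs with the same noise plus a constant bias from a bug: the average
# converges too, but to the wrong answer.
corrupt_runs = [TRUE_VALUE + rng.gauss(0, 5) + 20.0 for _ in range(10_000)]

print(f"true value:           {TRUE_VALUE:.2f}")
print(f"mean of noisy runs:   {statistics.mean(noisy_runs):.2f}")    # about 100
print(f"mean of corrupt runs: {statistics.mean(corrupt_runs):.2f}")  # about 120, not 100
```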

I’d recommend all scientists writing code in C/C++ read this training material from Intel. It explains how code that works with fractional numbers (floating point) can look deterministic yet end up giving non-reproducible results. It also explains how to fix it.
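
The core of the issue fits in a few lines: floating point addition is not associative, so if thread scheduling changes the order in which partial sums are combined, a numerically “identical” computation can produce different results. A small Python illustration (the principle is the same in C/C++):

```python
import random

# Associativity fails even for the simplest constants:
a, b, c = 0.1, 0.2, 0.3
print((a + b) + c == a + (b + c))    # False: 0.6000000000000001 vs 0.6

# The same effect at scale: summing the same numbers in a different order
# typically gives a slightly different total.
rng = random.Random(0)
values = [rng.uniform(-1e12, 1e12) for _ in range(100_000)] + [1e-6] * 100_000
forward, backward = sum(values), sum(reversed(values))
print(forward == backward)           # usually False
print(abs(forward - backward))       # small but nonzero, and dependent on ordering
```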

Processes not people. This is important: the problem here is not really the individuals working on the model. The people in the Imperial team would quickly do a lot better if placed in the context of a well-run software company. The problem is the lack of institutional controls and processes. All programmers have written buggy code they aren’t proud of: the difference between ICL and the software industry is that the latter has processes to detect and prevent mistakes.

For standards to improve, academics must lose the mentality that the rules don’t apply to them. In a formal petition to ICL to retract papers based on the model, you can see comments “explaining” that scientists don’t need to unit test their code, that criticising them will just cause them to avoid peer review in future, and other entirely unacceptable positions. Eventually a modeller from the private sector gives them a reality check. In particular, academics shouldn’t have to be convinced to open their code to scrutiny; it should be a mandatory part of grant funding.

The deeper question here is whether Imperial College administrators have any institutional awareness of how out of control this department has become, and whether they care. If not, why not? Does the title “Professor at Imperial” mean anything at all, or is the respect it currently garners just groupthink? 

Insurance. Someone who works in reinsurance posted an excellent comment in which they claim:

  • There are private sector epidemiological models that are more accurate than ICL’s.
  • Despite that they’re still too inaccurate, so they don’t use them.
  • “We always use 2 different internal models plus for major decisions an external, independent view normally from a broker. It’s unbelievable that a decision of this magnitude was based off a single model”

They conclude by saying “I really wonder why these major multinational model vendors who bring in hundreds of millions in license fees from the insurance industry alone were not consulted during the course of this pandemic.”

A few people criticised the suggestion for epidemiology to be taken over by the insurance industry. They had insults (“mad”, “insane”, “adding 1 and 1 to get 11,000”, etc.) but no arguments, so they lose that debate by default. Whilst it wouldn’t work in the UK, where health insurance hardly matters, in most of the world insurers play a key part in evaluating relative health risks.
