by Sue Denim
[Please note: a follow-up analysis is now available here.] Imperial finally released a derivative of Ferguson’s code. I figured I’d do a review of it and send you some of the things I noticed. I don’t know your background, so apologies if some of this is pitched at the wrong level.
My background. I have been writing software for 30 years. I worked at Google between 2006 and 2014, where I was a senior software engineer working on Maps, Gmail and account security. I spent the last five years at a US/UK firm where I designed the company’s database product, amongst other jobs and projects. I was also an independent consultant for a couple of years. Obviously I’m giving only my own professional opinion and not speaking for my current employer.
The code. It isn’t the code Ferguson ran to produce his famous Report 9. What’s been released on GitHub is a heavily modified derivative of it, after having been upgraded for over a month by a team from Microsoft and others. This revised codebase is split into multiple files for legibility and written in C++, whereas the original program was “a single 15,000 line file that had been worked on for a decade” (this is considered extremely poor practice). A request for the original code was made 8 days ago but ignored, and it will probably take some kind of legal compulsion to make them release it. Clearly, Imperial are too embarrassed by the state of it ever to release it of their own free will, which is unacceptable given that it was paid for by the taxpayer and belongs to them.
The model. What it’s doing is best described as “SimCity without the graphics”. It attempts to simulate households, schools, offices, people and their movements, etc. I won’t go further into the underlying assumptions, since that’s well explored elsewhere.
Non-deterministic outputs. Due to bugs, the code can produce very different results given identical inputs. They routinely act as if this is unimportant.
This problem makes the code unusable for scientific purposes, given that a key part of the scientific method is the ability to replicate results. Without replication, the findings might not be real at all – as the field of psychology has been finding out to its cost. Even if their original code was released, it’s apparent that the same numbers as in Report 9 might not come out of it.
Non-deterministic outputs may take some explanation, as it’s not something anyone had previously floated as a possibility.
The documentation says:
The model is stochastic. Multiple runs with different seeds should be undertaken to see average behaviour.
“Stochastic” is just a scientific-sounding word for “random”. That’s not a problem if the randomness is intentional pseudo-randomness, i.e. the randomness is derived from a starting “seed” which is iterated to produce the random numbers. Such randomness is often used in Monte Carlo techniques. It’s safe because the seed can be recorded and the same (pseudo-)random numbers produced from it in future. Any kid who’s played Minecraft is familiar with pseudo-randomness because Minecraft gives you the seeds it uses to generate the random worlds, so by sharing seeds you can share worlds.
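The seed mechanism described here can be shown in a few lines of Python. This is a toy sketch, not Imperial’s code: the `simulate` function, its seeds and its step count are invented purely for illustration.

```python
import random

def simulate(seed, steps=5):
    """Toy stochastic 'model': a seeded random walk."""
    rng = random.Random(seed)  # private generator; the seed fully determines its output
    position = 0.0
    for _ in range(steps):
        position += rng.uniform(-1.0, 1.0)
    return position

# The same seed reproduces the same trajectory every time...
assert simulate(seed=42) == simulate(seed=42)
# ...while a different seed explores a different random world.
assert simulate(seed=42) != simulate(seed=43)
```

This reproduce-from-seed property is exactly what the documentation implies the model has.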
Clearly, the documentation wants us to think that, given a starting seed, the model will always produce the same results.
Investigation reveals the truth: the code produces critically different results, even for identical starting seeds and parameters.
I’ll illustrate with a few bugs. In issue 116 a UK “red team” at Edinburgh University reports that they tried to use a mode that stores data tables in a more efficient format for faster loading, and discovered – to their surprise – that the resulting predictions varied by around 80,000 deaths after 80 days.
That mode doesn’t change anything about the world being simulated, so this was obviously a bug.
The Imperial team’s response is that it doesn’t matter: they are “aware of some small non-determinisms”, but “this has historically been considered acceptable because of the general stochastic nature of the model”. Note the phrasing here: Imperial know their code has such bugs, but act as if it’s some inherent randomness of the universe, rather than a result of amateur coding. Apparently, in epidemiology, a difference of 80,000 deaths is “a small non-determinism”.
Imperial advised Edinburgh that the problem goes away if you run the model in single-threaded mode, like they do. This means they suggest using only a single CPU core rather than the many cores that any video game would successfully use. For a simulation of a country, using only a single CPU core is obviously a dire problem – as far from supercomputing as you can get. Nonetheless, that’s how Imperial use the code: they know it breaks when they try to run it faster. It’s clear from reading the code that in 2014 Imperial tried to make the code use multiple CPUs to speed it up, but never made it work reliably. This sort of programming is known to be difficult and usually requires senior, experienced engineers to get good results. Results that randomly change from run to run are a common consequence of thread-safety bugs. More colloquially, these are known as “Heisenbugs”.
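One concrete mechanism behind run-to-run drift in multithreaded numerical code — an illustration of the general class of problem, not a diagnosis of Imperial’s specific bug — is that floating-point addition isn’t associative. When threads combine partial sums in whatever order the scheduler happens to produce, the low-order digits of the answer change:

```python
# Floating-point addition is not associative, so two thread schedules
# that combine the same partial sums in different orders can produce
# different results. The two groupings below stand in for two schedules.
a, b, c = 0.1, 0.2, 0.3

order_one = (a + b) + c  # one "schedule": 0.6000000000000001
order_two = a + (b + c)  # another "schedule": 0.6

assert order_one != order_two
```

In a simulation that branches on such values, a last-digit difference can snowball into large divergences between runs.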
But Edinburgh came back and reported that – even in single-threaded mode – they still see the problem. So Imperial’s understanding of the issue is wrong. Finally, Imperial admit there’s a bug by referencing a code change they’ve made that fixes it. The explanation given is “It looks like historically the second pair of seeds had been used at this point, to make the runs identical regardless of how the network was made, but that this had been changed when seed-resetting was implemented”. In other words, in the process of changing the model they made it non-replicable and never noticed.
Why didn’t they notice? Because their code is so deeply riddled with similar bugs and they struggled so much to fix them that they got into the habit of simply averaging the results of multiple runs to cover it up… and eventually this behaviour became normalised within the team.
In issue #30, someone reports that the model produces different outputs depending on what kind of computer it’s run on (regardless of the number of CPUs). Again, the explanation is that although this new problem “will just add to the issues” … “This isn’t a problem running the model in full as it is stochastic anyway”.
Although the academic on those threads isn’t Neil Ferguson, he is well aware that the code is filled with bugs that create random results. In change #107, which he authored, he comments: “It includes fixes to InitModel to ensure deterministic runs with holidays enabled”. In change #158 he describes the change only as “A lot of small changes, some critical to determinacy”.
Imperial are trying to have their cake and eat it. Reports of random results are dismissed with responses like “that’s not a problem, just run it a lot of times and take the average”, but at the same time, they’re fixing such bugs when they find them. They know their code can’t withstand scrutiny, so they hid it until professionals had a chance to fix it, but the damage from over a decade of amateur hobby programming is so extensive that even Microsoft were unable to make it run right.
No tests. In the discussion of the fix for the first bug, Imperial state the code used to be deterministic in that place but they broke it without noticing when changing the code.
Regressions like that are common when working on a complex piece of software, which is why industrial software-engineering teams write automated regression tests. These are programs that run the program with varying inputs and then check the outputs are what’s expected. Every proposed change is run against every test and if any tests fail, the change may not be made.
The Imperial code doesn’t seem to have working regression tests. They tried, but the extent of the random behaviour in their code left them defeated. On 4th April they said: “However, we haven’t had the time to work out a scalable and maintainable way of running the regression test in a way that allows a small amount of variation, but doesn’t let the figures drift over time.”
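The kind of seeded regression test they were reaching for can be sketched as follows. The `model` function here is a hypothetical stand-in, not Imperial’s code; the point is that with reliable seeding, a golden value plus a tight tolerance catches regressions without letting figures drift:

```python
import math
import random

def model(seed):
    """Hypothetical stand-in for the simulation: deterministic given a seed."""
    rng = random.Random(seed)
    return sum(rng.gauss(100.0, 10.0) for _ in range(1000))

# Recorded once, at a point when the output was reviewed and judged correct.
GOLDEN = model(seed=1)

def test_regression():
    # A re-run with the recorded seed must reproduce the golden figure.
    # The tolerance admits deliberate, documented numerical changes but
    # is far too tight to let results drift silently over time.
    assert math.isclose(model(seed=1), GOLDEN, rel_tol=1e-9)

test_regression()
```

A test like this is only possible once the model is actually deterministic for a fixed seed — which is the crux of the whole complaint.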
Beyond the apparently unsalvageable nature of this specific codebase, testing model predictions faces a fundamental problem, in that the authors don’t know what the “correct” answer is until long after the fact, and by then the code has changed again anyway, thus changing the set of bugs in it. So it’s unclear what regression tests really mean for models like this – even if they had some that worked.
Undocumented equations. Much of the code consists of formulas for which no purpose is given. John Carmack (a legendary video-game programmer) surmised that some of the code might have been automatically translated from FORTRAN some years ago.
For example, on line 510 of SetupModel.cpp there is a loop over all the “places” the simulation knows about. This code appears to be trying to calculate R0 for “places”. Hotels are excluded during this pass, without explanation.
This bit of code highlights an issue Caswell Bligh has discussed in your site’s comments: R0 isn’t a real characteristic of the virus. R0 is both an input to and an output of these models, and is routinely adjusted for different environments and situations. A model that consumes its own outputs as inputs is a problem well known to the private sector – it can lead to rapid divergence and incorrect predictions. There’s a discussion of this problem in section 2.2 of the Google paper, “Machine learning: the high interest credit card of technical debt”.
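The feedback hazard can be made concrete with a toy calibration loop — entirely hypothetical, with an invented gain parameter — in which each refit leans on the model’s own previous output:

```python
def refit(r0_estimate, feedback_gain=1.1):
    """Hypothetical calibration step: the new estimate is anchored to the
    model's own previous output with an effective gain above 1, so errors
    are amplified each round instead of damped."""
    return r0_estimate * feedback_gain

true_r0 = 2.5
estimate = true_r0 * 1.01  # start just 1% off the true value

for _ in range(50):
    estimate = refit(estimate)

# Fifty rounds later, the 1% error has compounded beyond all usefulness.
assert estimate / true_r0 > 100
```

Whether a real calibration pipeline amplifies or damps its errors depends on its effective gain, which is exactly what makes such feedback loops so hard to spot from inside.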
Continuing development. Despite being aware of the severe problems in their code that they “haven’t had time” to fix, the Imperial team continue to add new features; for instance, the model attempts to simulate the impact of digital contact tracing apps.
Adding new features to a codebase with this many quality problems will just compound them and make them worse. If I saw this in a company I was consulting for I’d immediately advise them to halt new feature development until thorough regression testing was in place and code quality had been improved.
Conclusions. All papers based on this code should be retracted immediately. Imperial’s modelling efforts should be reset with a new team that isn’t under Professor Ferguson, and which has a commitment to replicable results with published code from day one.
On a personal level, I’d go further and suggest that all academic epidemiology be defunded. This sort of work is best done by the insurance sector. Insurers employ modellers and data scientists, but also employ managers whose job is to decide whether a model is accurate enough for real world usage and professional software engineers to ensure model software is properly tested, understandable and so on. Academic efforts don’t have these people, and the results speak for themselves.
My identity. Sue Denim isn’t a real person (read it out). I’ve chosen to remain anonymous partly because of the intense fighting that surrounds lockdown, but there’s also a deeper reason. This situation has come about due to rampant credentialism and I’m tired of it. As the widespread dismay by programmers demonstrates, if anyone in SAGE or the Government had shown the code to a working software engineer they happened to know, alarm bells would have been rung immediately. Instead, the Government is dominated by academics who apparently felt unable to question anything done by a fellow professor. Meanwhile, average citizens like myself are told we should never question “expertise”. Although I’ve proven my Google employment to Toby, this mentality is damaging and needs to end: please, evaluate the claims I’ve made for yourself, or ask a programmer you know and trust to evaluate them for you.
Devastating. Heads must roll for this, and fundamental changes be made to the way government relates to academics and the standards expected of researchers. Imperial College should be ashamed of themselves.
The UK government should be just as ashamed for taking their advice.
And anyone in the media who repeated their nonsense.
But the paper never explicitly recommended full lockdown. School closures, yes. Case isolation and social distancing, yep. But it doesn’t say anything about not going to work, not exercising frequently or travelling. Nor does it say anything about well people remaining in their homes and only being allowed to leave them with a “reasonable excuse”…
Ferguson is on video telling us – not suggesting – that millions of people will die if we don’t implement China-style lockdowns.
Closing schools and going to work ???
A little Home Alone never hurt anyone
Bullshit! A friend’s brother just killed himself because of it…
So says the demented government
Tell that to the over 18,000 additional people who died of heart and circulatory disease in April. https://www.cdc.gov/nchs/nvss/vsrr/covid19/excess_deaths.htm
I brought that up at the time, and was shouted down by the usual suspects: grandparents would look after the children. But aren’t they the very ones at risk? Children are the least likely to get the virus, but would certainly carry it straight to their grandparents.
I’ve written several pieces on the subject of the virus circus.
Now I’m just wanting some people to take responsibility for the hell this house arrest has caused.
Not explicitly, true maybe, but when does any government need more than implication to force its sickening, power-hungry will upon the general public?
Sure it does. It’s what the paper refers to as “suppression,” rather than “mitigation.”
This is a silly question, but which paper do you mean?
Ah, hindsight….
The problem is the nature of government and politics. Politics is a systematic way of transferring the consequences of inadequate or even reckless decision-making to others without the consent or often even the knowledge of those others. Politics and science are inherently antithetical. Science is about discovering the truth, no matter how inconvenient or unwelcome it may be to particular interested parties. Politics is about accomplishing the goal of interested parties and hiding any truth that would tend to impede that goal. The problem is not that “government has being doing it wrong;” the problem is that government has been doing it.
This article explains how such software should be written. (After the domain experts have reasoned out a correct model and had it verified by open peer review, and if possible by formal methods).
“They Write the Right Stuff” by Charles Fishman, December 1996
https://www.fastcompany.com/28121/they-write-right-stuff
After all, only 7 lives depended directly on the Space Shuttle software. The Imperial College program seems likely to have cost many thousands of extra deaths, and to have seriously damaged the economies and societies of scores of countries, affecting possibly billions of lives.
So why should the resources invested in the two efforts have been so vastly different?
I agree totally. The underfunding of important programs like this feeds into the quality of the resultant model. Concerning Sue Denim’s point, it doesn’t mean that it should become privatised and the work transferred to the insurance sector. As a sector they have a large vested interest in a more biased model, at least more than the average fame-hungry epidemiology researcher. The whole purpose of scientific research is to push the boundaries of understanding, so politicians should be analytical enough to understand the limitations of research. It is akin to using a prototype F-35 to go to war: reckless.
You don’t need a lot of funds to review code; they could actually open-source it and the community would destroy it for them.
You’ve misread; that is not my point. I agree that the code should be reviewed and open-sourced. However, my point is that the investment of time and resources should have been made prior to COVID, not as a post-hoc effort.
The solution to incompetence and fraud is not to give more money to incompetent frauds.
Code review is not about “destroying” the code, it’s about improving it: not least, the knowledge it’s going to be reviewed improves the code as it’s written …
The term “destroy” in the context of code reviews is used often, and means destroying the credibility of the code – exposing its flaws and failures. So, as you say, it is a positive thing.
A key lesson is that governments should equip themselves with the capacity to critically appraise the risk of bias in scientists’ work. What strikes me is that professors in epidemiology and public health believe that such models are worth presenting to policy makers. The WHO, in its guideline on non-pharmaceutical interventions against influenza, grades mathematical models as a very low level of evidence.
But so many virologists were prepared to stand up against Imperial College and were silenced. That’s the real issue to be dealt with. Scientists who have worked in the field of respiratory diseases were not asked to talk to Government or go on advisory committees. It was the pseudo-scientists who were paid by Pfizer and Bill Gates whose advice was sought. Men and women who already had an agenda and an investment in vaccines etc.
I agree with the sentiment, but this is not science, and it’s only important because government officials were led to believe that it was science.
The entire notion of a ‘social science’ is the biggest intellectual fraud in human history and is only made possible by academics who exploited the hard earned credibility of the physical sciences to elevate the status of their own fields.
It’s nothing to do with funding. In fact the funding that Bill Gates has ploughed into Imperial College means he can call the shots.
however, it has everything to do with Governments being in bed with Bankers and Globalists. A real patriotic Government would look to the safety of the country, its people and its economy. It would also look at the figures for deaths on flu viruses over the last 50 years against real diseases like Ebola. Then they would talk to top virologists across the board. Nothing as sensible as this ever happens because our Government is run by Globalists.
That’s the other side of the horse. You need more granularity.
But they wont. Everyone involved in this now has skin in the game to ensure NOTHING happens and the lockdown carries on as if its the only thing keeping the entire country from dying.
Well, this is exactly why there is a growing movement in academia at grassroots level to campaign for groups to use proper software practices (version control, automated testing and so on).
Vital to factor in Britain’s endemic corruption before seeking head-roll redress. There is none.
I speak from experience. Case study: https://spoilpartygames.co.uk/?page_id=4454
It isn’t devastating at all.
No. The issue with this analysis is that it attempts to discredit the Imperial code. It does not say that lockdown should not have taken place. It does not propose an alternative model that says a different course of action should be followed. It is reasonable to state that lockdown was the right approach given available data and models – this article does nothing to objectively state that a different course of action would have led to different results. Taking an approach of risk mitigation, I.e. lockdown, is the sensible approach given the output (including variances due to the code and any bugs). Interested if there is any view to substantiate a different path.
There was no available reliable data:
‘From Jan 15 to March 3, 2020, seven versions of the case definition for COVID-19 were issued by the National Health Commission in China. We estimated that when the case definitions were changed, the proportion of infections being detected as cases increased by 7·1 times (95% credible interval [CrI] 4·8–10·9) from version 1 to 2, 2·8 times (1·9–4·2) from version 2 to 4, and 4·2 times (2·6–7·3) from version 4 to 5. If the fifth version of the case definition had been applied throughout the outbreak with sufficient testing capacity, we estimated that by Feb 20, 2020, there would have been 232 000 (95% CrI 161 000–359 000) confirmed cases in China as opposed to the 55 508 confirmed cases reported.’
https://www.thelancet.com/journals/lanpub/article/PIIS2468-2667(20)30089-X/fulltext
So the use of a model, any model, in preference to, say, canvassing the best advice of a panel of epidemiologists with many years of experience of coronaviruses was, at best, ill judged.
‘Sunlight will cut the virus ability to grow in half so the half-life will be 2.5 minutes and in the dark it’s about 13m to 20m. Sunlight is really good at killing viruses. That’s why I believe that Australia and the southern hemisphere will not see any great infections rates because they have lots of sunlight and they are in the middle of summer. And Wuhan and Beijing is still cold which is why there’s high infection rates.’
‘With SARS, in 6 months the virus was all gone and it pretty much never came back. SARS pretty much found a sweet spot of the perfect environment to develop and hasn’t come back. So no pharmaceutical company will spend millions and millions to develop a vaccine for something which may never come back. It’s Hollywood to think that vaccines will save the world. The social conditions are what will control the virus – the cleaning of hands, isolating sick people etc…’
https://www.fwdeveryone.com/t/puzmZFQGRTiiquwLa6tT-g/conference-call-coronavirus-expert
Professor John Nicholls, Coronavirus expert, University of Hong Kong
Deaths from Covid 19 in Hong Kong? Four, exactly…….
If I remember correctly with SARS, they reduced the effort early-on in Toronto, perhaps from business pressure, and the thing came back. One of the areas needing some thought is dispersal via sewage treatment.
He/she would need a model of his/her own, and a much better one, to analyse results and compute whether this or another course of action was best. Do you suppose there is a better model that is, for some reason, not being used? It is of course not certain that a faulty model would produce the correct answer – even a stopped watch is right twice a day – but it is quite likely.
Sorry, missed a negative there. Not produce
The analysis doesn’t just “attempt[] to discredit the Imperial code.” It does so successfully.
And we now know that the Imperial model’s projections do not match the real outcomes.
Vast social and economic changes have been forced on the populace as a result of bad modeling and unreliable data.
It is emphatically not incumbent on critics of the models, the data gathering, or the lockdown regime to put forward their own models or data, let alone some alternative set of response measures.
“It does so successfully.”
Not necessarily. The article presupposes that the code should stand up to the sort of tests commercial software engineers use when creating distributable software for general consumption. This is not the point of Ferguson’s code.
Statistical models tend to generate their results by being run thousands of times using different starting values. This produces a set of vectors of output values which, when plotted on a linear chart, show a convergence around a given set of values. It is the converged values which are used to predict the distribution they are trying to model, and so provided that Imperial College knew about these flaws (which they say they do – that they are fixing them may be a PR exercise) then it shouldn’t really matter.
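The convergence-by-averaging argument can be sketched with a toy model (invented distribution and seeds, purely for illustration). Averaging many seeded runs does tighten the estimate — though note the argument presumes each individual run is itself reproducible, which is precisely what the article disputes:

```python
import random
import statistics

def one_run(seed):
    """Toy stochastic model: a noisy estimate of an underlying value of 50."""
    rng = random.Random(seed)
    return 50.0 + rng.gauss(0.0, 10.0)

few = statistics.mean(one_run(s) for s in range(10))
many = statistics.mean(one_run(s) for s in range(10_000))

# The standard error shrinks like 1/sqrt(N): the large ensemble's mean
# sits close to the underlying value even though single runs are noisy.
assert abs(many - 50.0) < 1.0
```

Averaging suppresses honest statistical noise; it says nothing about systematic errors introduced by bugs, which do not cancel out.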
I find it pretty incomprehensible that the government isn’t being more alert to the criticisms of Ferguson’s model – particularly given how wrong it has been in the past – but I am led to believe that this is more due to its assumptions than any particular issue with the code. I have not heard of anyone else implementing the mathematical model in a different program and getting different results. That would be genuine evidence of a serious problem with the simulation.
The article’s author bases his criticism on the presence of bugs and randomness. All software has bugs; the question is “does the bug materially reduce its utility?” I have not seen the code, so this criticism is about the author’s assumptions. In an epidemiological model, randomness is a feature, not a bug. The disease follows vectors probabilistically, not deterministically. This isn’t an email program or a database application; if the model always returned the same output for the same input, that would be a bug. Prescribing deterministic behavior may prevent discovery of non-linear disease effects.
If the nondeterministic effects result in prediction variances the same order of magnitude as the predictions themselves, there is a fundamental problem that simply cannot be hand-waved out of.
Indeterminism is an essential feature of stochastic modelling, but the outputs of successive model runs ought to converge to form a roughly similar picture if they are to be useful. If they are wildly divergent as a result of the way the program was written, which is the case here, then there is most certainly an issue which needs to be corrected.
If the model is flawed at its core, then its settling on a particular converged set of values after thousands of runs lends no more credence to its accuracy than a single run. Given the real-world data, the model seems flawed at its core.
“Statistical models tend to generate their results by being run thousands of times using different starting values.”
Yes, but the idea of having different starting values is that when you run a model twice with the same starting values, it is supposed to give the same result each time.
The Imperial model cannot do that, which means it is not and cannot be correct.
There are several possible problem areas. Naive subtraction, multiplication or division of small floating-point values is a beginner mistake, and yet the code quality sounds so bad that I bet there are some of those as well.
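The small-value pitfall alluded to here is catastrophic cancellation, which is easy to demonstrate (a generic illustration, not a line from Imperial’s code):

```python
# Adding a value below one "unit in the last place" of 1.0 is silently
# lost, so the subsequent subtraction returns exactly zero.
vanished = (1.0 + 1e-16) - 1.0
assert vanished == 0.0

# Even when the small value survives, most of its digits do not:
survived = (1.0 + 1e-15) - 1.0
relative_error = abs(survived - 1e-15) / 1e-15
assert relative_error > 0.1  # more than 10% error from a single subtraction
```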
“It is reasonable to state that lockdown was the right approach given available data and models”
What evidence are you thinking of that demonstrates that lockdown worked in the past and therefore makes it a good policy for this pandemic? From my understanding this is the first such blanket lockdown. We know that self-isolation works for individuals, but are you extrapolating from individuals to the whole population?
Your thinking reminds me of people that think that because washing your hands prevents the spread of disease, it must therefore be good to keep your baby in a clean environment: makes sense, logically it all hangs together. Unfortunately it is also a bad assumption because immunity does not work that way. Babies exposed to more germs are generally healthier in the long term: doesn’t make sense, but there you go. That is the advantage of science – real science involves reproducible outcomes and experimentation that is sometimes surprising; it does not use heavily flawed models skewed by bad assumptions.
‘Taking an approach of risk mitigation, I.e. lockdown, is the sensible approach’. . . . Lockdown attempts to mitigate one risk–spread of infection–while introducing numerous others, none of which are modeled. The world is far more complex than mathematical-modeling infectious disease epidemiologists seem to realise. This was not a sensible approach, which is why it is an approach that has never been taken for any pandemic in the history of the world prior to this.
JHaywood, that isn’t an “issue”; it’s merely a single point of fact. That the code is a mess and produces muddled results is only one piece of the puzzle. Another very important point is that even the best modeling done with the best code is only as good as the data entered into it. The “data” used to create this model were largely untested assumptions.
It’s fair enough to state that for the first couple of weeks, that’s the best we had. But sound thinkers would have realized the assumptions were assumptions and observed and collected data to TEST them and adjust as necessary. It took weeks to get anyone to REALLY look at most of them, and one by one, we’re seeing the assumptions proven false. There’s no excuse for not having sought answers to these questions — which “mere laypeople” were raising as early as January — much sooner.
Quite honestly Imperial College should never have been used. Ferguson should have been sacked for his previous disasters and the codes they use should have been scrapped years ago. Everyone is barking up the wrong tree. The Swedish Doctor who advised his country to mistrust the code and stay open understood that Ferguson was modelling to a certain outcome which would create wealth for Big Pharma. He understood that Pfizer is under a prosecution for $billions regarding the disastrous vaccine programme in India.
Our government must have known this too. Oxford University said the modelling was way off at the very beginning of this debacle. Yet the Government cosied up to Bill Gates, GAVI and the WHO. TRUMP called all these people out but Boris gave them £millions more of our taxes. These are where the questions should be focussed.
Yes, there is a different path, which is to treat COVID-19 as a normal disease just like any other illness. Sweden has shown how the rest of the world should have coped with this issue. There was unnecessary hype created all over the world just for a simple cold and cough. I mean seriously, who says we are more advanced than before? We are still living in the stone age, being driven by fear rather than science or logic. Science refuses to call SARS-CoV-2 a deadly virus; there are no peer-reviewed reports to date which claim that this virus is indeed capable of inflicting serious damage on otherwise healthy people. It is just getting tagged as the cause of death, even when the actual causes are the co-morbidities and co-infections.
The past two months have shown that no matter how much pride we take in being scientifically advanced, in the end, when put to a real test, we still suffer as pathetically as before.
Imperial College’s coding and Ferguson have been discredited over swine flu, foot-and-mouth and bird flu. Each time there has been an outcry as to how wrong the code was, and how many animals were unnecessarily destroyed and farms went into liquidation. Such short memories some people have.
Heads need to roll, not only at Imperial but at the government as well. This is total incompetence, and who is going to accept responsibility for the sheer destruction of the economy and those who have been made redundant? Why were parliament and the government not aware of the previous modelling problems associated with this same professor re mad cow disease, when 6m cattle were slaughtered for no reason at all? Why was his history not checked?
It’s not incompetence, it’s greed. You cannot tell me that 650 MPs don’t have the intelligence to ask sensible, searching questions. Not one asked any questions until Desmond Swayne and Charles Walker were interviewed by alternative media and shown to be highly ignorant of the facts. These two then started to question. Not one other MP did.
This is competence at the highest, most treacherous level, and once again Tony Blair and his Globalists are behind it. Follow the money. Who got very much richer. Who got to call the shots around the world. Who destroyed the careers of the real scientists…
“all models are wrong, some are useful” – if govt hadn’t acted on this it would have been far, far worse, so does it matter? At least they did the right thing as a result… Very easy to pick fault with no better solution…
I don’t think Box had models with MAPE approaching 1000% in mind when he surmised that some are useful. You could obtain more accurate predictions by asking a few random people on the street than by relying on the output of Ferguson’s models.
Can you provide the evidence that supports your statement?
No, it’s what should have happened under Peer review, but this is belatedly being applied.
Some public-spirited large company such as Google or Microsoft (don’t laugh!) should offer to modularise the model so it’s more easily maintained, re-used, tested and updated.
The first thought that springs to my mind is that, irrespective of the coding, hundreds of thousands have died, world wide, from a single cause attributed to this virus.
That, surely, is fairly potent evidence that a virus, that also came within measurable distance of killing the English Prime Minister, and HAS killed countless numbers in this country alone, has been accurately identified as a virus with lethal properties?
Professor ‘Fergason’s coding might have been out, but the virus is, potentially, and actually, a killer, and highly infectious.
Surely that justifies government strategy?
Hundreds of thousands die each year from the influenza virus. Since when has shutting down entire societies and economies been the approach to the flu?
“Attribution” does not equal “causation” in the same way that “anecdotes” do not equal “evidence”.
“Professor ‘Fergason’s [sic] coding might have been out,……”, it was so far out [WRONG] as to be laughable. It is NOT FIT FOR PURPOSE!
Remember that many governments have DEEMED that if you test positive for SARS-CoV-2 then you are counted in the Covid statistics regardless of comorbidities.
The US CDC updated their Covid deaths on 9/12/20. They had the decency to break them down by co-morbidity and even reported ICD-10 codes. It turns out that out of 174,470 total deaths they included almost 20,000 deaths from dementia, 6,000+ deaths from Alzheimer’s, almost 6,000 deaths from suicide/unintentional injury including vehicle accidents, over 8,000 deaths from cancer, and the single biggest category of deaths is from “all other conditions”, over 85,000 deaths. Their ICD-10 codes show they report maternal deaths during childbirth, child deaths from childbirth including congenital deformities, metabolic diseases, psychiatric disorders, dermatitis and non-cancerous skin disorders, and a host of others.
All these deaths, by the way, are still from a population on average 80 years of age, and overwhelmingly from nursing homes/care facilities. For background, the US sees ~250,000 deaths every month from all causes. Still alarmed?
source: https://www.cdc.gov/nchs/nvss/vsrr/covid_weekly/index.htm
And I forgot to include, because I don’t believe this is widespread knowledge, the CDC does not require a positive laboratory test in order to count a person as a positive Covid case. Their case definition as of early April included probable cases, whereby you can report even a single symptom to a doctor (who for months were assessing patients remotely by videoconferencing) to satisfy the clinical component, which together with the epidemiological component will be enough to report a “positive case” to public health. The epidemiological component is satisfied by: exposure to positive cases, exposure to untested but symptomatic individuals who themselves had positive exposure, travel/residence in an area with Covid outbreak, or even simply being a member of a risk group. The CDC also says if Covid is listed as a factor on a death certificate, that alone satisfies the vital records component and will add both a positive death and a positive new case to the public record.
Since August the definition of Covid has included a third category, “suspect cases”; with regard to antigen/antibody testing, a positive test is considered “supportive evidence” by the CDC.
source: https://wwwn.cdc.gov/nndss/conditions/coronavirus-disease-2019-covid-19/case-definition/2020/08/05/
Agreed. Expert opinion is only as valuable as the reasoning which produces it. What matters for decision-makers is the logic and assumptions which underlie the expert's conclusion. The advice that follows a conclusion also needs to be examined for logical flaws. The cult of the expert has allowed the development of extremely sloppy thinking in both the expert's field and the decision-maker's field.
Both the advice and conclusions drawn from that advice must be examined for logical flaws.
Thank you so much for this! This code should’ve been available from the outset.
Amateur Hour all round!
The code should have been made available to all other Profs & top Coders & Data Scientists & Bio-Statisticians to PEER Review BEFORE the UK and USA Gvts made their decisions. Imperial should be sued for such amateur work.
Guy at carnival: Here, drink this
Some ol’bloke : What is it?
Guy at carnival: Never mind, it will fix what’s ailing ya
Some ol’bloke : What’s it cost?
Guy at carnival: It doesn’t matter, it’s a deal at twice the price
Some ol’bloke : What’s in it?
Guy at carnival: Shhhhh, just take 3 swigs
Some ol’bloke : It tastes horrible
Guy at carnival: Ya, but it will help you
Some ol’bloke : …if you say so
Guy at carnival: I know hey, but you feel better already
But “this code” isn’t what Ferguson was running. The code on GitHub has been munged by other authors in an attempt to make it less horrifying. We must remember that what he ran was much worse than what we can see, which is bad enough.
This code is an IMPROVED (and cleaned, a lot, by professional software engineers) version of the code which was run by Ferguson & Co. It’s still a steaming pile of crap. Ferguson refuses to release the original code, if you haven’t noticed. One is left to wonder why.
This is an outstanding investigation. Many thanks for doing it – and to Toby for providing a place to publish it.
So this is ‘the science’ that the Government thinks it is following!
*the Government reminds us*
Says who ?
This isn’t a piece of poor software for a computer game; it is, apparently, the useless software that has shut down the entire western economy. Not only will it have wasted staggeringly vast sums of money, but every day we are hearing of the lives that will be lost as a result.
We are today learning of 1.4 million avoidable deaths from TB, but that is nothing compared to the UN’s own forecast of “famine on a biblical scale”. Does one think that the odious, inept, morally bankrupt hypocrite Ferguson will feel any shame, sorrow or remorse if, heaven forbid, the news in a couple of months’ time is dominated by the deaths of hundreds of thousands of children from starvation in the Third World, or will his hubris protect him?
I don’t understand why governments are still going for this ridiculous policy and NGOs all pretend it is Covid 19 that will cause this devastation RATHER than our reaction to it.
It’s the same with the myriad of climate change campaigners. It’s their climate change *policies* that are dangerous, not climate change itself (whatever ‘climate change’ means!).
Simple – they are afraid to say that they have made a mistake. And, people who follow this are afraid, as per The Emperor’s New Clothes, to admit that they are being used as gullible fools.
Imperial and the Professor should start to worry about claims for losses incurred as a result of decisions taken based on such a poor effort. Could we know, please, what this has cost, over how many years, and how much of the Professor's career has been achieved on the back of it?
Remember that Ferguson has a track record of failure:
In 2002 he predicted 50,000 people would die of BSE. Actual number: 178 (national CJD research and surveillance team).
In 2005 he predicted 200 million people would die of avian flu H5N1. Actual number according to the WHO: 78.
In 2009 he predicted that swine flu H1N1 would kill 65,000 people. Actual number: 457.
In 2020 he predicted 500,000 Britons would die from Covid-19.
Still employed by the government. Maybe 5th time lucky?
Maybe but he’ll have to step up his game.
The figure of 500,000 deaths was based on the government’s ‘do nothing, business as usual to achieve herd immunity’ strategy then in effect. Ferguson predicted 250,000 deaths if the government acted as it has done since.
Yeah… way more people died of BSE than 178…
Source?
Actually he didn’t. The model said if no action was taken up to 500,000 people could die. Please weigh in objectively to support or challenge the theory above.
Do you mean just in the UK? Because swine flu killed way more than 457 in the parts of the world where they didn’t vaccinate.
Ferguson should be retired and his team disbanded. As a former software professional I am horrified at the state of the code explained here. But then, the University of East Anglia code for modelling climate change was just as bad. Academics and programming don’t go together.
At the very least the Government should have commissioned a Red team vs Blue team debate between Ferguson and Oxford plus other interested parties, with full disclosure of source code and inputs.
I support the idea of letting the Insurance industry do the modelling. They are the experts in this field.
The software is irrelevant : a convenient peg to hang a global action on for reasons I cannot divine at present but which will become clearer
Ferguson and Oxford are the same team. If you look at the authors of the Ferguson papers you’ll find Oxford names there. If you look at the authors of papers from John Edmunds group you’ll find people who hold posts at Imperial. These groups are not independent.
I read that Ferguson has a house in Oxford.
There was a RANGE from the MODEL, not a PREDICTION. From a 2002 report by the Guardian (https://www.theguardian.com/education/2002/jan/09/research.highereducation)
“The Imperial College team predicted that the future number of deaths from Creutzfeldt-Jakob disease (vCJD) due to exposure to BSE in beef was likely to lie between 50 and 50,000.
In the “worst case” scenario of a growing sheep epidemic, the range of future numbers of death increased to between 110 and 150,000. Other more optimistic scenarios had little impact on the figures.
The latest figures from the Department of Health, dated January 7, show that a total of 113 definite and probable cases of vCJD have been recorded since the disease first emerged in 1995. Nine of these victims are still alive.”
“The Imperial College team predicted that the future number of deaths from Creutzfeldt-Jakob disease (vCJD) due to exposure to BSE in beef was likely to lie between 50 and 50,000…” That’s three orders of magnitude for the margin of error!! What other science would accept such a wide margin of error?
“The latest figures from the Department of Health, dated January 7, show that a total of 113 definite and probable cases of vCJD have been recorded since the disease first emerged in 1995. Nine of these victims are still alive.” So strictly speaking Imperial College was correct; thankfully the reality was close to their lowest estimate.
Pathetic review. You should go through the logic of what is coded and not write superficial criticisms which imply you know nothing of what you critique.
If only the code could actually be understood. It’s so bad you can’t even be certain of what exactly it’s doing.
Pretty sure the only point of the article was to bring light to the fact that the “model” is flawed and Ferguson has a track record of being VERY wrong on mortality rate predictions based upon flawed models. Solution, stop it. This time around it almost took down an entire country’s economy because of elitist’s overreaction and overreach. Just stop it.
“‘almost’ took down an entire country’s economy”
they haven’t stopped the Lockdown yet, plenty of time yet to destroy small businesses.
I couldn’t disagree more. The issue isn’t the virology, or the immunology, or even the behaviour of whatever disease is being examined / simulated. It is the programming discipline applied to the modelling effort. I doubt the author has the domain-specific expertise to comment on the immunological (etc) assumptions embedded in the program. What the author does have is the programming expertise to identify that the model could not produce useful output, no matter how accurate the virology / immunology assumptions, because the software that translated those assumptions into predictions of infections and case loads was so poorly written.
I’m afraid Ferguson is a very small part of the plan, and merely doing what he was hired for by KillBill.
It’s inappropriately UK-centric to speak of “the useless software that has shut down the entire western economy”. All governments have scientific advisors, there’s lots of modelling going on in many countries, and much of this influenced the lockdown decisions all over the world. If I remember it correctly, when Italy started its lockdown, Imperial hadn’t yet made their recommendation, and many if not most countries have not relied on Imperial. The software may be garbage, but the belief that there wouldn’t be strong scientific arguments for a lockdown without that piece is nonsense as well.
Thank goodness I’ve encountered a small injection of level-headedness here. The original critique is limited, appropriately, to the flawed coding and reliance on its outputs to inform UK policy; it draws none of the sweeping conclusions that others here seem to think are implied – perhaps owing to their own biases. (I came to this site thinking it was named for skeptics who are in lockdown, before I realised it was for skeptics *of* lockdown – so, yeah, plenty of motivated reasoning and politically-charged statements masquerading as incontrovertible truths, but hey, I’m just an actual skeptic… )
So anyway, I’m glad that Lewian has pointed out, because somehow it needed to be, that the world is bigger than the UK and that science (note: not code or software or politics or a dude called Neil) does not operate in a vacuum. One needn’t input even a single data point into a single model in order to undertake risk mitigation strategies if you (and by you, I mean the relevant scientific minds, not actually you) have even a comparatively rudimentary understanding of an infectious agent such as Covid-19. It’s simple cause and effect, extrapolated.
Want to be a skeptic in the classical tradition? Listen to the best available science from the most experienced scientists and researchers in virology, infectious disease, public health and epidemiology. Rely on their collective expertise to inform your own positions, because their baseline knowledge and understanding of the variables at play is granular and complex and anchored in decades of science and scientific research and informed by the fluid facts on the ground.
It seems to have been the primary determinant here in the US, too (and, I think, in Canada). Or at least that’s what they’re telling us.
I think you are all missing the point. The use of the model was to impress upon the U.S. President the severity of the outbreak. He needed more than “the best available science from the most experienced scientists and researchers in virology, infectious disease, public health and epidemiology” to take it seriously. Clearly, they had done their own modeling and assessment. This thread is what happens when you lose yourself in the code and aren’t looking at the big picture.
Why any of this isn’t obvious to our politicians says a lot about our politicians, but your summary also shows that it is ENGINEERS, and not academics, who should be generating the input to policy making. It is only engineers who have the discipline to make things work, properly and reliably.
For decades I have opined that our society was exposed to the risk inherent in being a technologically dependent culture governed by the technically illiterate. QED?
“The Chinese Government Is Dominated by Scientists and Engineers”
https://gineersnow.com/leadership/chinese-government-dominated-scientists-engineers
They are also communists. Which is another way of saying “psychopathic liars”.
No, scientists can write perfectly good code if they have the incentive to do so. Heck, most of the really important math codebases have been written by scientists. But the problem is, most scientists have the incentive to publish quickly, but not to ensure their methods follow good engineering practice, even when it should be mandated. This has bitten the climatologists in the butt with the so-called “climategate”. Congressional enquiries showed that their integrity was intact and that their methods were sound and followed standard scientific practice. But they lacked transparency, and therefore it was recommended that they should from now on make public all their numerical code and all their data. This has become widespread practice in climatology. Unfortunately, that still isn’t the case in other branches of science. It should be.
A good point, but should you not add two other categories to the statement? First, civil servants; unlike the politicians, these are employed to use their expertise in advising politicians. They tend to be recruited by other civil servants, rather than the politicians.
The second group is journalists. I have seen no mention of this kind of criticism aired publicly by journalists. Indeed, this touches on another of my gripes; in the almost never-ending press conferences, current affairs programmes and interviews, the same old questions are asked over and over again, to be answered by the same generalised statements, while the more interesting and detailed matters are omitted, or, in a tiny number of occasions, interrupted or run out of time.
This kind of thing frequently happens with academic research. I’m a statistician and I hate working with academics for exactly this sort of reason.
The global warming models are secret too (mostly), and probably the same kind of mess as this code.
Perhaps, if enough people come to understand how badly this has been managed, they will start to ask the same questions of the climate scientists and demand to see their models published.
It could be the start of some clearer reasoning on the whole subject, before we spend the trillions that are being demanded to avert or mitigate events that may never happen.
These so-called climate scientists were asked to provide the data, but they came back and said they had lost the data when they moved offices.
Michael Mann pointedly refused to share his modelling code for climate change when he was sued for libel in a Canadian court. He ended up losing, which will cost him millions. Now why would an academic rather lose millions of dollars than show their workings?
Let’s hope this “workings not required” approach doesn’t get picked up by schoolkids taking their exams.
Tried to find something about this on the BBC news site. Found this:
https://www.bbc.com/news/uk-politics-52553229
At the end of the article, there is “analysis” from a BBC health correspondent.
With such pitiful performance from the national broadcaster, I think Ferguson and his team will face no consequences.
LOL, what a load of crap – it’s the other way around: it’s Mann who sued.
“In 2011 the Frontier Centre for Public Policy think tank interviewed Tim Ball and published his allegations about Mann and the CRU email controversy. Mann promptly sued for defamation[61] against Ball, the Frontier Centre and its interviewer.[62] In June 2019 the Frontier Centre apologized for publishing, on its website and in letters, “untrue and disparaging accusations which impugned the character of Dr. Mann”. It said that Mann had “graciously accepted our apology and retraction”.[63] This did not settle Mann’s claims against Ball, who remained a defendant.[64] On March 21, 2019, Ball applied to the court to dismiss the action for delay; this request was granted at a hearing on August 22, 2019, and court costs were awarded to Ball. The actual defamation claims were not judged, but instead the case was dismissed due to delay, for which Mann and his legal team were held responsible”
Yes, Mann brought the case; on the other hand, it’s also correct that the case was dismissed when he didn’t produce his code, 9 years after the case started. The step that caused the eventual dismissal of the case was that Mann applied for an adjournment, and the defendants agreed on the condition that he supplied his code. Mann didn’t do that by the deadline specified, and the case was then dismissed for delay. Mann did say he would appeal.
The take-home point is that even though Dr. Mann sued for defamation, he incongruously refused to provide evidence that the supposed defamation was actually false, something he could easily have done.
If I were publicly defamed as a liar, I would wish for my name to be cleared immediately, and the falsehood shown definitively to be untrue. Dr. Mann stonewalled for more than nine years, refusing to provide the evidence which supposedly should have cleared his good name, which suggests that he was using the legal process as a weapon, rather than trying to purge a slur on his character.
It was worse than that. Dr. Mann brought the libel lawsuit against Dr. Timothy Ball, a retiree. Dr. Ball made a truth defence, which is acceptable in Canadian common law, and requested that the plaintiff, Dr. Mann, provide the code and data on which he based his conclusions. Dr. Mann stalled for a decade until, at Dr. Ball’s request to expedite the case due to his age and ill health, the judge threw out the suit.
Tl;dr: Dr. Mann sued for libel, but refused to provide evidence that the supposed libel was, in fact, false. It appears he was hoping Dr. Ball would run out of money and fold.
Not really, they aren’t. But they are indeed garbage. For example you may download the code for GISS GCM ModelE from here: https://www.giss.nasa.gov/tools/modelE/
No. Quite the opposite. This has bitten the climatologists in the butt with the so-called “climategate”. Congressional enquiries showed that their integrity was intact and that their methods were sound and followed standard scientific practice. But they lacked transparency, and therefore it was recommended that they should from now on make public all their numerical code and all their data. This has become widespread practice in climatology.
In fact there is a guide of practice for climatologists:
https://library.wmo.int/doc_num.php?explnum_id=5541
The so-called ‘hockey-team’ were not cleared by the series of inquiries following the release of the ‘climategate’ emails. In fact, the inquiries seemed designed to avoid the serious issues raised by the email dump.
https://www.rossmckitrick.com/uploads/4/8/0/8/4808045/climategate.10yearsafter.pdf
It raises the questions (a) what other academic models that have driven public policy have such bad quality?, and (b) do the climate models suffer in the same way, also making them untrustworthy?
Similar skeptical attention should be paid to the credibility automatically granted to economic model projections – even for decades ahead. Economic estimates are routinely treated as facts by the biggest U.S. newspaper and TV networks, particularly if the estimates are (1) from the Federal Reserve or Congressional Budget Office, and (2) useful as a lobbying tool to some politically-influential interest group.
Academics are paid peanuts in the UK. It’s not the US with their six-figure salaries. You need to teach 8+ hours, do your administrivia, and perhaps you’ll squeeze in a couple of hours for research at the end (or beginning) of a very long day. Nothing like Google, with its 500K salaries and its code reviews. Sure, non-determinism sucks, but if the orders of magnitude of the results fit expectations from other models, it’s good enough to compete with other papers in the field. Want to change that? Fund intelligent people in academia the way you fund lawyers and bankers. Oh, and managers in private industry will change results if it suits them, so “privatise it” is bollocks.
The problem does not lie with non-determinism in the model, but with the wild divergence of its output.
Just wonderful, and sadly utterly devastating. As an IT bod myself and an early-days sceptic, this was such a pleasure to read. Well done.
Thanks for doing the analysis. Totally agree that leaving this kind of job to amateur academics is completely nonsensical. I like your suggestion of using the insurance industry, and if I were PM I would take that up immediately.
Scientists provide the science; insurers provide insurance. I would never go to an academic for insurance, and there is an obvious conflict of interest in relying on an insurance company: it has a fiduciary responsibility to shareholders, and policy making should be entirely separate from the commercial interests of providing health insurance.

The purpose of academia, besides providing education, is to pursue R&D in a non-commercial environment where all IP and research products (i.e. papers and code) are disclosed to the public. Unfortunately, the insurance industry does not work to the same open standard. The industry is plagued by grotesque profiteering and opaque modelling practices – there are few universal standards for modelling. Try getting an insurance company to fully disclose details of its mortality models and provide beautifully curated source code for everyone to reproduce the decisions it makes when reviewing claims. You will not find one insurance company’s code in the public domain that is representative of production. My experience has been that the insurance industry is, on the whole, exactly the opposite of what you are proposing – no transparency, and clearly designed to profit on the misfortunes of others.

Granted, academia has its flaws and has fallen victim to the jaws of capitalism, but it operates first and foremost in the interests of widening the public body of knowledge. You earn a voice by publishing scientific papers in peer-reviewed journals, and in some domains the results have to be scientifically reproducible and are quickly discredited if they aren’t. You also can’t separate academics from industry practitioners, as many move back and forth between the two – you’ll find that in the insurance industry as well. Ironically, many of the models and much of the math used in the insurance industry were developed by “amateurish academics”.
I am not a big fan of the insurance business, but to be objective:
-Actuarial models in the insurance industry are used to determine insurance pricing, not to settle claims. Claims are based on evidence.
-The insurance business is not designed to profit on the misfortunes of others; a perfect insurance business model outcome would be that there were NO misfortunes. One must also remember that the overwhelming desired outcome of purchasers of insurance is that it not be required to make claims.
-Academic science has not fallen victim to capitalism; it has fallen victim to bureaucracy and conformity. If you do not conform by espousing expected and required outcomes, you are labelled a pariah, demonised and excluded. Evidence contradicting official policy is suppressed, falsified, or rationalised away.
But see Thomas Kuhn’s ‘The Structure of Scientific Revolutions’ which touched on the herd mentality of structured organizations and eventual paradigm shifts. In the example of these pandemic modelling disasters, the paradigm shift would be to exclude modelling as an influence on government policy, and the manias that can result.
-And finally, in science there is usually no accountability, liability, or consequences except temporary. In this most recent marriage of political power and ‘modelling’ catastrophe, the solution has been to just come up with yet another model and to rationalise whatever policy implemented as having been necessary; politicians will rarely if ever admit error of a policy course no matter what the cost, whether lives or money.
Look at SetupModel.cpp from line 2060 – pages of nested conditionals and loops with nary a comment. Nightmare!
The best part is there’s all this commented-out code. Was it commented out by accident? Was there a reason for it being there to begin with? Who knows – it’s a mystery.
Haven’t time to read the article and stopped at the portion where the results can’t be replicated. That right there is a huuuuuuge red flag and makes the “models” useless. I’ll come back tonight to finish reading. I have to ask: is this the same with the University of Washington IHME models? Why do I have a sneaking suspicion that it is.
The IHME ‘model’ is much worse – it’s just a simple exercise in curve fitting, with little or no actual modelling happening at all. I have collected screenshots of its predictions (for the US, UK, Italy, Spain and Sweden) every few days over the last few weeks so I could track them against reality, and it is completely useless. But, according to what I’ve read, the US government trusts it!
Until a few days ago, its curves didn’t even look plausible – for countries on a downward trend (e.g. Italy and Spain), they showed the numbers falling off a cliff and going down to almost zero within days, and for countries still on an upward trend (e.g. the UK and Sweden) they were very pessimistic. However, the figures for the US were strangely optimistic – maybe that’s why the White House liked them.
They seem to have changed their model in the last few days – the curves look more plausible now. However, plausible looking curves mean nothing – any one of us could take the existing data (up to today) and ‘extrapolate’ a curve into the future. So plausibility means nothing – it’s just making stuff up based on pseudo-science. In the UK, we’re not supposed to dissent, because that implies that we don’t want to ‘save lives’ or ‘protect the NHS’, so the pessimistic model wins. In the US, it’s different, depending on people’s politics, so I’m not going to try to analyse that.
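The point about plausible-looking extrapolation can be made concrete. In this hypothetical sketch (the data points are invented for illustration), two different curves pass exactly through the same three early observations yet disagree by a factor of about eighteen just ten steps out:

```python
# Hypothetical illustration: two curves that agree exactly on the early data
# points diverge wildly when extrapolated, which is why curve fitting alone
# is a weak basis for forecasting.
def exponential(t):
    return 2 ** t                      # doubles at every step

def quadratic(t):
    return 0.5 * t * t + 0.5 * t + 1   # fitted through the same three points

early_data = [0, 1, 2]
assert all(exponential(t) == quadratic(t) for t in early_data)  # identical fit

print(exponential(10), quadratic(10))  # 1024 vs 56.0: same past, very different future
```

With only the early points in hand, nothing in the data tells you which curve to believe – that has to come from understanding the underlying process, not the fit.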
So why do governments leap at these pseudo-models with their useless (but plausible-looking) predictions? It’s because they hate not knowing what’s going to happen, so they are willing to believe anyone with academic credentials who claims to have a crystal ball. And, if there are competing crystal balls from different academics, the government will simply pick the one that matches its philosophy best, and claim that it is ‘following the science’.
Ditto. The IHME predictions are completely silly.
They leap at them for fear of the MSM accusing them of not doing anything.
I had hoped Donald Trump would be a stronger leader than that and would have insisted on any model being independently and repeatedly verified before making any decision.
The other factor that seems entirely missing from the models is the ability of existing medicines, even off-label ones, to treat the virus; there have been many trials of hydroxychloroquine with zinc sulphate (and some also with azithromycin) that have demonstrated great success. It constantly dismays me that this is ignored, and here in the UK patients are just given paracetamol, as if they have a headache!
I offer a critical review of past and present IHME death projections here: https://www.cato.org/blog/six-models-project-drop-covid-19-deaths-states-open
Could these popularity contest winners perhaps just be idiots? Occam’s razor applies.
“It’s because they hate not knowing what’s going to happen, so they are willing to believe anyone with academic credentials who claims to have a crystal ball.”
The problem with this one is that Neil Ferguson and Imperial College have been consistently wrong.
I’m a guy working in the biz for 40+ years. Just a grunt, but paid pretty well for being a grunt. The “can’t be replicated” is insane.
The only time “can’t be replicated” is excusable is when real time is involved. If you can’t say “ready, set, go” with the same set of data and assumptions plugged in, you have some serious issues going on.
“But we have to multi-thread on multiple CPU cores or we won’t get results fast enough.” OK, then you’ve got bogus results.
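The multi-threading point rests on a real property of floating-point arithmetic: addition is not associative, so if the order in which partial sums are combined varies from run to run (as it can with naive parallel reductions), identical inputs can yield different outputs. A minimal single-threaded illustration:

```python
# Floating-point addition is not associative. A multi-threaded sum whose
# reduction order varies between runs can therefore give different results
# on identical inputs. A minimal single-threaded illustration:
a, b, c = 1e16, -1e16, 1.0
left = (a + b) + c    # 0.0 + 1.0 -> 1.0
right = a + (b + c)   # c is absorbed into the large-magnitude term -> 0.0
print(left == right)  # False
```

Parallel code can still be made reproducible – for example by fixing the reduction order or partitioning work deterministically – but that takes deliberate engineering effort.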
This is scary stuff. I’ve been a professional developer and researcher in the finance sector for 12 years; my background is a physics PhD. I have seen this sort of single-file code structure a lot, and it is a minefield for bugs. This can be mitigated to some extent by regression tests, but those are only as good as the number of test scenarios that have been written. Randomness cannot just be dismissed like this. It is difficult to nail down non-determinism, but it can be done, and it requires the developer to adopt some standard practices to lock down the computation path. It sounds like the team have lost control of their codebase and have their heads in the sand. I wouldn’t invest money in a fund that was so shoddily run. The fact that the future of the country depends on such code is a scandal.
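One concrete example of the kind of “standard practice” involved in locking down the computation path, offered as a generic illustration (not something taken from the Imperial code): floating-point addition is not associative, so a parallel reduction that sums partial results in whatever order threads happen to finish can change the output from run to run.

```python
import math
import random

random.seed(0)
values = [random.uniform(-1e8, 1e8) for _ in range(100_000)]

# Floating-point addition is not associative: summing the same numbers in a
# different order (as a thread pool might) can change the low-order bits.
forward = sum(values)
backward = sum(reversed(values))
print(forward - backward)  # typically a small nonzero difference

# Locking down the computation path: math.fsum is exactly rounded, so the
# result no longer depends on summation order.
assert math.fsum(values) == math.fsum(reversed(values))
```

Fixing a canonical reduction order (or using an exactly rounded sum) is the kind of discipline that makes a numerical code bit-for-bit reproducible.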
‘Software volatility’ is the expression, Robin, and it is always bad.
I have not looked at Neil Ferguson’s model and I’m not interested in doing so. Ferguson has not influenced my thinking in any way and I have reached my own conclusions, on my own. I made my own calculation at the end of January, estimating the likely mortality rate of this virus. I’m not going to tell you the number, but suffice to say that I decided to fly to a different country, stock up on food, and lock myself up so I had no contact with anybody, right at the beginning of February, when nobody else was stocking up yet, nobody else was locking themselves up, and people thought it was all a bit strange. When I flew to my isolation location, I wore a mask, and everyone thought it was a bit strange. Draw your own conclusions.
I’ve read this review.
Firstly, I’ll stress this again, I’m not going to defend Ferguson’s model. I have not seen it. I don’t know what it’s like. I don’t know if it’s any good.
I don’t share Ferguson’s politics, even less so those of his girlfriend.
His estimate of the number that would likely die if we took no public health measures is, IMO, not an over-estimate. There are EU countries which have conducted tests of large random, unbiased samples of their population to estimate what percentage of their population has had the virus. The number, in the case of those countries, comes out at 2%-3%. If the same is true of the UK, then 30,000 deaths would translate to 1 million deaths if the virus infected everybody. Of course, we don’t know if the same is true of the UK.
But now I am going to criticize this criticism of Ferguson’s model, because it deserves criticism.
I’ve been writing software for 41 years. Including modeling and simulation software. I wrote my first stochastic Monte Carlo simulator 37 years ago. I have written millions of lines of code in dozens of different programming languages. I have designed many mathematical models, including stochastic ones.
Ferguson’s code is 30 years old. This review criticizes it as though it was written today, but many of these criticisms are simply not valid when applied to code that’s 30 years old. It was normal to write code that way 30 years ago. Monolithic code was much more common, especially for programs that were not meant to produce reusable components. Disk space, RAM and CPU speeds were all constraints that made it impractical to structure code to the same extent it is today. Yes, structured programming was known; yes, software libraries were used; but programs like simulation software generally consisted of at most a handful of different source files.
30 years ago, there was no multi-threading, so it was reasonable to write programs on the assumption that they were going to run on a single-threaded CPU. With few exceptions, like people working on Transputers, nobody had access to a multi-threaded computer. I can’t say what is making his code not thread safe, but not being thread safe does not necessarily imply bad coding style, or bad code. There are many functions even in the standard C library which are not thread safe, and some that come in two flavours – thread safe and not thread safe. The thread safe version normally has more overhead and it is less efficient. Today, this may make no difference, but 30 years ago, that mattered. A lot. Writing code which was not thread safe, if you were optimizing for speed, may have made perfect sense.
While not documenting your programs was not great practice even back then, it was also very common, especially for programs which were initially designed for a very specific application, and were not meant to be reused in other projects or libraries. There is nothing particularly unusual about this.
It’s perfectly normal not to want to disclose 30 year old code because, as has been proven by this very review, people will look at it and criticize it as if it was modern code.
So Ferguson evidently rewrote his program to be more consistent with modern coding standards before releasing it. And probably introduced a couple of bugs in the process. Given the fact that the original code was undocumented, old, and that he was under time pressure to produce it in a hurry, it would have been strange if this didn’t introduce some bugs. This does not, per se, invalidate the model. Your review does not give any reason to think these bugs existed in the original code or that they were material.
The review criticizes the code because the model used is stochastic. Which means random, the review goes on to explain. Random – surely this must be bad! But stochastic models and Monte Carlo simulation are absolutely standard techniques. They are used by financial institutions, they were used 30 years ago for multi-dimensional numerical integration, they are used everywhere. The very nature of the system being modeled is fundamentally and intrinsically stochastic. Are you saying you have a model which can predict, with certainty, how many dead people there will be 2 weeks from now? No, of course you don’t. This depends on so many variables, most of which are random, and so they have to be modeled as being random. From the way you describe the model (SimCity-like), it sounds like it models individual actors, so ipso facto it has to be stochastic. How else do you model the actions of many independent individual human actors?
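For readers unfamiliar with the technique being defended here, a minimal Monte Carlo sketch (illustrative only): estimate π by random sampling. The randomness is intrinsic to the method, yet with a fixed seed every run is identical.

```python
import random

def estimate_pi(n, seed):
    """Monte Carlo: fraction of uniform random points inside the unit circle."""
    rng = random.Random(seed)
    inside = sum(rng.random() ** 2 + rng.random() ** 2 <= 1.0 for _ in range(n))
    return 4 * inside / n

# Stochastic but reproducible: same seed, same estimate, and the error
# shrinks roughly as 1/sqrt(n).
print(estimate_pi(100_000, seed=1))  # close to 3.1416
assert estimate_pi(100_000, seed=1) == estimate_pi(100_000, seed=1)
```

“Stochastic” and “reproducible” are therefore not in tension: the controversy in the review is about variation that persists even when the seed is fixed.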
I don’t know the author or anything about her background. But it doesn’t sound to me like she was writing software or making mathematical models 30 years ago, or she wouldn’t be making many of the statements she is making.
Reviewing Ferguson’s model in depth is certainly something that someone ought to do. But a serious review would understand what the (stochastic) model does, explain what it does, and assess the model on its merits. I have no idea whether the model would survive such a review well or be torn to shreds by it. But this review just scratches the surface, and criticizes Ferguson’s software in very superficial ways, largely completely unwarranted. It does not even present the substance of the model.
I read the author’s discussion of the single-thread/multi-thread issue not so much as a criticism but as a rebuttal to possible counter-arguments. I agree it probably should have been left out (or relegated to a footnote), but the rest of the author’s arguments stand independently of the multi-thread issues.
I disagree with your framing of the author’s other criticisms as amounting to criticism of stochastic models. It does not appear the author has an issue with stochastic models, but rather with models where it is impossible to determine whether the variation in outputs is a product of intended pseudo-randomness or whether the variation is a product of unintended variability in the underlying process.
According to GitHub, the reproducibility bugs mentioned have been corrected by either the Microsoft team or John Carmack, and the software is now fully reproducible. Surely they checked what results the software gave before and after the correction, and they must have found they were the same.
The question is: have the bugs led to incorrect simulations? I can’t say, but realistically it’s very unlikely. As scientists, Neil Ferguson and his team are trained to spot errors like that, and the fact that they commented on these bugs is evidence enough that they knew about them.
Is it poor software practice? Absolutely.
Should scientists systematically open-source their code and data? I think so, and I deplore the fact that it’s still not standard practice (except in climatology).
Are the simulations flawed and is it bad science? You certainly cannot conclude anything even close to that from such a shallow code review.
dr_t,
I am also a Software Engineer, with over 35 years of experience, so I understand what you are saying about 30-year-old code. However, if the software is not fit for purpose because it is riddled with bugs, then it should not be used for making policy decisions. And frankly, I don’t care how old the code is: if it is poorly written and poorly documented, it should be thrown out and rewritten; otherwise it is useless.
As a side note, I currently work on a code base that is pure C and close to 30 years old. It is properly composed of manageable-sized units and reasonably organized. It also has up-to-date function specifications and decent regression tests. When this was written, these were probably cutting-edge ideas, but they clearly weren’t unknown. Since then we’ve upgraded to current compilers, source code repositories, and critical peer review of all changes.
So there really is no excuse for using software models that are so deficient. The problem is these academics are ignorant of professional standards in software development and frankly don’t care. I’ve worked with a few over the course of my career and that has been my experience every time.
I agree 100%. I wrote C/C++ code for years, and this single-file atrocity reminds me of student code.
The fact it wasn’t refactored in 30 years is a sin plain and simple.
That’s human nature. I work as an S.E. in financial services. No real degree. Been doing it for 40 years; it pays well, and I can probably work into my 70s if I want. I just got a little project to build an Access database (MDB file) via a small program, for a vendor that our clients love and trust. What the…? Microsoft never cancelled it but hasn’t promoted it in at least 15 years. I also get projects based on COBOL specs.
That tells me that people are kicking the can down the road because “It still runs. It’ll be fine”. And, they hope they are retired when it’s not fine.
Moreover, this was likely the ‘code’ used for his swine flu predictions – which performed magnificently.
I was coding on a large multi-language and multi-machine project 40 years ago. This was before Jackson Structured Programming, but we were still required to document, to modularise, and to perform regression testing as well as testing for new functionality. These were not new ideas when this model was originally created.
The point of key importance is that code must be useful to the user. This is normally ensured by managers providing feedback from the business and specifying user requirements in better detail as the product develops. And this stage was, of course, missing here.
Instead we had the politicians deferring to the ‘scientists’, who were trying out a predictive model untested against real life. That seems to have worked out about as well as if you had sacked the sales team of a company and let the IT manager run sales simulations on his own according to a theory which had been developed by his mates…
> untested against real life.
And _untestable_? There is no mention in the review of how many parameter values need to be fixed to produce a run. With more than 6-10, I cannot imagine a search for best-fit parameters [to past data] yielding values that are stable over time.
All I know is that my son is the same as Ferguson: a physics PhD, but now a commercial machine-learning data scientist. However, he has spent five years out of academia learning the additional software skills required, passing all the AWS certifications, etc. Ferguson didn’t.
Yes, I was coding 30 years ago and we wrote modular, commented code using SCCS for version control.
And I know a juggler who can juggle 7 balls while rubbing his belly. He is a juggler, you may be a software developer and Ferguson is an epidemiological modeler. How good are your epidemiological modelling and ball juggling skills?
Working as an Analyst/Programmer together with a Metallurgist and a Production Engineer, I designed and programmed a Production Scheduling system, derived from their expertise.
This was some 35 years ago. Documentation of the system was provided in the terminology of the experts, with links to the documentation of the code – and vice versa.
So, no, I would not claim to have been able to juggle with their 7 balls, but, equally, they could not juggle with mine.
How wrong you will be proved to be. Testing is already indicating that huge numbers of the global population have already caught it. The virus has been in Europe since December at the latest, and as more information comes to light, that date will likely be moved significantly backwards. If the R0 is to be believed, the natural peak would have been hit, with or without lockdown, in March or April. That is what we have seen.
This virus will be proven to be less deadly than a bad strain of influenza, with or without a vaccinated population. Total deaths only peaked post-lockdown. That is not a coincidence.
@Robbo Why is it not a coincidence? I am not sure what to think about this virus: you say it will proven to be like a bad strain of influenza, but I work in a hospital and our clinical staff are saying they have never seen anything like it in terms of number of deaths.
The empty hospitals full of TikTok stars?
I would not be surprised at a large number of initial deaths with a new disease when the medical staff have no protocol for dealing with it. In fact, I understand that their treatment was sub-optimal and could have made things worse.
When we have a treatment for it we will see how dangerous it is compared to flu. Which can certainly kill you if not treated properly…
https://vk.ovg.ox.ac.uk/vk/influenza-flu
“In the UK it is estimated that an average of 600 people a year die from complications of flu. In some years it is estimated that this can rise to over 10,000 deaths (see for example this UK study from 2013, which estimated over 13,000 deaths resulting from flu in 2008-09).”
This thing has already killed 30,000 in NHS hospitals and probably another 15,000 who died at home and in care homes – 45,000 in total. The numbers are only this low because of the draconian lockdown measures.
This is in the space of the first 2 months, and we are nowhere near the saturation point yet. Those countries in the EU which have conducted randomized antibody testing trials have determined that 2%-3% of their populations have been infected to-date.
The Spanish flu killed 220,000 in the UK over a period of 3 years between 1918 and 1920.
We may not know exactly how dangerous this thing is, but we already know that it is nothing like the flu and a heck of a lot more deadly.
How many of these deaths were FROM Covid-19, and how are they written on the death certificates? How is it that people who die of a disease other than Covid-19 are counted as Covid-19 deaths merely because they were infected with it? As we know there are asymptomatic carriers, so there MUST be deaths where the person had Covid but it was not a factor, yet it was still put on the death certificate. The number of deaths attributed to Covid-19 has been over-inflated. Never mind that the test is for a general coronavirus and not specific to Covid-19.
Dr T, do you have references to the randomized antibody studies that show 2-3% spread? Some of the studies I’ve seen for the EU indicate higher (e.g., the Gangelt study).
How many of these clinical staff were working during the 1957 pandemic? Probably …. none. It was worse on an absolute and per-capita basis than what we’re seeing now.
The 1957 flu pandemic killed 70,000 in the USA in the space of more than a year. SARS-2 has killed nearly 80,000 in the USA in the first 2 months. I cannot find reliable numbers for the deaths in the 1957 pandemic in the UK, but all the partial numbers I can find are a lot lower than the current number of SARS-2 deaths in the UK in the first 2 months (30,000 in the NHS + 15,000 at home and in care homes = 45,000). As with all epidemics, growth in the absence of mitigation measures is exponential until saturation is reached (we are very far from that point), so most of the deaths occur at the peak, and what you see even a week before the peak is a drop in the ocean. I think you need to check the facts before making such claims.
The US didn’t have anywhere near 80K deaths in the first two months. Where are you getting these numbers? And what date are you placing the first US COVID-19 death?
The CDC website shows a total of just under 49K as of May 11:
https://www.cdc.gov/nchs/nvss/vsrr/covid19/index.htm
Brilliant comment. This model assumes first infections at least two months too late. The unsuppressed peak was supposed to be mid May (the ‘terrifying’ graph) so what we have seen in April is likely the real peak and lockdown has had no impact on the virus. Lockdown will have killed far more people. Elderly see no point in living in lockdown. Anecdotal reports that people in care homes have just stopped eating.
Nope. Base rate. Look outside your own wee land.
Spain is a better example.
Just model Spain with a simple statistical model and you see the lockdown impact.
It’s easy, you can do it in an afternoon.
> If the R0 is to be believed, the natural peak would have been hit, with or without lockdown, in March or April. That is what we have seen.
That’s what we’ve seen WITH lockdown. We haven’t tried a no-lockdown scenario, so we don’t know in practice when that would have peaked.
> This virus will be proven to be less deadly than a bad strain of influenza
Flu kills around 30,000/year in the US, mostly over a five-month period. Covid-19 has killed 70,000 in about six weeks, despite the lockdown.
@Frank,
“That’s what we’ve seen WITH lockdown. We haven’t tried a no-lockdown scenario, so we don’t know in practice when that would have peaked”.
Incorrect.
Peak deaths in NHS hospitals in England were 874 on 08/04. A week earlier, on 01/04, there were 607 deaths. Crude Rt = 874/607 = 1.4. On average, a patient dying on 08/04 would have been infected c. 17 days earlier on 22/03. So, by 22/03 (before the full lockdown), Rt was (only) approx 1.4.
Ok, so that doesn’t tell us too much, but if we repeat the calculation and go back a further week to 15/03, Rt was approx 2.3. Another week back to 08/03 and it was approximately 4.0.
Propagating forward a week from 22/03, Rt then fell to 0.8 on 29/03.
So you can see that Rt fell from 4.0 to 1.4 over the two weeks preceding the full lockdown and then from 1.4 to 0.8 over the following week, pretty much following the same trend regardless.
So, using the data we can see that we could have predicted the peak before the lockdown occurred, simply using the trend of Rt.
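The crude calculation above is just ratios of weekly death counts, shifted back ~17 days to the infection date. A sketch of it in Python: the 607 (01/04) and 874 (08/04) figures are the ones quoted above, while the other weekly totals are illustrative values back-solved to match the quoted Rt trend, not official data.

```python
# Weekly NHS England hospital deaths. 607 (01/04) and 874 (08/04) are the
# figures quoted above; the rest are illustrative back-solved values chosen
# to match the quoted crude Rt trend, not official statistics.
weekly_deaths = [
    ("18/03", 66),    # illustrative
    ("25/03", 264),   # illustrative
    ("01/04", 607),   # quoted
    ("08/04", 874),   # quoted
    ("15/04", 699),   # illustrative
]

for (d1, n1), (d2, n2) in zip(weekly_deaths, weekly_deaths[1:]):
    # Attribute the week-on-week ratio to infections ~17 days before the
    # later death date.
    print(f"crude Rt for infections ~17 days before {d2}: {n2 / n1:.1f}")
```

Run on these numbers, the ratios come out at 4.0, 2.3, 1.4 and 0.8, matching the trend described above.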
In my hypothesis, this was a consequence of limited social distancing (but not full lockdown) and the virus beginning to burn itself out naturally, with very large numbers of asymptomatic infections and a degree of prior immunity.
Peak excess all-cause mortality was last week – yes, the last week in April. Don’t just look at reported COVID-19 hospital deaths, and don’t just focus on one model.
How do you know that? ONS stats have only just been published for w/e 24th April and they were down a bit on the week before?
Epidemic curves are flat or down in so many countries with such different mitigation policies that it’s hard to say this policy or that made a big difference, aside from two – banning all international travel by ship or airplane, and stopping mass-transit commuting. No U.S. state could or did do either, but island states like New Zealand could and did both. In the U.S., state policies range from doing everything (except banning travel and transit) to doing almost nothing (9 low-density Republican states, like Utah and the Dakotas). But again, Rt is at or below 1 in almost all U.S. states, meaning the curve is flat or down. Policymakers hope to take credit for something that happened regardless of their harsh or gentle “mitigation” efforts, but it looks like something else – such as more sunshine and humidity, or the virus just weakening for unknown reasons (as SARS-1 did in the U.S. by May). https://rt.live/
I started distancing myself before the end of January when I was abroad on holiday in Tenerife with no known cases. But someone coughing next to me? I reacted. Also kept away from vulnerable people on my return home. Surely others behaved likewise long before lockdown?
I also did the same at the end of January / early February.
Frank, the peak Flu season is December through February, which is about the same amount of time that we’ve officially been recording deaths in the U.S. from the SARS-CoV-2 pathogen (February through April). Likewise, regarding a lockdown vs. no lockdown scenario comparison, that is also offset by the vaccine vs. no vaccine aspect of these two pathogens.
Please keep in mind that we’ve had numerous Flu seasons where between 60,000 to more than 100,000 Americans have passed away due to it, all despite a solid vaccination program.
“Flu season deaths top 80,000 last year, CDC says”
By Susan Scutti, CNN
Updated 1645 GMT (0045 HKT) September 27, 2018
https://edition.cnn.com/2018/09/26/health/flu-deaths-2017–2018-cdc-bn/index.html
Yes, but the manner in which they count COVID-19 deaths is flawed. Even with co-morbidities they ascribe deaths to COVID, and in cases where they do not test but there were COVID-like symptoms, they ascribe those to COVID too, according to the CDC.
Most governments are busily fudging the numbers up, to ex-post “justify” the extreme and massively damaging actions they imposed on communities and to gain financial benefit (e.g. states and hospitals which get larger payouts for Wuhan virus treatment than for treatment for other diseases).
As with “global warming”, the politicians, bureaucrats and academics are circling the wagons together to protect their interlinked interests.
You are confusing deaths ASSOCIATED with Covid with EXCESS deaths resulting from flu. If all those who died of pneumonia, cancer, heart disease were routinely tested for flu we’d find that hundreds of thousands die WITH flu every year, though not as a direct result of it.
Right, but you’re comparing apples to oranges. Compare Covid-19 to other pandemics, like 1917, 1957, or 1968.
Maybe it is not “despite” but “because of”?
If you start the lockdown as late as March, then you ensure that infection and death rates are going to be higher, because of high viral dosage and the fragile immune systems that come from lockdown.
There are plenty of countries without lockdown to compare against. So it is not an unverifiable hypothesis.
“The virus has been in Europe since December at the latest” https://www.sciencedirect.com/science/article/pii/S1567134820301829?via%3Dihub
Oh yes. The model is all rather irrelevant now as we catch up on burying the dead.
In point of fact, a ten-line logistic model does as good a job.
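A “ten-line logistic model” in the commenter’s sense might look like the sketch below. The parameters are illustrative only, not fitted to any real data.

```python
import math

def logistic(t, K, r, t0):
    """Cumulative epidemic total at day t: final size K, growth rate r, midpoint t0."""
    return K / (1 + math.exp(-r * (t - t0)))

# Illustrative parameters only (not fitted): final size 40,000 deaths,
# midpoint at day 45. The curve is slow at first, steepest at the midpoint,
# and saturates at K.
for day in (0, 30, 45, 60, 90):
    print(day, round(logistic(day, K=40_000, r=0.15, t0=45)))
```

Fitting K, r and t0 to an observed cumulative death series is a standard least-squares exercise; the point is simply that a three-parameter curve is far easier to scrutinise than a 15,000-line simulator.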
Still, academic coding is usually a disaster. I went back to grad school in my late thirties after twenty years of software development. I should have brought a bigger stick.
“I should have brought a bigger stick”.
A PART, maybe? “Professor Attitude Realignment Tool”
SARS-CoV-1 (SARS) and SARS-CoV-2 (Covid-19) both bind to the same receptor/enzyme (ACE2), causing increased angiotensin II (as it is no longer converted to angiotensin 1-7, due to the reduction in ACE2 receptors/enzyme) and a cascade of pathological effects from that: pneumonia, ARDS, hypoxia, local immune response, cytokine storm, inflammation, blood clots. SARS has a mortality rate of 10%; why would Covid-19 be on a par with flu and not higher, given it has the same/similar pathology as SARS?
I have not seen the model and don’t intend to. One thing that rang alarm bells with me was the statement that R0 was an input into its calculation, making it a feedback system. These types of dynamical systems are known to exhibit truly chaotic behaviour. Even when not operating in those chaotic regions, the numerical methods must be chosen carefully so that they do not themselves introduce artificial, method-induced pseudo-non-deterministic behaviour (small differences in the initial conditions, or bugs such as the use of uninitialised variables).
The modellers argument would be that life is chaotic, and introducing a virus to two separate but identical towns could indeed result in very different outcomes.
Which makes me wonder about the validity of modelling chaotic systems at all…
I think the practice could make sense. The input R0 might describe how communicable the disease is without countermeasures, while the output R0 is the resulting communicability with the countermeasures being modelled. Nowhere in the article does it actually say the output is used as the following run’s input, and while I agree that would be illogical and give huge swings in outputs (perhaps diverging to infinity!), there’s no sign that’s being done. Is one of the top five critiques we can make of this code that, if used in a manner it’s not being used, its output would go crazy?
The value of your comprehensive reply was completely invalidated when you declined to provide your own calculations!
Yeah you’ve written millions of lines of code in dozens of languages, but didn’t read the review carefully. There’s a difference between randomness you introduce, which you can reproduce with the correct seed, and bugs which give you random results. You can’t just say, ‘oh it’s stochastic’, no, it’s bug ridden. They don’t understand the behaviour of their own model.
Saying it’s crappy because it’s 30 years old is nonsense. You can’t then use your crappy, bug ridden code to influence policies which have shut the economy down.
Unix is 50 years old. And IBM mainframe operating systems even older. And CICS…
Software on which the world runs every second of the year.
Oh for heaven’s sake. Have you read the Linux kernel? It only even begins to work because people live and breathe it. It wouldn’t pass structured programming 101. Linux specifically discourages comments.
And how many systems are running Unix (as opposed to Linux) nowadays ?
The review is a code review, not a review of the mathematical model, so I don’t see that one would expect it to present the substance of the model in any detail.
” There are EU countries which have conducted tests of large random, unbiased, samples of their population to estimate what percentage of their population has had the virus. The number – in case of those countries – comes out at 2%-3%. If the same is true of the UK, then 30,000 deaths would translate to 1 million deaths if the virus infected everybody. ”
Antibody tests indicate those people who have been sufficiently susceptible to the virus for their innate immune systems and existing T-cells to be unable to defeat the SARS-COV-2 virus, resulting in their slower-responding adaptive immune systems generating antibodies against the virus. But there are potentially a much larger number of people whose innate immune systems and/or existing T-cells are able to defeat this virus, and have done so in many cases, without generation of a detectable quantity of SARS-COV-2 specific antibodies. That seems the most likely explanation for why the epidemic is waning in Sweden, indicating a Reproduction number below 1, contrary to even the 2.5% probability lower bound of 1.5 for the Reproduction number there estimated by the Imperial College team using their model (https://www.imperial.ac.uk/mrc-global-infectious-disease-analysis/covid-19/report-13-europe-npi-impact/).
It seems to me that most of your comments are excuses for practices that were poor at the time, let alone now. Most of them simply reinforce the view that the code should have been ditched and rewritten top to bottom years ago as being no longer fit for purpose, if it ever was. Opportunities or signals to do so: move from single to multi-thread machines; publication of new/revised libraries with different flavours; discovery of absence of comments (!); discovery that same input does not yield same output (when it’s intended to); etc
Incidentally, “… [no] reason to think these bugs existed in the original code or that they were material.” which is precisely why we need to see the *actual* code that produced the key reports leading to the trashing of our economy and the lockdown with its consequential deaths.
Personally, I don’t think programmers necessarily criticise old code so long as it does what it claims to do. They may not like or understand the style but they can accept that it works. But here’s the thing: if it doesn’t do what it claims, then the gloves are off and they will come gunning not only for the errors but the evident development mistakes that led to and compounded them.
Ferguson said his code was written 13 years ago, not 30. Even so, 30 years ago undocumented code was still bad practice even if that’s how some programmers worked. Unless Ferguson can provide evidence that his original code underwent stringent testing then there’s little reason to trust it. But if it was tested properly the question still remains whether the model it implements is a reliable reflection of what would happen in reality.
Question what his past predictions for BSE, swine flu and avian flu were, compared to reality.
Hint: his predictions were worse than asking Mystic Meg.
The code was written 13 years ago, not 30.
“It was a different time” is no basis for a defence, and your comments are a defence. They either thought their code worked or they didn’t. This shows that they didn’t. That’s all that matters. As for your fear: that’s yours to deal with. Sounds like you’ve got issues to me.
Sorry, but this is an absurd criticism. We have all seen old legacy code that needs refactoring and modernization. Anything that is mission critical for a business, in medicine, in aviation, etc., will often have far more testing and scrutiny applied to it than the actual act of writing the code because either huge amounts of money are at stake, or even more importantly, lives are at stake. For this kind of modeling to be taken seriously, a serious effort should have been made to EARN credibility.
There is simply no excuse for Ferguson, his team, and Imperial College for peddling such garbage. I COMPLETELY agree with the author here that “all academic epidemiology be defunded.” There are far superior organizations that can do this work. And even better, those organizations will generate predictions that will be questioned by others because they are not hiding behind the faux credibility of academia.
dr_t,
Linux is nearly 30 years old. What’s your point again?
And Linux – although legally unencumbered – is essentially a Unix-like operating system. And Unix dates back to 1970.
I started work as a trainee programmer for a commercial company in 1971. The first thing I learned was ‘comment everywhere, document everything’.
And I doubt whether there is much/any original code there now.
Given the lead-in remarks here, I wonder if this commenter is just trolling us.
Given the fantastical view of software development 30 years ago, I wonder if he really knows that much about software development. Comment-free code? 15,000-line single-source files? GMAB! Kernighan and Plauger were complaining about standard Pascal’s lack of separate compilation 40 years ago when they rewrote “Software Tools” as “Software Tools in Pascal”, stating that while the language might be better for teaching, that lack made it worse than even Fortran for large-scale programming projects.
I have a PhD in biochemistry and currently do academic research in systems biology. I have about 20 years coding experience. This kind of approach to statistical analysis is very familiar. I concur with dr_t.
The stochasticity is a feature, not a bug; it is used to empirically estimate uncertainty (i.e. error bars). The model *should* be run many times, and taking the mean and variance of the outputs is exactly the correct approach. Highlighting the difference between two individual runs of a stochastic model is only outdone in incorrectness by highlighting a single run.
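The workflow described here can be sketched in a few lines (a toy illustration only, not the Imperial model; `simulate_deaths` is an invented stand-in for one stochastic run):

```python
import random
import statistics

def simulate_deaths(rng):
    # Toy stand-in for one stochastic model run: each of 1000
    # infections independently results in death with p = 0.01.
    return sum(1 for _ in range(1000) if rng.random() < 0.01)

# Run the model many times with explicit, distinct seeds, then
# report the mean and an empirical uncertainty (the error bars).
runs = [simulate_deaths(random.Random(seed)) for seed in range(200)]
mean = statistics.mean(runs)
stdev = statistics.stdev(runs)
print(f"mean deaths = {mean:.1f} +/- {stdev:.1f}")
```

Comparing any two entries of `runs` in isolation tells you almost nothing; the mean and spread over all 200 are the estimate.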
You’re effectively criticizing the failure to correctly implement a run guarantee that wasn’t important in the first place. Based on your description it sounds like the same instance of an RNG is shared between multiple threads. Your RNG then becomes conditioned on the load of the cores, because any alteration in the order in which the RNG is called from individual threads changes the values used. If it’s accurate that the problem persists in a single threaded environment then it could be the result of a single call to any well-intentioned RNG that used a default seed like date/time. The consequence is only that parameter values are conditional on one random sequence rather than another random sequence. It’s irrelevant in practice.
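The shared-RNG effect described above is easy to demonstrate: with one generator feeding several model components, any change in the order the components draw numbers silently changes every component’s inputs (a minimal sketch; the component names are invented):

```python
import random

def draw_for_components(order, seed=42):
    # One shared generator for all components, as in the criticised design.
    rng = random.Random(seed)
    values = {}
    for component in order:  # e.g. thread scheduling decides this order
        values[component] = rng.random()
    return values

a = draw_for_components(["households", "schools", "offices"])
b = draw_for_components(["schools", "households", "offices"])
# Same seed, same underlying stream -- but each component now
# consumes different numbers, so the simulation diverges.
print(a["households"] == b["households"])  # False
```

The stream itself is identical in both runs; only its assignment to components moved, which is exactly why results vary with core load but the statistical properties do not.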
Whether, as commenter MFP puts it, “the variation in output is the product of ‘intended’ pseudo-randomness or the product of unintended variability in the underlying process” is irrelevant. Variability *is* randomness, so the distinction between intended and unintended randomness is meaningless. Non-randomness masquerading as randomness is the only important consideration, and such a mistake results in *less* variation in the results, not more.
The other thing to notice is that the difference between the two runs seems to be (almost) entirely a question of “onset”. That is, the curves are shifted in time.
You’d expect a model to be far more influenced by randomness “at the start” (where individual random choices can have a big effect), and so you shouldn’t be reading very much into the onset behaviour anyhow (c.f. nearly all the charts published show “deaths since 20 deaths” or similar, because the behaviour since the *first* death has a lot of random variation). If this is what’s actually happening (and it certainly looks like it to me) the people making the critique are being fairly disingenuous not to point it out.
To be clear: I don’t think the non-reproducibility (in a single thread environment) is good, and it’s a definite PITA in an academic environment, but I’m doubting it makes any substantial difference to the results. “80,000 deaths difference” looks to be massively overstating things, when more accurate would be “the peak death rate comes a week later” (with the final number of deaths the same).
And even if 80,000 was accurate, it’s only a 20% difference. There are lots of input variables we’d be ecstatic to know about to 20% accuracy (R0, IFR, etc.), so that level of uncertainty should be expected and allowed for anyhow.
There may be other more serious flaws in the model, and I wouldn’t be surprised if some fundamental assumptions are wrong that make a much bigger difference – we are in uncharted territory. But this particular one doesn’t look to be serious.
While we can debate the reviewer’s understanding of stochasticity used in this model, there doesn’t appear to be much debate about the quality of program/model itself. Put another way, it does not matter if the correct ideas were used in the attempt to create a model if the execution was so poor that the results cannot be trusted.
As an academic, I would expect you to be appalled that the program wasn’t peer reviewed. I can only hope that your omission here does not represent a tacit understanding that such practice is customary. But I suspect such hope is misplaced.
All of the modern standards (modularization, documentation, code review, unit and regression testing, etc.) are standards because they are necessary to create a trustworthy and reliable program. This is standard practice in the private sector because when their programs don’t work, the business fails. Another difference here is that when that business fails, the program either dies with it or is reconstituted in a corrected form by another business. In an academic setting, it’s far more likely that the failure will be blamed on insufficient funding, or that more research is required, or some other excuse that escapes blame being correctly applied.
I’m not going to defend coding practices as such in the academy. Just realize that modularization, documentation, code review, etc. become much more burdensome when the objective of the code is a moving target. This is how it is in a basic research environment where the how is, by definition, not known a priori. How do you plan the programming when the solution is unspecified until the very end. The solution itself is what the research scientist is after, the implementation is just a means to that end. The code is going to carry the legacy of every bad idea and dead end that was pursued during the project.
This will always be a point of friction because once the solution is found it always looks straightforward and obvious in retrospect. A professional coder can always come in after all that toil and failure and turn their nose up at all the individual suboptimal choices scattered throughout. This happens constantly; a researcher develops a novel approach that solves 99% of the unknowns and then a hotshot software engineer comes in and complains that there’s still 1% left and if s/he had written the program (now conveniently armed with all the theory that was the real product of the research) it would run ten times as fast and account for 99.1% of the uncertainty. Come on. It’s a well-known caricature in research environments.
Go ahead, review and rewrite the Ferguson group’s code. Will the program run better? Definitely, probably a lot better. Will it be easier to understand? Yes. Will the outputs be exactly the same? No. Will they differ to such an extent that the downstream political decisions fundamentally change? *Very, very unlikely.*
Look, you want your opinions to have merit, then carry the burden. That’s what the rest of us have to do. Moreover, it’s very, very likely that much of the code could be modularized for reuse and that the tweaking can be done systematically in a subset of modules.
What you’re describing is akin to an actual scientist puttering around in a lab and then telling the world they have found the solution while at the same telling the world it’s too complicated to explain or document along the way, so just trust the results. Just another reason why this process fails the basic principles of the scientific method.
Well, current agile development practices do this continuous “problem discovery” all the time, but with sustained code quality at every commit (or at the very least at every pull request).
Well you clearly are fresh off the boat. Academic source code is uniformly shit. It is very rarely provided, and never “peer reviewed”. Peer review isn’t paid; it’s an extra “voluntary activity” done in one’s free time. You seriously think scientists have so much money that they’ll spend weeks peer reviewing each other’s 15K-line files looking for bugs?
That’s why the open source approach is valuable.
It is simply a) wrong and b) stupid to pretend that every call to an RNG is an instance of a statistically independent and uncorrelated variable.
It is wrong because it is untrue, and it is stupid because it makes it a nightmare to maintain reproducibility of results in an evolving project.
If you want to see a serious engineering treatment of RNGs and noise in integration problems, look to computer graphics, where the difference between white and blue noise is crucial for instance, and the difference between theory and practice can be huge due to quantization and sampling effects.
I was programming point of sale and some financial software about 40 years ago so I agree with your point that it was very different – a few K of RAM and a few years later a massive 10 megabyte hard drive!
However Stochastic still equals random and we can’t do what we’ve done on random information.
Good luck with hiding from a coronavirus! It was right across the UK weeks before lockdown and will, in my view, be asymptomatic in between 30 and 60% of the population. My guess is as good as any guesswork produced by predictive, stochastic models!
“I made my own calculation at the end of January, estimating the likely mortality rate of this virus. I’m not going to tell you the number”.
So, in other words, you are just like Ferguson: You made a prediction, which might have been reasonable at the time, but you won’t show your workings (and you won’t even tell us the prediction) but now you’re going to stick with it no matter what. That’s terrible science.
The latest meta analysis of Sero studies:
https://docs.google.com/spreadsheets/d/1zC3kW1sMu0sjnT_vP1sh4zL0tF6fIHbA6fcG5RQdqSc/
shows an overall IFR in the region of 0.2%, higher in major population centres. For people under 65 with no underlying health conditions it’s more like 0.02%. Research from the well-respected Drosten in Germany suggests perhaps 1/3 of people have natural immunity anyway:
https://www.medrxiv.org/content/10.1101/2020.04.17.20061440v1
Did you factor this in?
If your estimate is different to this, it’s looking increasingly likely that your estimate was wrong. Have you back-casted your estimate, perhaps using Sweden or Belarus as references?
Well said, Dr_t!!! Exactly my sentiments – from someone who started FORTRAN modelling 50 years ago and has continued through today.
I would describe this as a simplistic and superficial critique – not really adding anything material to the discussion.
For those who don’t agree with a stochastic modelling approach, tell me from where you have “typical lock down behavioural patterns” for a truly probabilistic model. Nonsense!!!
Go back to the drawing board and come up with some useful and materially significant comments.
30 years ago I was developing the Mach operating system (the thing that runs Apple computers today). Written in ‘C’, I can assure you that it was multi-threaded, modularized, structured and documented. Multi-CPU computers were already commonplace if not on the desktop. Dining philosophers dates from 1965, and every computer scientist should have come across it at university for the last 50 years. Multithreading has been available to coders since at least the days of Java (1995) if not before (it doesn’t require a CPU with more than 1 core, just language and/or OS support).
I went to university in 1988, and one of the 1st year modules was concurrent programming. We used a language called Concurrent Euclid (a Pascal clone with threading), possibly because threads weren’t well supported or were awkward to use and understand in other languages. Multi-threaded programming in mainstream systems has been around for a long while.
Indeed and I remember Modula 2, another Pascal derivative, supported threads. Concurrent programming is pretty old hat really.
Yes, and I too wrote multi-threaded software in the 1980s, including a thread scheduler I wrote in 8086 assembler for the IBM PC, and used Mach on my NeXT and Logitech Modula-2 on my 80286 PC clone (though I’m pretty sure that version only implemented co-routines, not real concurrency).
But I think you may be missing the wood for the trees.
Firstly, who in 1990 had a CPU capable of executing multiple threads in hardware simultaneously? Not on PCs. Not even on workstations like NeXT, Sun, Apollo, etc. I lost track of what the minis could do by that stage, but hardly anyone was still using them. More likely, you literally had to have access to a mainframe – an even smaller set of users.
Outside computer science nerds and academia, multi-threaded programming was not in general use.
There is no benefit to an end user like an epidemiological modeler using commodity hardware in using multi-threaded code if it’s going to run on hardware capable of only executing a single thread at one time. His objective is not to show off his computer science knowledge and skills but to get the results of his simulations. Therefore, it makes absolute sense to write software which works in a single-threaded environment only. I’d even go further and say that someone writing multi-threaded software that’s less efficient than the corresponding single-threaded software, just to show they can, is demonstrating a lack of ability to put his software engineering skills to proper use in practice.
At any rate, it seems that the software is 13, or perhaps 15, years old, not 30, and by 2005-2007, multi-core CPUs were just starting to become available on commodity hardware. But they were not universal and the number of cores was still relatively low, and multi-threaded programming would still not have been in common use among end users like epidemiological modelers, and was indeed far from being in universal use even among computer science nerds.
Given the complexities involved in writing correct multi-threaded code in a shared memory model, the limited benefit this would yield, and probably the amount and nature of the legacy code he had to work with, it doesn’t surprise me that Ferguson’s code remained written for a single threaded execution environment only.
And finally, let me say this again, Ferguson is not a software engineer. He is an academic, probably with very limited, if any, coding support, and his expertise lies somewhere else entirely. If someone attacked you for your lack of expertise in the field of epidemiology, I bet you wouldn’t think much of that.
I’m still waiting for a substantive analysis of his modelling work.
As for his estimates, to quote the Daily Mail (https://www.dailymail.co.uk/news/article-8294439/ROSS-CLARK-Neil-Fergusons-lockdown-predictions-dodgy.html): “Other researchers made the same prediction.”
He was not alone in forecasting 250,000-500,000 deaths if we did not take any steps to mitigate the pandemic. My own estimates were, and still are, higher (no, I’m not going to share them with you, they are for my private use only and I’m not on any government payroll, so there). You can debate whether the measures the government adopted were the correct ones (I certainly think they were not and that they have been incompetent) but I think it’s pretty clear that if they had continued to follow the herd cull policy, or if they adopt it again, the results would be disastrous.
I think some peripheral and superficial criticism of Ferguson’s programming ability is being (mis)used to try to change government policy.
Ferguson belongs to the political left. I do not. He also broke the lockdown rules, as a result of which he had to resign. He is gone.
This, and his possible lack of programming skills, does not per se mean he is a bad epidemiological modeller.
It also does not mean his predictions are wrong.
It certainly does not mean we should adopt the herd ‘immunity’ policy (I strongly prefer ‘herd cull’, because I believe it is more accurate). We should not.
Your stupid “””model””” clearly failed to take into account asymptomatic cases (between 60 and 80%). Maybe you ought to look at Iceland since they’ve done testing on 100% of their population, albeit still using low-specificity tests. Say, how come during the same time period in the US, 10% of the population contracted influenza but only 0.3% contracted COVID-19? I thought COVID-19’s R0 was many times higher than the influenza viruses..? Pro tip: infections are WIDELY underestimated, meaning CFR is widely overestimated.
Iceland. Where their total population is about the size of Oakland CA. USA. A country isolated most of the time and more so in the winter. A country that is basically 1 racial group. A country without a thriving economy of world travel, and imports and exports on a grand scale.
Sure, why not. If you are going to slice and dice based on massive disparities in population and area, I’ll use US states Oregon and Arkansas. Same # of deaths per million as Iceland. I would imagine both those states get more economic “Action” than Iceland.
Completely agree.
It should also be noted that this ‘bug’ has been fixed – https://github.com/mrc-ide/covid-sim/pull/121
The very fact that the model code was USED now correctly lays it open to review and criticism, the same as if it were written yesterday, particularly as it has a direct effect on the wellbeing of millions NOW. If it’s not fit for purpose, it doesn’t matter how old or new it is.
Ferguson wrote this on his Twitter account a few months back: “I wrote the code (thousands of lines of undocumented C) 13+ years ago to model flu pandemics.”
So it’s more like 13 years old – not 30 years old.
“30 years ago, there was no multi-threading, so it was reasonable to write programs on the assumption that they were going to run on a single-threaded CPU. ”
Well yes. I am involved in a big upgrade to academic software to multithreading for the same reason. But we are extensively testing and validating this before even considering using it. Sounds like Ferguson’s group did this, found differences that indicated the single-threaded code had wrong behaviour, and then ignored it. So the problem is not lack of multi-threading, it’s lack of good testing and responsible behaviour (not using code you know is dangerously wrong)?
Very interesting. I know nothing about the coding aspects, but have long harboured suspicions about Professor Ferguson and his work. The discrepancies between his projections and what is actually observed (and he has modelled many epidemics) are beyond surreal! He was the shadowy figure, incidentally, advising the Govt. on foot and mouth in 2001, research which was described as ‘seriously flawed’, and which decimated the farming industry, via a quite disproportionate and unnecessary cull of animals.
I agree with the author that theoretical biologists should not be giving advice to the Govt. on these incredibly important issues at all! Let alone treated as ‘experts’ whose advice must be followed unquestioningly. I don’t know what the Govt. was thinking of. All this needs to come out in a review later, and, in my view, Ferguson needs to shoulder a large part of the blame if his advice is found to have done criminal damage to our country and our economy. This whole business has been handled very badly, not just by the UK but everyone, with the honourable exception of Sweden.
Thanks for your words of wisdom (I truly think they are). Nevertheless, for me (if true) the main point of the critique is: same input -> different output, under ceteris paribus conditions. Best regards and luck in your lockdown.
None of what you say excuses the use to which this farrago of nonsense has been put.
I’m not sure that the code we can see deserves much detailed analysis, since it is NOT what Ferguson ran. It has been munged by theoretically expert programmers and yet it STILL has horrific problems.
I don’t know how you code, but I’ll stand by my software from 40 years ago, because I’m not an idiot and never was. Now … where did I put that Tektronix 4014 tape?
In my field, economics, 61-year-olds like me face the problem that the tools are different from what they were 30 years ago, but we old guys can’t use that as an excuse. To get published, you have to use up-to-date statistical techniques. It’s hard to teach an old dog new tricks, so most of us stop publishing.
Your point that 30 years ago, programs didn’t have to cope with multiple cores sounds legit— but the post above seems to be saying that’s not the main problem, and it wouldn’t even work if run slowly on one core.
The biggest problem, though, is not making the code public. I’m amazed at how in so many fields it’s considered okay to keep your data and code secret. That’s totally unscholarly, and makes the results uncheckable.
I think their code is available and what third parties such as the University of Edinburgh and Sue Denim (!) have found when scrutinizing it is that it’s pretty poor. Following the science sounds sensible but when they are employing such poor models it’s not sound after all.
“I’ve been writing software for 41 years.”
” I have written millions of lines of code in dozens of different programming languages. ”
Pffft! 41 years is 21.5 million minutes. There is no way you have written that much code, much less in dozens of languages.
You may have some valid points but I’m not going to take the chance.
First of all, you are calling me a liar which I do not appreciate.
Secondly, your analysis is nonsense and any competent software developer will know that.
I’ve never counted how many lines of source code I can write per minute, as it is a pointless metric, though it will be quite a few. But I do know that writing 1000 lines of (tested and debugged) code in a day is a slow day with plenty of time to spare for other things.
In order not to speak purely in the abstract, there was an event (a coding contest) which took place in the 1980s where I was given a problem specification and had 24 hours (during which time I also had to do things like eat and sleep) to design a programming language and implement an interpreter for it. I still have the source code and it is 2547 lines long. Yes, it is spread over 18 modular source files, the longest two of which contain 451 lines (the parser) and 335 lines (the lexer), respectively. No, it is not multi-threaded. It is probably all re-entrant and so thread-safe.
There are 365 days in a year. Programming is not what I now do, but I do still write at least some software every day. Writing 365,000 lines of source code per year is therefore not a lot. I have never counted the total number of lines of source code I have written, as I don’t see the point, but I do know it is well into the millions of lines – so comfortably so that I don’t need to count them, and it would be very hard to do so anyway.
A few dozen programming languages is not a lot. Anyone who has to use computers regularly and professionally over an extended period of time will pick up that many and more as a matter of necessity.
I also don’t see why this matters to you so much. I merely point this out to set the context. Again you are focusing on the superficial and ignoring the substance, which are my arguments, which you obviously haven’t bothered to read. I do realize that requires an amount of mental ability, which not everyone possesses.
1000 LOC per day ? Tested and debugged ? Blimey !!
I have a lot of experience with simulations and stochastic models. But I’m an engineer, not an academic.
In the field, if you cannot explain every bit of randomness in your model, you do not understand it. This has nothing to do with “modern” code or not, because 30 years ago, the requirements of responsible engineering were exactly the same as they are today. If a company builds 20 bridges, and 1 of them falls down, we don’t call that a 95% success rate, we call that irresponsible and unacceptable failure.
The multi-threading is particularly important, because of the random seed. It doesn’t matter if you generate the same sequence of random numbers every time, if the order they are used in is non-deterministic. You’re effectively randomly swapping certain pairs of numbers in your sequence, every run.
This also makes it harder to improve and refactor the code. If you have only a single random generator, then each call to random() depends on all of the ones that came before. If you instead use an independent generator for each unique aspect of the model, now it doesn’t matter which order you process them in; the randomization does not cross component boundaries. This can even be applied at the individual decision level using a carefully constructed tree of seed derivations, which maps to the actual dependencies of the model. This requires a holistic understanding of your code, and skill in architecting data flows. That’s not going to happen in a 15k-line pile of academic hell-code.
A good defence of Neil Ferguson being ‘shy’ of releasing 30yr old coding but with all due respect(I mean that) he should have indicated as much, now he should release the original code with the disclaimer that it was written as you describe above.
“Are you saying you have a model which can predict, with certainty, how many dead people there will be 2 weeks from now?”
No one is criticizing it on that basis. The issue is this: generalize the program as a function f(x), where x is some random seed value; every other random value in the program should be a function of x, so that f(x) = y on every run. There’s no reason it shouldn’t run this way. If the author’s contention is that, calling each run f_t(x), we get f_0(x) = a, f_1(x) = b, f_2(x) = c, but that a, b and c converge on some number, that may well be the case, but there’s a very big difficulty in determining what it is they converge upon.
Now, if you were writing something to simulate an empirical situation, where you were able to check your algorithm against the real world, and you found that the average of a,b,c did in fact converge on the real world observations, sure, that’s a sort of validity, but it’s still needlessly bad form, tho, of course, I am well aware that the best program is one that solves the problem and costs as little as possible.
If we apply this test to Ferguson’s model, we find that his program does not, in fact, in any way approximate the data we’re getting from the real world, so it’s not like the average runs are converging on what we’re seeing on the ground—it’s as though this is some sort of financial analysis program for picking stops, and it’s saying stock X is going to go up to Y, when it really only hits 1/10th Y—I wouldn’t use that program to pick stops, would you?
So it is not merely a criticism of the form of the program, nor is it that you cannot (in an absolute sense) have a useful program where f(x) =y is not true for every run of f, but the proof of the pudding is in the eating, and what we’re being fed by Ferguson and these academic modelers (makes one wonder what “Climate Change” code looks like!) is not top-shelf, not world-class, it’s the sort of program you’d get where there is vendor lock-in to the point of tenure. I mean, can you imagine the quality of software you’d get if a programmer had _tenure_?
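The f(x) = y requirement in the comment above is just seed-determinism, which is cheap to obtain and cheap to check: two runs with the same seed must agree exactly (a toy sketch; f here is a stand-in for the whole program, not Ferguson’s code):

```python
import random

def f(x, n=5):
    # Stand-in for the whole program: every random value derives from
    # the single seed x, so the output is a pure function of x.
    rng = random.Random(x)
    return [rng.gauss(0, 1) for _ in range(n)]

print(f(123) == f(123))  # True: same seed, identical output, every run
print(f(123) == f(456))  # False: a different seed is a different run
```

Averaging f over many seeds is then a deliberate, reproducible experiment rather than an accident of scheduling.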
I am a machine learning engineer (a PhD) and I confess I haven’t read the paper at all, and I am not interested in doing so either (and I don’t have the biology or virus knowledge at all). As much as anyone else, I would love the restrictions to be eased or lifted, but I try to be unbiased in giving comments on these. Since I haven’t read the paper, I would take my own words with a grain of salt!
When it comes to modelling, it is normal to use stochastic models (which is a very good idea to do so for such a case in my mind, in comparison to a deterministic model), and getting different results is probably because they have forgotten to pass the seed parameter to one of the random functions, not an actual bug (an educated guess).
Again, when it comes to modelling and when you are working on data, you can spend your whole life writing unit tests: the effort grows so quickly (exponentially, even) that it is not possible to cover all the different cases, and it is not worth the time to write the tests. Google has given up writing tests on machine learning models (not sure if this model can be put under that umbrella term, though).
From a normal developer’s perspective, code without tests that might give different results is not trustworthy code (as someone working at a consultancy firm, I deal with normal developers every day, so trust me on this), but through an ML expert’s lens, this is pretty normal. To be fair, I just don’t think it is fair to attack a modelling work only by the metrics described here.
I am not suggesting the model should be adopted without further investigation (and who am I to judge with such limited knowledge), but I also think it is unfair to dismiss the model as untrustworthy by the criteria explained here either.
These are just my thoughts (and personal).
This is an utterly bizarre take.
The REASON that coding standards have changed is precisely because of the problems that are inherent in monoliths. You don’t get to say, “we shouldn’t hold his outdated code up to modern standards, it’s not fair to criticise it as if it was written today”. You instead have to say, “this moron is using a 30-year-old code base and totally outdated, obsolete, and rightfully abandoned coding practices.”
“On a personal level I’d actually go further and suggest that all academic epidemiology be defunded. This sort of work is best done by the insurance sector. Insurers employ modellers and data scientists, but also employ managers whose job is to decide whether a model is accurate enough for real world usage and professional software engineers to ensure model software is properly tested, understandable and so on. Academic efforts don’t have these people and the results speak for themselves.”
Perhaps even more significantly, they pay a price when they get it wrong, a check on overreaching idiocy that appears completely lacking in these “advisory” academic roles in government.
See also https://www.youtube.com/watch?v=Dn_XEDPIeU8&t=593s Nassim Nicholas Taleb on having Skin In the Game.
On Monday I got so angry that I created a change.org petition on this very subject.
https://www.change.org/p/never-again-the-uk-s-response-to-covid-19
It sounds like something an undergrad would knock together, but this team is supposed to be the cream of their profession.
If this is the best the best can do then to ‘suggest that all academic epidemiology be defunded’ sounds like a good plan to me. But, sadly, this is shutting the stable door after the horse has bolted.
and exceedingly well funded (by Gates and others). No excuses at all for old or poor code.
None whatsoever.
“Gates”, “poor code”! Now where have I seen that before?
We should not assume that the cream of the academic crop knows how to develop industrial strength software, or at least well-written code. Proper software development techniques are usually NOT taught in academia.
Thank you. Are the mainstream media capable of covering this? That is what frightens me.
Who is going to be the first to point out that the reason sick people weren’t getting hospital beds is because the models were telling us to expect thousands more sick people than there were? How many people died because of this?
And what about all this new normal talk? All these assumptions life will change for ever built on fantastic predictions which are being falsified by Swedish and Dutch data?
This diktat that we can’t set free young people who are not threatened by the virus because the model says hundreds of thousands would die? All nonsense.
This is the greatest academic scandal in our history.
‘Are the mainstream media capable of covering this?’ Let me think………. ‘No’.
They are certainly capable, but is it in their interests to? Not until the wave turns and is racing back towards them to swamp their current rhetoric. Then they’ll go into self-preservation mode, and make you believe they were asking this all along.
Slightly off topic but I would suggest that some of the climate science work suffers from similar problems and at a comparable scale. Dr Mann’s flawed hockey stick comes to mind; my understanding is that the analysis code has never been released.
I am science trained but a HW guy, not SW. I place most of my trust in measurements, especially ones that can be reproduced by others.
“I would suggest that some of the climate science work suffers from similar problems”
The infamous “Harry_Read_Me” file contained in the original Climate Gate release springs to mind. As I recall, it was a similar tale of a technician desperately trying to make sense of terrible software & coding being used by the “Climate Scientists” – one of whom had to ask for help using Excel…
He is currently in court alleging defamation, but he needs to provide disclosure (his ‘code’) and seems to be getting cold feet – so proceedings drag on.
MUCH more politics in Climate Change! You are simply not allowed to question the basic assumptions…
Er… “much more politics” than the model that has been used to shut down most of the world?
…the assumptions – built into MODELS!!
What, that CO2 absorbs infrared ?
Any virus has an inherent R0 for a constant set of conditions (input R0). It also has an effective average R0 in the population for the given social conditions (output R0). Hence this explains why R0 appears as input and output.
I would call it
R0 inherent (input)
R0 effective (output)
How do you arrive at the R0 that you feed in?
R0 is a number that is calculated from other model parameters (contact rates, migration rates, recovery rates, death rates, etc.). It has no value to be fed in. The model parameters are fed in, and R0 is some function of these parameter values. Also, too much emphasis is placed on this mystical “R0”, as if an entire epidemic is controlled by one number. This is plainly ridiculous.
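For readers unfamiliar with how this works in practice, the textbook SIR model makes the point directly: R0 is the transmission rate divided by the recovery rate, a quantity derived from the parameters rather than an input. A minimal sketch (the parameter values below are illustrative, not taken from the Imperial model):

```python
# Textbook SIR relation: R0 = beta / gamma, where beta is the transmission
# rate (contact rate x infection probability) and gamma is the recovery
# rate (the reciprocal of the infectious period).

def r0_sir(beta: float, gamma: float) -> float:
    """R0 derived from model parameters – not itself an input."""
    return beta / gamma

# Illustrative values only: 0.5 effective contacts/day, 5-day infectious period.
print(r0_sir(0.5, 1.0 / 5.0))
```

Increase the contact rate or lengthen the infectious period and R0 rises; it summarizes the other parameters, it does not control them.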
Mathematical models of epidemics are just simplified representations of reality. One can fit them to data, once one has data, and the fit may be impressive. But as a predictive tool, in the absence of much data, they may be useless. Having experience of mathematical modelling of epidemics, and knowing their limitations, it is bewildering how countries around the world have imposed all these silly lockdown measures, seemingly because of one computer program by someone who isn’t even a mathematician or programmer.
It seems to me that politicians, afraid of appearing ignorant when their academic “professor” buddy told them everyone was gonna die, and being pressured by an increasingly hysterical media reporting every individual case of coronavirus, decided lockdown was a good idea. Of course, as time will tell, it was never a good idea. And it won’t be in future either, if another new disease comes along.
Perhaps I should re-phrase it as “How do you arrive at the parameters you feed in?”. Ferguson speaks of trying different R0 values over a specified range, so presumably his model does have some notion of R0 as an input. However, it could be that he simply sweeps the model with different transmission parameters and observes which one effectively produces the R0 that he wants.
Either way, he is starting with a range of values of R0 that has been obtained from somewhere. Maybe it’s just that “everyone knows that SARS-Cov-2’s R0 is about 2.5”. But where did that come from?
I think it comes from fitting a ‘model’ (maybe just an exponential formula) to real data (typically at the start of the epidemic – although that’s an assumption in itself) and adjusting its R0 for best fit. As others have observed, if the early data all comes from hospitals, and is affected by arbitrary factors like availability of tests, choice of subjects etc., then that R0 is already very ‘wrong’.
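That fitting procedure can be sketched in a few lines: regress log(cases) against time to get the early growth rate r, then convert it to an R0 estimate using a textbook relation such as R0 ≈ 1 + r/γ. The case counts and the 5-day infectious period below are made up for illustration:

```python
import math

# Hypothetical early daily case counts (made up, roughly exponential).
cases = [10, 13, 17, 22, 29, 38, 50]
days = list(range(len(cases)))
logs = [math.log(c) for c in cases]

# Least-squares slope of log(cases) vs. day gives the daily growth rate r.
n = len(days)
mean_d = sum(days) / n
mean_l = sum(logs) / n
r = sum((d - mean_d) * (l - mean_l) for d, l in zip(days, logs)) \
    / sum((d - mean_d) ** 2 for d in days)

# Convert growth rate to R0 with the textbook relation R0 = 1 + r/gamma
# (assumes clean exponential early growth and a 5-day infectious period).
gamma = 1.0 / 5.0
r0_estimate = 1.0 + r / gamma
print(round(r, 3), round(r0_estimate, 2))  # roughly 0.27 and 2.3
```

Note how sensitive the estimate is to the inputs: if the early counts are distorted by testing availability or hospital-only sampling, r is wrong and so is the R0 built on it.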
Your last paragraph, Mr Cabbage, is spot on. We seem to have employed and accepted flawed evidence, which inevitably leads to the wrong conclusion. A scenario such as this is too important for us to indulge in such activities.
That’s not a conclusion, that’s a recommendation.
Agree that a proper epidemiology model that is robust and peer reviewed is required and should be a good outcome from this pandemic.
As someone who has worked in the areas of Software Maintenance, Legacy Systems, and Software Testing, and has taught Computer Science to MSc level, I have to say I am appalled. A Computer Science student could do much better than this. Why is Prof Ferguson still being employed by the once prestigious Imperial College?
“Why is Prof Ferguson still being employed by the once prestigious Imperial College?”
About that….
https://www.bbc.com/news/amp/uk-politics-52553229
They could write better code, yeah. But they wouldn’t understand the epidemiology bit. Brogrammers…
But he does. It’s called working in a team.
Interesting – I downloaded what purported to be the Imperial Model software from GitHub and it was in Python, full of hard-coded numbers seemingly pulled from the ether. I didn’t see any C++ in there.
I think the original code was written in C++ (Ferguson said C) but I read somewhere that it was ported to R and Python recently – presumably that was the work done by Microsoft.
Although there is R in this, that’s for analysis and display. The one being discussed here is in C++. There is *another* model from Imperial College (the one for “Report 13”) that’s essentially the implementation of an analytical model, and that uses Stan, Python (to set it up) and R (for analysis and display). That’s not the one described here, which is elsewhere on github.
I found that version too. Similar coding style but I could not get it to work – lots of NULL pointer accesses!
Oh good, it will be fine if Microsoft have done it.
Yes, my reaction precisely. Thank goodness no Microsoft products have ever had any bugs.
MS have done some good work in software quality improvement.
The .cpp files are in the src directory: https://github.com/mrc-ide/covid-sim/tree/master/src
It is stunning how awful this all is. The word criminal comes to mind. Thank you so much for this assessment.
No do the same with climate change models
“Clearly the documentation wants us to think that given a starting seed, the model will always produce the same results.”
No! That’s not how stochastic simulations work! Or indeed the real world! In biological systems we literally *expect* a range of outcomes given the same input. You run the model repeatedly, and then report an average and the 95th percentiles of the results.
You absolutely, 100%, Do NOT want a model that gives the same results given the same inputs.
You might be software engineer, but you’re no biologist.
Explain more please: is this entire critique invalid?
Try reading more carefully. She said, given the same starting seed, you get the same results. That’s exactly how a stochastic simulation is supposed to work. If you don’t, you’ve introduced a bug – I mean, ‘non-determinism’.
That’s not what the review says. There was a brief period during which there was a bug, but outside that period – with both the original code and the code once the bug was corrected – it did not exhibit such behaviour in the execution environment the program was written for: a single-threaded, single-CPU computer.
Name one piece of software under active development with a regular release schedule where every intermediate release is bug-free.
Trying to use monolithic, 15,000-line, (apparently) 30-year-old C code that is the result of auto-translation from Fortran on a multi-threaded, multi-core computer – or trying to get it to work in such a setup – is a fool’s errand. Use the code on a single-threaded CPU, for which it was designed, and judge it on its performance in such an environment. To be used in a multi-threaded setup, the software would have to be rewritten from scratch using the proper tools for parallel computing (hint: they aren’t tools using shared memory and mutexes).
I’m sorry, but I really am not persuaded by these criticisms which appear to be very superficial. What is the substance of the model? Is it wrong? Why is it wrong? Is it innovative and clever? Why is it innovative and clever?
This is what the reviewer should be asking, and is what would be interesting to know.
Unfortunately, I doubt the reviewer has the necessary skill set to evaluate the model from this – the proper – perspective.
I really don’t care about Ferguson’s coding style. He is not a professional programmer, he is an epidemiological mathematical modeler. How well he does this job is the relevant question, not his coding skills. Especially not his coding skills 30 years ago judged from a modern day perspective.
As an academic, I doubt he has an army of professional programmers working for him.
I’ve seen plenty of brilliant engineers who write programs to help them with their designs. Needless to say, their code is invariably a rat’s nest, but it allows them to do their job well. Is it appropriate for professional programmers to judge those whose job is something else, who use programming only as a tool, and to show off their superior coding ability? I’d like to see such programmers do the engineer’s or the modeler’s job and see how they fare. Then we can all stand there and mock their efforts.
Absolutely spot on dr_t. You clearly know your business. I’m a bit shocked by the critique as well. As a stochastic modelling expert who has written many a ‘rat’s nest’, it is obvious to me that the seed bug she makes a meal of is not an issue at all for this particular code, as it depends on an ensemble of results. Of course, it’s nice to fix it to have reproducibility of individual runs, as the bug may confuse novice users, but from the perspective of the end result, it changes nothing.
By the way, my rat’s nest doesn’t stay that way. I work with a team of great developers who are pretty mediocre modellers. We work together to produce something that can be consumed by a fairly large body of non-expert users, but if its usage stayed with a handful of experts, we could save the expense of the refactoring and the glamorous user interface.
Absolutely agree with dr_t and earthflattener! You are both spot on with your criticism of the author of this misleading and erroneous report. There seem to be two camps forming here – the “IT geeks” focussed on the purity of code and the true “modellers” who are interested in concepts and theories, which is why they become modellers in the first place.
It’s like when learning that a water molecule is H2O – one oxygen with two hydrogens around it – a truly simple “model” of a water molecule. Is this technically correct? Of course, NOT! Is it adequate to explain what’s happening without too many – significant – detrimental effects? Of course, YES! There’s no mention of neutrinos or other particles, but it doesn’t invalidate the basic model – H2O – and how it’s applied.
Get a grip, all you geeks! You just cannot see the wood for the trees – the author of the critical analysis should get some experience in “modelling” before writing critical commentaries. So here’s an exercise for all IT geeks. Do your personal budget for the next 3 years – forecast your income and expenditure. Let’s see if you can figure out your travel and holiday plans. That’s a very simple “financial model”. Do you use Excel or the “back of an envelope”? Does it matter which? NOT AT ALL!
What matters are the ASSUMPTIONS – which have no bearing on whether you use Excel or the envelope.
So easy to see through those who think “code” versus others who think “concepts”! The more I read the author’s article, the more manifest it becomes that she can only think “code” and has no clue about complex scientific, behavioural or economic concepts.
Sad!!!
Isn’t the model generated by code, and if the code is wrong, isn’t the model wrong?
First, it’s a strawman argument to suggest that professional software developers expect some kind of code purity. Second, when you refer to professionals as “IT geeks”, you are attempting to undermine their professional credibility without addressing the merits of their concerns. It’s just banal rhetoric.
Expecting well organized and documented code is not an expectation of purity. It’s a best practice so that when bugs are discovered, they will be far easier to track down when the code is orderly and the programmer’s intentions are documented. Every professional “IT geek” understands this.
Look, we all write proof of concept routines when we’re experimenting with different ideas. Novice programmers tend to get so wrapped up with their project that they don’t take the time to rewrite their doodling into something more orderly and reusable. Experienced programmers learn from those mistakes.
Lastly, in case you haven’t noticed, the world runs on quality software. We literally trust our lives to it whenever we fly on a plane, for example. I can’t say the same for much of what the geeks in academia generate.
The point is, you want some way to know whether the code has bugs – whether the model is doing what it was written to do. If it’s poorly documented, untested, and doesn’t reproduce itself (to some level of consistency), you can’t really tell.
That’s why this is a problem. Most well written systems using stochastics use pseudorandom numbers, that look random, but are fixed based on the ‘seed’ to the random number generator. With the same seed, they give the same results.
This didn’t, which is a sign that something broken is going on. With C++ that can be a lot of things. One of the most obvious is using an uninitialized variable. E.g. you are summing numbers, but you forget to set it to zero at the beginning. Often it will be zero, but sometimes it won’t be. This introduces a bug, and non-determinism, and means your results generally can’t be trusted.
There are actually a lot of good static analysis tools for C++ — I’d love to see them applied to this code base.
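The expectation described above is easy to demonstrate: with a pseudorandom generator, the same seed must replay the same sequence, so two same-seed runs that disagree signal a bug such as the uninitialized accumulator just mentioned. A sketch in Python rather than the model’s C++ (the function is hypothetical, not from covid-sim):

```python
import random

def simulate(seed: int, steps: int = 1000) -> float:
    """A toy stochastic computation: deterministic given its seed."""
    rng = random.Random(seed)   # private generator, no hidden global state
    total = 0.0                 # explicitly initialized accumulator
    for _ in range(steps):
        total += rng.random()
    return total

# Same seed, same result: the reproducibility a stochastic model should have.
assert simulate(42) == simulate(42)
# Different seeds legitimately differ: that is the controlled randomness.
assert simulate(42) != simulate(43)
```

If the first assertion ever fails, the randomness is no longer under the seed’s control – exactly the symptom the review describes.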
the only concept that really matters is results.
Have Dr Ferguson’s results been valid in the past?
Were his predictions for BSE deaths, avian flu deaths and swine flu deaths using this modelling software borne out as roughly correct when looking back at the actuality?
The answer is that Mystic Meg would have done a better job than his software.
I have some tea leaves that did a better job. Should I put them forward for a knighthood?
“How well he does this job is the relevant question, not his coding skills. Especially not his coding skills 30 years ago judged from a modern day perspective.”
Respectfully, if one’s brilliant mathematical modelling skills are encoded in ways that undermine the ability to produce coherent, consistent and applicable results based on that model’s logic & assumptions, then of what use is that brilliance? A crap translation of the Iliad or Shakespeare destroys its power… that’s what crap software instantiations do to ‘complex’ mathematical models… I would also add the obvious: if the Imperial model had proved even remotely accurate/consistent, it would not be undergoing this level of (literal) disbelief and scrutiny… I’m sorry, but both your criticism and critique are unfair & inaccurate.
Here’s another analogy: someone can come up with a brilliant model that would generate highly accurate results – but if he keeps making sign errors in his algebra, the results will be garbage. In fact, if he just goes over his algebra once, he probably *will* make sign errors (I always do), but experienced modellers know they have to check their work. Indeed, many an amazing result turns out to be an algebra error on second look and disappears.
Another person who doesn’t read carefully enough. I was responding to the person who claimed, incorrectly, that this is not how stochastic simulation works.
Furthermore, he doesn’t need an army of professional programmers working for him, but he does need people with professional programming skills who can adhere to standard practices. This guy’s model has been the motivation for shutting down the entire UK economy.
Who cares what you’ve seen engineers hack together? The complaint is not about aesthetics, it’s about correctness, reproducibility, transparency. If your model stays in your research group then who cares? But if it’s used for something this important you don’t get to say ‘I’d like to see a programmer do MY job’.
You cannot separate the model in this case from the implementation of it. Besides all that, you’re still wrong. If you read the issue tracker, they thought the issue would be resolved by running it on a single CPU, but it wasn’t. That’s why the reviewer pointed out that they don’t understand how their code is behaving.
If you are running under Windows, then I question whether ANYONE understands what their code is doing in detail. Even with “Hello World!”…
But your “Hello World” program still runs the same way every time you run it. If you change your program and it stops working then you know the last change broke it – it isn’t a problem with Windows.
In the case of this model, given the same set of inputs and the same random number seed, it should output the same results every time the program is run. But the Ferguson model generates different results, indicating that the internals of the model are broken. They should be fixing this problem before using the model to get results.
In addition, they should have used data from an epidemic from the past as a reference to test the logic of the model. If the inputs are close to real values, the output should be comparable with a known epidemic case and this would verify the logic.
The next step is to try to model the current epidemic.
Just creating a buggy model and then feeding in guesses will only give stupid answers as we have seen.
dr_t,
I am a programmer, mathematician, and mathematical biologist. Regardless of whether Neil Ferguson’s program bears any relation to reality or not, I just want you to know that your tone in your messages is very rude and is totally unacceptable. Your arrogance completely undermines your credibility. Put your face mask on and stop talking.
I am ever so sorry that I have caused you offence. I very much hope you have not melted as a result.
The proof is in the reality pudding as they say.
Here are the results of Professor Ferguson’s previous modelling efforts.
Bird Flu “200m globally” – Actual 282
Swine flu “65,000 UK” – Actual 457
Mad Cow “50-50,000 UK” – Actual 177
You do protest a bit much.
They actually say, the proof of the pudding is in the eating ….
“I really don’t care about Ferguson’s coding style. He is not a professional programmer, he is an epidemiological mathematical modeler. How well he does this job is the relevant question, not his coding skills. ”
This is a strange defence. If the code doesn’t implement the model correctly, then his coding skills – more specifically, his software engineering skills – are highly relevant. If the code hasn’t been validated through rigorous testing and contains bugs, then it’s worse than having no model at all.
I’m pretty sure his software engineering skills are non-existent, because he’s not a software engineer. I do understand that’s a difficult concept to grasp.
His approach to programming is the reason why the concept of Software Engineering was introduced.
There is no excuse for untidy, unstructured code that is difficult for others (and, after the passage of time, yourself) to follow.
Just describe what it is you are going do, in a logical sequence of steps, and then code accordingly.
There, Software Engineering in a nutshell.
Just because spaghetti code exists doesn’t mean it’s the norm in a professional development environment. The bottom line is that if the recommendations from a computer program are going to be used to make decisions that significantly affect the daily lives of millions of people, the friggen program absolutely needs to be as solid as possible, which includes frequent code review, proper documentation, and in-depth testing. Then, it needs to be shared for peer review.
“in a professional development environment”. And I am sure when the epidemic struck, he had 6 months to get his team of 100 professional software engineers to refactor that code he had lying in his drawer so it was brought to modern day standards, write automated test rigs, run regression tests and submit the code for certification before using it in anger. And while his minions did all this work, he sat at home in his pyjamas, blogging and sniping at people at the coal face trying to deal with the outbreak.
@dr_t How silly. He and his team should have been doing that all along. That’s the responsible process. The irresponsible, and apparently expected, process is to take the spaghetti approach and rely on the rest of the academic ‘experts’ to defend them if questions arose. Fail.
Precisely, lying in his desk drawer for years. No attempt to bring it up to the standards of today (or 13/30 years ago) it would appear. Just hobby coding, then?
Documentation, Automated Test Rigs, Regression Tests, Certified Code during those years would have been icing on the cake.
I disagree. A stochastic model is simply a deterministic model that has inputs that are generated randomly.
For example, let’s say I run a stochastic simulation of a random walk. I build a deterministic model that says if my input number is less than 0.5 then take a step left, else take a step right. I then use a random number generator that gives a me a number between 0 and 1. If I generate 5 numbers and they are all less than 0.5, then the person should have taken 5 steps left. If my output says they are anywhere else other than 5 steps left, then my model is broken and running it multiple times and averaging it doesn’t fix the issue.
For example, let’s say my model actually says take a step left if the number is less than 0.5 AND if Neil Ferguson is horny. Unless Neil is horny every second, then my output will be wrong. If Neil is horny only 1/2 of the time, then the random walk will be too far right. Averaging the outputs will not fix that error.
And I work in insurance modeling. The comment about insurance modeling is a bit too kind! Better than what Neil is giving us though.
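The random-walk argument can be made concrete: with a seeded generator the walk is fully reproducible, while a hidden extra condition of the kind described biases the result in a way that averaging over many runs cannot repair. A sketch under those assumptions (not the Imperial code):

```python
import random

def walk(seed: int, steps: int, bug: bool = False) -> int:
    """Position after `steps` moves: -1 if the draw is < 0.5, else +1.
    With bug=True, half the left-steps are silently dropped (the hidden
    extra condition from the example above), biasing the walk rightwards."""
    rng = random.Random(seed)
    pos = 0
    for _ in range(steps):
        if rng.random() < 0.5:
            if bug and rng.random() < 0.5:
                continue        # buggy path: the left step never happens
            pos -= 1
        else:
            pos += 1
    return pos

# Reproducible: the same seed replays the same trajectory.
assert walk(7, 100) == walk(7, 100)

# Averaging many independent runs does NOT cancel a systematic bug:
# the buggy ensemble drifts right while the correct one hovers near zero.
good = sum(walk(s, 1000) for s in range(200)) / 200
bad = sum(walk(s, 1000, bug=True) for s in range(200)) / 200
assert bad > good
```

This is the distinction at stake in the thread: seed-handling problems wash out in an ensemble, but logic bugs shift every run the same way and survive the averaging.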
I don’t think you understood the bug she was saying was a disaster (the only one she really made a deal of… though in my long comment above, I deal with all her points). The error is not systematic. It is simply that saved state incorrectly codes a seed. So, using your random walk example: if you ran it once, you obviously started with a seed that generates your vector of random numbers. If you now run it again with the same seed, you will get the same answer. Lose that seed, and running it gives a different answer, but a perfectly legitimate random walk. Since what you are interested in is the average of a large number of walks, the result is the same whether there was a bug in the seed saving or not – because in both cases what you need (and get) is a large number of independent runs. The seed-saving issue does not compromise the independence of the runs… so no problem!
Correct, if you need randomness, you run such a model by providing a controlled range of random *inputs*. You don’t build randomness *into* the code *unless* (i) you report them for each run, and (ii) you can override them as inputs for testing, validation, QA and reproducibility.
Neal,
Bob’s response to your comment is right. Your comment shows that you do not understand how programming works. Also, given exactly the same inputs, I am afraid you would expect exactly the same outputs, even in biology.
In biological systems, there are so many variables that one can never know all the “input” values. If you measure only a few things in an experiment and get “random” results, this doesn’t prove that life is inherently random. It just shows that you haven’t measured everything.
At a quantum level, there may be an inherent randomness in some processes, but not for computer programs, and not on the macroscale of life.
Your simulation has to produce identical results with an identical seed, otherwise it would be impossible to test for correct output. You can use random numbers as the seeds, and run many simulations to model reality, and then see what you get. But if two runs using the same seed produce different output, that’s not a good sign.
They don’t. There was a problem with saving the seed, so when you thought you were using the same seed, in fact you were not. For the rest, running many runs gives the same answer whether you successfully saved state between runs or not.
You are confusing your model with code.
Computer code is idempotent and deterministic. That is, for a given set of inputs it will produce the same results. If this were not so, then computers would be pretty useless. Mathematically this makes sense, as a computer processes binary data with a limited range of mathematical and boolean operators and branch routines.
Now you want your model to be non-deterministic. You do that by introducing some randomness, but that has to be under your control – not via some bug in the code, a race hazard, or a timing event in the thread scheduler or some such. You want to be able to actually test the model under controlled circumstances, and this clearly wasn’t possible with the code Ferguson wrote.
Getting randomness in computer systems is actually pretty hard and an area of study in itself.
The critique of Ferguson’s code appears to be valid. I don’t think you read it properly, or you didn’t understand what was said. You may be a biologist, but you ain’t no computer scientist.
If what you’re saying were true, hardware random number generators, such as those based on thermal noise or quantum phenomena, would be pretty useless.
I have not read the code, but earthflattener apparently has. The non-determinism (which – we are told – was not present in the original code, and was fixed shortly after being discovered in the refactored code, and since the refactored code is new, was thus very short lived, and was presumably not present in the code Ferguson actually used, which is his original code) was due to some garbling of the seed used to restart the random sequence, which just means that a different pseudo-random sequence was used in his stochastic model. This affects reproducibility but is immaterial to the correctness of the stochastic model. You would get the same effect if you used a hardware random number generator, which is superior as it gives a more truly random sequence. If your criticism were correct, that implementation would be incorrect too.
Yes, Ferguson is no computer scientist. This is precisely the point – which you seem to be missing.
If someone presents a substantive analysis of his model which shows it is substantively wrong, or substantively right, or innovative, or not innovative, etc., that will be an interesting read. So far, all I’ve seen is a bunch of haughty software engineers sneering at and picking nits in the quality of Ferguson’s programming work from the perspective of modern software development, assuming a professional software development environment, assuming he is a professional software developer (and nothing else) who has other professional software developers, resources and time at his disposal to refactor his code, document it, devise and run automated regression tests, and apply for CE and UL marks for his product before launching a formal release.
The reality is, he is an academic, he is not a software engineer, he’s not likely to have a great many resources at his disposal, he was responding to a crisis, and he had a very limited amount of time. He just grabbed whatever he had available.
And, he is an epidemiological modeler, not a software engineer, so this is the context and the yardstick by which his work should be assessed.
Somehow it looks to me like all these armchair warriors, if it were 1939 today, would be sneering at the engineers who built Spitfires to defend Britain from the Germans because they didn’t get their CE markings, because they didn’t document their designs properly, and because they didn’t use the proper software tools, before letting their planes take off.
I am a lay person who does not understand computer modelling….but for such huge decisions to be made without adequate peer review of the data is shocking.
Code should always be unit tested by someone other than the developer. Rule number one.
Disagree. Unit testing is the job of the developer. All other forms of post-unit testing are carried out by testers/reviewers.
Many thanks indeed for putting in the time to review the code and to write your informative review.
Thanks for this article – I wrote C code solidly for 5 years – and still do bits and bobs. It does not surprise me one bit – because I knew this was a scam more or less from the “get go” – due to the investigation I did into the “Swine Flu” affair back in 2009. It’s not about public health – it’s about control – and selling pharmaceuticals (tamiflu and vaccines in 2009 and vaccines now). See my report at https://cvpandemicinvestigation.com/ if interested.
So the problem is one of computational mathematics, rather than the software development most programmers do. Thankfully I have some learning in computational mathematics. Unfortunately, the vast majority of computational mathematicians are programming amateurs. Maths is the end goal, and programming is just a means to get there. As a result, their software is crap. Ergo, most of the critique is spot on. HOWEVER… the excuses Imperial gave are valid. Stochastic mathematics does not require exact and reproducible results. Its aim is to simulate trends and patterns. Therefore the key thing to critique is the model. The author has failed to do this, so the entire article is bluster. The author’s background is in database programming, not mathematics, so I wouldn’t be surprised if the subject were outside her area of expertise.
While Stochastic mathematics may not require reproducible results, software which simulates such models does (in order to prove that the software works as expected and that bugs are not introduced). This is why in modelling programs the stochastic model is given a seed to initiate the randomness of the model.
During development we can use the same seeds repeatedly to ensure we haven’t ‘broken’ the model by introducing bugs which cause unexpected outputs.
For production use we use as many seeds as desired to repeatedly run the model (introducing the required level of randomness) before averaging results.
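That two-mode workflow can be sketched directly: a pinned seed serves as a regression test during development, and an ensemble of seeds is averaged for production estimates. The model function below is a hypothetical stand-in, not the covid-sim harness:

```python
import random
import statistics

def model_run(seed: int) -> float:
    """Hypothetical stand-in for one stochastic model run."""
    rng = random.Random(seed)
    return sum(rng.gauss(100.0, 15.0) for _ in range(50)) / 50

# Development mode: a pinned seed is a regression test. If a code change
# alters this value, the change broke (or deliberately altered) the model.
baseline = model_run(seed=12345)
assert model_run(seed=12345) == baseline

# Production mode: many independent seeds, then summarize the ensemble.
results = [model_run(seed=s) for s in range(500)]
print(round(statistics.mean(results), 1), round(statistics.stdev(results), 1))
```

If a refactor changes `baseline`, either a bug was introduced or the change was intentional – and either way it gets noticed before any production run.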
This should be a national scandal but the media will probably make little, if anything, of it. We live in a world of staggering absurdity that the govt has been consulting Ferguson on pandemic issues given his poor track record of predictions and inadequate software engineering practices.
On a personal level I suggest that your personal agenda shines through this absolute pile. You should be ashamed of yourself. But hey, you designed a company’s database product, so you must know what you’re talking about. Wow, a complex system has flaws and bugs. O.M.G.
Embarrassing. I almost pity you.
Good article.
What’s troubling about this is nothing that came from this code could possibly have been peer reviewed in a meaningful way.
The code should have been released and written with the right abstractions so the bits of interest to epidemiologists are hundreds of lines of high level Python rather than thousands of lines of C/C++. Then other academics could have tinkered and built on it.
Agree that the right abstractions are important here. The functions in the code are too long; they should be broken up and unit-tested.
Don’t agree that Python is the right language since it isn’t statically typed. For serious code we should be using serious languages, like Haskell, Agda, F#, Scala, etc.
I was expecting this model to be bad but this is an order of magnitude (at least) worse than I’d expected.
And the team has then developed a culture dangerously tolerant of mistakes. For instance, if the model isn’t deterministic, consistently giving the same answer for the same inputs, I think it’s wrong or perverse to average the outputs over multiple runs, because you don’t know whether you’re averaging outputs resulting from your methodology or from bugs in the program. It’s bizarre.
As soon as they started getting or being told about non-deterministic results they should have put it on hold until the issue was eliminated.
All this reflects very badly on whoever was funding them; on the journals publishing their results and on the peer review process (if any) in the field of epidemiology and mathematical biology.
As a remark in passing, I suspect that similar critiques could be made of the climate change models if anyone could get them into the public domain.
Similar critiques have been made. Remember the East Anglia stuff?
I agree with you. The peer review process should involve scrutiny of publically available computer code, especially when the work is funded by the tax payer.
Much of the “science” that is published by journals is nonsense. Ioannidis wrote an interesting paper in 2005 called “Why Most Published Research Findings Are False”.
I think we can honestly say that any papers based on this steaming pile of code should be withdrawn because the output of it cannot mean anything.
I’m glad to see there has been some critical analysis of the code behind the modelling. I am a hydraulic modeller myself and ultimately your model is only as good as the data you use and the assumptions you make (garbage in, garbage out). However, I fundamentally disagree with the author’s final point, that modelling is best undertaken by the insurance sector, a sector which has dubious interests at best. At least in my sector, the quality of modelling undertaken by insurers is largely considered a ‘black box’, and at times it has thrown up interesting results. All the more interesting when you consider that insurers stand to make a huge profit off the results of the modelling. To go the same way with epidemiology would be a mistake in my opinion.
Hi there,
Have you taken a look at this repo as well: https://github.com/ImperialCollegeLondon/covid19model/
They seem to produce a lot of junk code.
He had an excuse: he was too busy shagging.
I believe he deliberately got caught to give him a way to jump ship before this came out.
“Note the phrasing here – Imperial know their code has such bugs”
You quote Matthew Gretton-Dann, but isn’t he just a software engineer working on this refactoring, not someone from Imperial?
@Sue is there any way to reach you via email? There’s a few points I’d like to discuss.
Publicly I’d like to point out that it’s not actually “Imperial College” staff that was replying to concerns raised in those GitHub tickets, but in many cases it’s one Matthew Gretton-Dann.
Mr Gretton-Dann is not a faculty member of Imperial College. In fact, he is an employee of GitHub, which in turn is owned 100% by Microsoft. He joined GitHub/Microsoft late last year.
Neil Ferguson pointed out on March 22nd that GitHub/Microsoft would be taking over salvaging his code:
https://twitter.com/neil_ferguson/status/1241835456947519492
Isn’t this curious? GitHub or Microsoft wouldn’t be an obvious partner for such an undertaking, would they?
Wouldn’t it be mighty interesting to find out why GitHub, which is home to hundreds of very experienced software engineers who’d immediately realize that Ferguson’s code does not actually do what it claims it does, why GitHub would invest resources to give more credibility to this project? Or why Microsoft would?
Github is just a repo-hosting company.
Thanks for that Phil.
I’m a software developer and this report rang true to me on so many levels.
With regression testing you CANNOT write tests to retrospectively fit the model because you always assume the correct pathway is a successful result and mark it as correct. This is flawed logic!
Proper test-driven development requires that you write an assert to say something is expected, then write the code required to achieve it. Then when you run all the asserts in sequence you get a thorough test of the pathways, and any errors that have crept in during further development are highlighted, allowing you to fix them before release; at the end you have a very robust piece of software, where you can remove the assert tests and release as ‘tested’.
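As a small illustration of this assert-first workflow (the `doubling_time` function and its growth-rate maths are invented for the example, nothing to do with the Imperial code):

```python
import math

# Step 1: write the assertion stating the expected behaviour first.
def test_doubling_time():
    # At 10% daily growth, cases should double in roughly 7.27 days.
    assert abs(doubling_time(0.10) - 7.2725) < 0.001

# Step 2: only then write the code the assertion demands.
def doubling_time(daily_growth_rate):
    """Days for case numbers to double at a constant daily growth rate."""
    return math.log(2) / math.log(1 + daily_growth_rate)

# Step 3: run the asserts; any regression is caught before release.
test_doubling_time()
```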
In my 30 year career, I’ve never released a piece of code that didn’t come up to my own personal quality standard. I’m not a scientist, but you would expect that the amateur coders involved would have a certain sense of pride in getting these ‘models’ functioning correctly to begin with. 15 year old code and “we didn’t have time to fix it” is NOT an argument when the model is used to predict real life/death outcomes.
I was horrified that they were unable to fix the bugs and that the results were different on different CPUs. Their argument of taking a mean average is valid, but only over many hundreds of runs if the results are as random as they suggest; I suspect that they ran it no more than 25 times and took that average. It’s shonky to the core. Simply put, I would have been sacked if I had written this software, and any developer worth their salt will tell you the same. Garbage in, garbage out. If it’s not done properly, don’t bother.
Thank you for this report. All very well but what about every other country that applied a lockdown. Surely they didn’t all base theirs on Ferguson and ICL?
Yes they probably did. Neil Ferguson advised the WHO that advised World Leaders on how to respond to a virus threat that apparently Neil Ferguson stated would kill millions.
Considering that most of them went into lockdown while the UK was still overtly acting as though it was in denial while apparently surreptitiously pursuing the herd immunity policy from the start, and before Neil Ferguson ever spoke in public, this has to be the inevitable conclusion.
Ferguson’s estimates were known to the rest of the world well before most countries took significant action, with the exception of Taiwan and maybe Singapore.
I’m in New Zealand and as a layman I had read about his predictions well before we did anything except halt flights from China (and then it was really only a partial stop)
Wu’s study was the first to say Covid was already everywhere. At the time thousands of Chinese were hospitalised and dying. Curious if anyone here debating seeding in stochastic models has ever played pachinko?
No, the country I’m in did indeed use Ferguson’s report as a justification for the most heavy handed lockdown system in the world.
It was revealed that Drs Fauci and Birx used the Imperial College modelling to help persuade President Trump to close the US down.
https://www.nytimes.com/2020/03/16/us/coronavirus-fatality-rate-white-house.html
Hmmm…being the devil’s advocate here…
If Ferguson or ICL had not issued any report, would not our world leaders or media have found something similar to promote their aims? They were already in a panic over the original outbreak in Wuhan.
Are ICL the only people the western world, except Sweden, listen to?
How does a single professor have that much power? So sad.
I was saying something similar the other day – before he resigned. Ferguson has seemingly put himself in the position of Robert Oppenheimer: infamous as the inventor of a ‘device’ that may kill millions.
Possibly even worse than that: he has emphasised his own influence over the politicians in various interviews rather than just being happy to remain a scientific adviser.
Before Staat-gate he was already known in the US as well as over here, but now everyone knows about him.
If the government need someone to blame, he’s already volunteered.
There’s a slim chance history will (mistakenly) record him as a hero, but I wouldn’t bet on it.
But good link, thank you!
That’s a broad brush. “A lockdown” in the UK is different from “a lockdown” in Sweden, or Denmark, or Argentina, or India.
The universal Source of Truth for many countries in March was indeed not the fraudulent Imperial College study but the fraudulent map from Johns Hopkins which, as it happens, did not track confirmed cases of COVID-19 in real time but actually used a Python application to extrapolate and exaggerate public data, to instill fear that the virus was spreading faster than it actually was.
Slightly different version of fraud.
A hysterical media reporting every individual case of coronavirus put huge pressure on people in positions of authority. Then, once one authority started imposing restrictions on travel, it became a competition between authorities to impose stronger and stronger restrictions. This process culminated in governments placing their countries under house arrest, which is the most absurd overreaction possible.
Imagine if every individual case of flu was hysterically reported in detail by the media…
Who actually is the author? They don’t appear to exist on the internet which seems very suspicious to me.
Sue Denim = Pseu Donym
Very well explained & I agree: insurers should have the funds to model. Sadly, universities are not accountable & should not receive public funding if they are not transparent. Most people are smart, & using jargon to fool them will not work; eventually it will come out.
But insurers would only model things that affected their bottom line. Actuarial science is well and good, and orders of magnitude better than economic modelling, upon which millions upon millions have been flayed by excellent programming but crap modelling. No excuses, but a reminder that even with lockdown Italy and Spain had massively swamped ICUs and people dying at home; Brazil has people dying in the streets. Hardly flu-like.
Ferguson’s model used an infection fatality rate (IFR) of 0.9%[1], whilst more recent data suggests that the IFR should be within the range 0.1%–0.41%[2]. According to my attempt at a cost–benefit analysis[3], this made the difference between justifying a lockdown and not justifying a lockdown.
[1] https://www.imperial.ac.uk/media/imperial-college/medicine/sph/ide/gida-fellowships/Imperial-College-COVID19-NPI-modelling-16-03-2020.pdf
[2] https://www.cebm.net/covid-19/global-covid-19-case-fatality-rates/
[3] https://medium.com/@martinvsewell/the-coronavirus-lockdown-in-the-united-kingdom-a-cost-benefit-analysis-acd19f635dae
Interesting method Martin. I’m not sure it’s the most compelling way to do it though. Reducing everything to £ by putting a value of £221k on a life saved by lockdown and comparing that against an estimated economic loss of £2.4bn a day caused by lockdown is fair enough but it’s fraught with difficulties and puts you in the ‘all you care about is £’ camp. That’s not true but …I think it’s better to stick to assessing lockdown in terms of death, suffering and hardship. Suffering and hardship are difficult to quantify, so you could just stick with deaths to make the case.
First, I would argue that lockdown doesn’t reduce Covid deaths. Assuming there won’t be a life-saving vaccine any time soon, or any life saving treatment, and that the NHS will have the capacity to cope, then the number of Covid deaths will be the same with or without lockdown. Lockdown just affects when the deaths occur. However, lockdown causes non-Covid deaths, for all the reasons we are familiar with. We don’t know how many but you don’t have to get into arguments about the number – it’s more than 1. QED.
I can’t see how it’s possible to argue with that. You don’t need to be a mathematical biologist to work it out, it’s not a matter of judgement. It’s just bleedin’ obvious. So much so that you could reasonably argue that the Government has wilfully and knowingly inflicted harm on its citizens by continuing its policy of lockdown, once the danger of the NHS being overwhelmed had passed.
Every hour that passes – because the NHS is not going to be, or is not in any immediate danger of being, overwhelmed – the non-Covid deaths are adding up. This is a fact obvious to every doctor and nurse in the NHS. But the hospitals are not able or willing to re-organise the pathways to increase the number of non-Covid cases they need to treat, because there is an edict from NHS England (I presume) to keep certain numbers of ITU beds ready (a different number for each Trust) for the next ‘surge’ as predicted by the modellers. Who the latter experts are and where they are based is a mystery. Btw, my hospital has been warned of at least 6 different surges since the end of February and not a single one of them has turned out to be correct.
Wow this is appalling. This to me is far, far worse than Ferguson breaking the rules he’s endorsed, bad as that is. I want this all over the papers. Why isn’t it????
They shouldn’t even be attempting to repair this code. They should be starting from the *model* *itself* and implementing *it*, not trying to repair a broken implementation.
Which boils down to the fundamental problem with all this. It’s not an issue of “computer” modelling, it’s an issue of modelling. What’s the actual model they are using? Not what’s the flawed implementation of the model?
I think code used to drive government policy should, at least, be testable. This code is clearly not. I’m not demanding beautiful or even clean code (as a top programmer would write); I just expect a minimal level of testability, and that the code should be designed with testability in mind. All top programmers agree that testability is a critical requirement for well-engineered code, as used in finance, medicine and critical engineering applications – all code upon which lives, infrastructure and money depend. For example: in testable code, all random numbers must be entered as parameters so that specific functions will be deterministic. Without testability one cannot know the code is deterministic; otherwise how can one know the code models what the designer thought s/he was doing? My own experience of code reviewing shows only about 5% of coders write tests and testable code! So I’m not surprised that a non-professional like Ferguson would write the kind of code described by Sue.
PS: I’ve not read Ferguson’s code so I’m taking this on trust from Sue.
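A minimal sketch of the kind of testability this comment asks for – randomness injected as a parameter so the function under test becomes deterministic (the function and all its numbers are hypothetical):

```python
import random

def simulate_transmissions(contacts, p_transmit, rng):
    """Count transmissions across `contacts` meetings; the caller supplies
    the random source, so tests can make the result deterministic."""
    return sum(1 for _ in range(contacts) if rng.random() < p_transmit)

class ScriptedRandom:
    """Test double returning a fixed sequence instead of real randomness."""
    def __init__(self, values):
        self._values = iter(values)
    def random(self):
        return next(self._values)

# Deterministic unit test: only the draws 0.1 and 0.4 fall below
# p_transmit = 0.5, so exactly 2 transmissions must be counted.
assert simulate_transmissions(4, 0.5, ScriptedRandom([0.1, 0.9, 0.4, 0.7])) == 2

# Production call: a genuinely random, but still seeded, source.
outcome = simulate_transmissions(1000, 0.5, random.Random(1))
```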
I notice that the author didn’t talk about experience with stochastic models, and I’m afraid I think it shows. I would agree with dr_t’s comment but choose to make some additional comments on the specific points that she makes.
First, on the non-determinism of issue 116. The author of the criticism doesn’t appear to understand that this is an ensemble method. You do not take the result of a single run as being ‘The Result’. Rather, you run the model thousands of times and take statistics from the ensemble of results. The mean, for example, might provide the best estimate of the number of deaths for a given input set of parameters; the variance of the ensemble gives a measure of the uncertainty. The discussion on issue 116 clearly shows that what was wrong was an issue with the storage of a seed, meaning that if the program is invoked in the manner discussed, a single result is not duplicated. However, if the bug is only at the level of the seed, as both the author of issue 116 and the respondent agree, then this does not matter for the intended usage of the method. It means that you cannot replicate individual instances of the ensemble – but the ensemble average is not affected. It is rather like trying to calculate the probability of heads of a weighted coin. You toss the coin 1000 times and calculate the probability of heads as #heads/1000. Now, suppose somehow you can’t read your writing for what you noted for the 672nd coin toss. What do you do? You can’t rerun the 672nd toss. Does this invalidate your estimate of the mean? No, just run it one more time. That is the seed issue.
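The weighted-coin analogy can be made concrete in a few lines (the 70% weighting and the ensemble size are arbitrary choices for the illustration):

```python
import random
import statistics

def toss_run(seed, n=1000, p_heads=0.7):
    """One realisation: number of heads in n tosses of a weighted coin."""
    rng = random.Random(seed)
    return sum(1 for _ in range(n) if rng.random() < p_heads)

# Ensemble of 200 independent runs; the statistics live at this level.
ensemble = [toss_run(seed) for seed in range(200)]
mean_heads = statistics.mean(ensemble)
spread = statistics.stdev(ensemble)

# "Losing" run 67 and redoing it with a fresh seed barely moves the
# ensemble mean, which is the point being made about the seed bug.
redone = ensemble[:67] + [toss_run(seed=9999)] + ensemble[68:]
assert abs(statistics.mean(redone) - mean_heads) < 1.0
```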
Now I guess I understand why the author is so upset at this. She is a programmer, but I suspect not a mathematical modeller. Certainly, it seems that she doesn’t ‘get’ the usage of such a model. But why would they fix such a bug in that case? Well, why not? I would fix it if it were my code, though I know it doesn’t affect results. For my code, it ‘matters’ to fix such a bug because some end users will try to interpret individual runs as though they were meaningful. They like the look of the results or whatever. No matter how many times you tell them that it is one outcome, they want to believe ‘this realization’, so yes, we ensure that individual runs are reproducible. But that is because we sell the code; if they want to slightly misuse it, then so be it. In the case of this code, it is not misused by its operators. They know it is a stochastic algorithm and use it correctly. An individual outcome is not sacrosanct.
Regarding issue 30, this is even less important to the users of this code. So it doesn’t run correctly on a particular supercomputer, but as the authors of the issue said themselves, it runs fine on their laptops. Again, if you are selling the code, perhaps you need to make sure it is machine independent – even on Crays! But the small group of users of this software know full well which systems it can run on and which it can’t. Not an issue for the problem in hand.
How about the hotels issue? Well, HotelPlaceType is mentioned 16 times in the method. So the author has decided that its exclusion from one loop is necessarily a problem? Really? Get a grip! The comment most certainly does not lead us to a discussion of R0.
However, since we are at R0, first let’s dispense with the reference she made to the Google machine learning paper. That talks about feedback. Now, it is not clear how R0 is treated in the code – at least not without a lot more reading. The author of the criticism doesn’t shed any light on it either, just trying to swat it with the Google paper. Well, it is a stronger fly than that! There is a base R0 that is a population parameter. That is the one used as input. It is not output from the same code – it is based on population observations (admittedly still a bit ill-defined, although the latest work seems to think it is HIGHER than that used by Ferguson, weakening even further the herd immunity strategy: https://wwwnc.cdc.gov/eid/article/26/7/20-0282_article). The point is, R0 has an objective meaning at the outset of the disease, prior to any strategy being put in place. It may depend locally on population density etc., but it has a global population meaning, and the software should account for local variation and update it based on behavioural change.
Suggesting all academic funding for epidemiology be withdrawn? The words of a typical 2.2 student. I’m not saying that insurance companies couldn’t do a good job, though this kind of problem is not the sort that any insurance company actually insures against. Have you tried to make a claim for disruption to your business? The code has no doubt been run tens of thousands of times by experts. She provides no evidence of any bug of substance. It is fit for purpose for those experts. It is not, and does not have to be, some out-of-the-box database or email server of the kind the author has experience of. The problems she works on are extremely important, but mind-numbingly routine. Single-usage scientific code does not fit her paradigm.
Finally, let’s look at a wee bit of empirical evidence. The death rate in New York is rapidly approaching 1500 deaths per million. It is not clear how many people have been exposed to the virus in NY, but it is probably safe to say not more than 25%. Left to its own devices (i.e. with no interventions), it is pretty much certain that this rate would be achieved elsewhere. So, scaling to the population of the US as a whole, this would lead to near enough 500,000 deaths with 25% exposure. As this is a new virus, there is nothing inherent to hold it at 25%, so a factor of 3 times more deaths is possible under the worst-case scenario. That leads to 1.5 million deaths – so the worst-case scenario of Ferguson’s model is not so far-fetched after all.
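The scaling in this comment can be sanity-checked in a few lines (every figure below is the commenter's assumption, not established data):

```python
# New York's approximate rate, applied to the whole US population.
deaths_per_million_ny = 1500      # commenter's rough NY figure
us_population_millions = 330      # rough US population

deaths_at_25pct_exposure = deaths_per_million_ny * us_population_millions
# 495,000, i.e. "near enough 500,000 deaths" at 25% exposure

# Worst case: nothing holds exposure at 25%, so 3x more is possible.
worst_case_deaths = deaths_at_25pct_exposure * 3   # 1,485,000
```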
Entropy in a model is not a problem, but you must be able to reproduce an individual case in order to ensure developments are not adversely affecting agreed-upon “good” outcomes. Otherwise you would have to run however many ensembles and take statistics for each incremental change, or else risk adversely affecting your model in unknowable ways. Even then it would not be deterministic, and still liable to mishap.
The entropy must be controlled. This is why pseudo-random number generators are used. If all stochastic decisions are sourced from a unified seed, it is possible to prove robustness of the code whilst achieving the appropriate level of randomness. If they are all individually random, you have no control over your own model. Given there are often significant feedback paths in such models, the end result could end up deviating significantly from its design, quite apart from any expectation of the author (therefore, in an uncontrolled manner).
Uncertainty in a model is derived from running multiple times _with different documented seeds_ and from tweaking the input parameters according to their uncertainty, not simply from running in an uncontrolled manner and having faith the outcome is trustworthy, which is essentially what the Prof seems to be doing.
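A toy sketch of what "all stochastic decisions sourced from a unified seed" looks like in practice, with uncertainty built from documented seeds and parameter tweaks (the branching-process model and all its numbers are invented for illustration):

```python
import random

def epidemic_size(r0, seed, population=2000):
    """Crude branching-process outbreak: every stochastic decision is
    drawn from the single seeded generator `rng`."""
    rng = random.Random(seed)          # the one unified seed
    susceptible, infected, total = population - 1, 1, 1
    while infected and susceptible:
        new = 0
        for _ in range(infected):
            # Each case makes a random number of contacts, mean ~ r0.
            for _ in range(rng.randint(0, int(2 * r0))):
                if rng.random() < susceptible / population:
                    susceptible -= 1
                    new += 1
        infected = new
        total += new
    return total

# Uncertainty from documented seeds *and* parameter tweaks, not from
# uncontrolled nondeterminism: each (r0, seed) pair is reproducible.
ranges = {}
for r0 in (2.0, 2.5, 3.0):
    sizes = [epidemic_size(r0, seed) for seed in range(10)]
    ranges[r0] = (min(sizes), max(sizes))
```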
The questions of how R0 is treated in the code is really the wrong question. The question is how is it treated in the _model_ this codes implements. It seems there is no such well-defined model, he has just hacked on old code as fast as he could to get a paper out.
The hotels issue is meaningful because it appears to be random, and may be a symptom of less obvious but more devastatingly random coding decisions elsewhere in the code. There is enough doubt raised by that one observation to cast reasonable suspicion on the rest of the code. Of course, there may be a good explanation (assumption that hotels are closed?), but this is guesswork.
Whether or not the code has been run by experts is irrelevant since it was supposedly written by one. If he is emblematic of the current state of experts, your objection objects to itself. In any case, if the code is so old and undocumented that even the author cannot understand it, how on Earth are other experts supposed to vouch for its robustness and accuracy? Yeah, there is no way.
Regarding code portability, it absolutely matters that it does not run similarly across different platforms, if that is the case. It proves it is neither robust nor reliable. The core of the code is supposedly 13 years old – or do you suppose the good Prof is using the same machine as in 2007?
I gave a long reply, John, but it has gone. Not sure if the mods are super strict; it’s not the only one that went missing. Maybe it’s just very buggy.
The gist of my answer is that the bug was a regression: the initial code did not appear to have it. The only real reason for reproducing an individual case is, as you said, to ensure that the code is correct originally; it can be run safely afterwards without. Indeed, there are other tests which dispense with the need for repeatability of individual cases, such as ensuring that the ensemble is ergodic – i.e. not changing over time. In a department such as Imperial’s, with its strong maths focus, I think it is a safe bet that one of the PhD students has tested it – it’s a standard question for stochastic models. Finally, portability is really not an issue, except perhaps, as you said, for updating the in-house system on which it is run. Two mitigations: the portability issue was to do with the compiler, and I’m guessing they have not changed compiler recently. Secondly, PhD students… if you have done a scientific PhD, you know that you will not rely on some 3rd-party code without proper testing. The origins of the code are 30 years old – not 13. It is a very safe bet to say that it is as OK as any other out there.
“finally, let’s look at a wee bit of empirical evidence. The death rate in New York is rapidly approaching 1500 deaths per million. It is not clear how many people have been exposed to the virus in NY, but probably safe to say not more than 25%. Left to it’s own devices (i.e. with no interventions), it is pretty much certain that this rate would be achieved elsewhere. ”
Nice piece of selection and extrapolation from an outlier there.
Hi Mark, it is not an extrapolation from an outlier, though it is a bit of handwaving. Not an extrapolation from an outlier because it is precisely what you expect to happen with a virus that takes hold. An outlier is something inconsistent with your idea/model. This is perfectly consistent. It also is a very big sample – the biggest one actually. What it is doing is allowing us to see the ‘future’ of other areas if they were to get the same level of infection.
A virus is a virus. It has no politics. It just spreads. Left alone, this would spread, and without a good reason to see otherwise, the very large sample that is New York is an indicator of what might happen elsewhere. There may be reasons why it won’t happen, but they need to be established. New York is the elephant in the room.
‘…..it is pretty much certain that this rate would be achieved elsewhere’
Not if you look at Hong Kong, population not a great deal less, or less dense, than NY.
Deaths in high density HK? Four, precisely. Differences? Hygiene, health/age and climate, factors similarly affecting the impact of other coronaviruses/rhinoviruses (the common cold) that infect humans.
Hmmm……
This discussion is, in any case, academic, the subject for an independent public enquiry, which must be sooner rather than later for this government to retain any credibility whatsoever.
Public opinion can turn, will turn, very quickly.
Furthermore, Covid-19 figures for New York are no basis for any definitive statements whatsoever, since the categorisation of Covid-19 mortality is as broad in New York as it is in this country (UK 86% of deaths in March from Covid-19, Italy 12%…) and doubtless for the same reason:
https://www.youtube.com/watch?v=g5f_6ltv7oI
The public enquiry must also look very closely at why models were used to support major policy decisions when good, extensive, and reliable data was clearly not available to input into them so outputs could only be the equivalent of a wild shot in the dark.
Which is more similar to NY? Hong Kong or the rest of the United States?
I have opened a github issue calling for the retraction of studies based on this codebase: https://github.com/mrc-ide/covid-sim/issues/165
You are retired right? Why don’t you write a better simulation and publish it so I can review it. What else do you have to do?
An excellent, professional analysis. Depressing reading to be honest. Time some robust modelling was offered from an open source. Would love to see those results.
The job of an Academic is to think out the concepts and specify the ideas to be used in modelling any topic.
Then it’s the job of professional software engineers to implement those ideas in a reliable and rigorous way.
In the same way, physicists produce concepts that engineers use to create things. Einstein worked on matter-energy duality, but you didn’t have him on the bomb-making team….
It looks as if this model is not fit for purpose.
If this is so, why are we blaming Ferguson? Surely the correct people to blame are the faceless Civil Servants who accepted this heap of rubbish and let it determine their policy decisions without any question?…
Civil servants, faceless or otherwise, do not determine policy. Policy is determined by politicians.
I have been working with CFD models and supercomputers for a while and have used Monte Carlo simulations too. I can tell you that validation is always the first and biggest issue. Anyone can write code, but the difficult part is the validation, which is what shows how accurate it is.
The way Imperial College London promoted their software without proper validation is totally unacceptable to me.
If you use Monte Carlo (or better, if you write Monte Carlo), then you know that ‘losing’ a seed changes nothing. The result depends on the ensemble. The ensemble was not affected by the seed storage bug.
Anything of this nature that has the Microsoft people assisting must already raise serious alarm. This is a disaster and heads should roll!
So who, what, why, when, where and how is our scientific community getting this so fundamentally wrong?! Climate change models would also seem to suffer the same implausible results relative to the reality of the situation! I’m not a scientist or mathematician, but a 52-yr-old businessman, whose every sinew of thought understood that you cannot close your economy on such unpredictable, unreliable science without causing ‘known’ catastrophic consequences! Add to the mix a political class that fundamentally didn’t understand it either, and together it’s affected millions of people unjustifiably so…
It’s an emotive subject, I consider any death to be loss to my nation and society, but this needed someone to rise above the Scientific Community and question their so called results!
https://youtu.be/EYPapE-3FRw
I have long been a sceptic of Ferguson’s modelling, and therefore to read a critique by an expert is fascinating. But just for the sake of objectivity, and given this forum, can I ask whether Sue Denim approached this task already a sceptic? It’s not crucial, but it makes a difference to how I read the piece.
Hard to believe. Govt software should maybe be open source; if we can do it with encryption, there’s no reason we can’t do it with anything else.
Conclusion makes perfect sense. The role of liabilities and loss adjustment is based on all available data, which would allow a sifting of unusual or erroneous data before offering realistic results. Of course, a degree in hindsight is more valuable than rocking horse manure at the moment.
Were they trying to tweak their ‘models’ to give them the results they wanted, like the Univ of E Anglia climate models?
That’s taken for granted. The paper even telegraphs this in advance. Before even starting, Ferguson tells us that Covid-19 is the same level of threat as Spanish Flu. There are no references for this statement. He doesn’t say “I was astounded to find my model producing a similar level of threat to Spanish Flu…”. It was his assumption on the way in.
Thank you for your expertise and dedication to accurate coding.
In France since late last year, here’s a paper to show. It’s NOT the ‘Chinese virus’. https://www.sciencedirect.com/science/article/pii/S1567134820301829?via%3Dihub
“Academics write code on the back of fag packets, professional software development develops professional software”
Meh. Grades of software development exist. That’s obvious, and not the point.
Instead, the points are :
1) Academic dev code is useful beyond proper professional software only when the output (in a figurative, intellectual, insight sort of sense) adds something important.
What does this particular academic dev model add, beyond what would have been available from getting a professional team to do all this?
2) Code development grades exist. See rubric above. However – did anyone in Govt pretend that this hack code was not hack code? Were we misled in this way, or poorly advised?
Clearly using programs with so many bugs is completely unethical & totally misleading.
However, I’m more concerned that the statisticians and modellers turn their attention to the ridiculous inter-country comparisons of deaths from COVID and give them proper clarity & scrutiny, e.g. population density, recording of cause of death, etc.
Hey it’s even easier to debunk Ferguson’s model, check out “Peerless Reads” on YouTube.
The guy has been graphing the data, and while the growth is exponential in form, the rate of growth is always falling.
He’s compared legitimate data from different countries to the modified and fabricated, he’s able to show where different areas are inflating the number of cases to justify lockdown. He also showed, a couple of days after UK lockdown, how Ferguson’s projection was behaving like no other virus ever has or would.
All the complexity in the IC code is in simulating the distribution of the UK population and their interactions. The basic maths behind epidemics is relatively simple and was discovered in the 1920s. You get the same result as Ferguson with a simple Python program using his values of R0 scaled to 65 million people.
http://clivebest.com/blog/?p=9419
What he has not taken into account is a natural decay in R over time, as the super-spreaders with large international social networks get infected first. This reduces the contact rate between people as more normal social networks take over. Sweden is exploiting this and will reach herd immunity in a couple of weeks’ time.
http://clivebest.com/blog/?p=9498
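The 1920s mathematics referred to above is the Kermack–McKendrick SIR model. Here is a minimal sketch of the idea, with a crude `r_decay` knob standing in for the claimed “super-spreaders get infected first” effect. All parameter values are illustrative assumptions, and none of this is taken from either the Imperial code or the linked blog posts:

```python
import math

# Minimal Kermack-McKendrick SIR model, Euler-stepped.
# Parameter values are illustrative assumptions only.
def run_sir(pop=66_000_000, r0=2.4, infectious_days=5.0,
            r_decay=0.0, days=365, dt=0.1):
    """Return (peak_infected, total_ever_infected).

    r_decay > 0 crudely mimics a natural fall in R over time as the
    most highly connected people are infected first.
    """
    gamma = 1.0 / infectious_days          # recovery rate per day
    s, i = pop - 100.0, 100.0              # seed with 100 cases
    peak, t = i, 0.0
    while t < days:
        beta = r0 * math.exp(-r_decay * t) * gamma  # transmission rate
        new_inf = beta * s * i / pop * dt
        new_rec = gamma * i * dt
        s -= new_inf
        i += new_inf - new_rec
        peak = max(peak, i)
        t += dt
    return peak, pop - s

peak, total = run_sir()                   # constant R0
_, total_decay = run_sir(r_decay=0.01)    # R0 decaying over time
```

With a constant R0 of 2.4 roughly 88% of the population is eventually infected, matching the classic final-size result; letting R decay over time, as the comment above proposes, noticeably shrinks that total.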
A lot of points in just a short paragraph. However, it is not true that the SEIR models ‘give the same results’. You can choose parameters that give the same final outcome, yes, but they have no spatial context, nor do they properly account for the evolution of R0. Your one-parameter hack of the simple model is entirely heuristic, and your choice of a parameter for the reduction in R0 even more so. Not saying it is bad – it could be a useful modification to the simple models – but it doesn’t appear to capture the dynamics of spread.
In a sense, anyone can be a super-spreader: they just need to take the virus to a fresh community that is not practising social distancing. Consider the spread into the meat factory in South Dakota. The R0 jumped back to its usual ‘densely populated’ value in that community. Despite the fact that South Dakota in general didn’t have an issue, the spread in the factory was fast and lethal. The main point is that the virus does not spread homogeneously through the community as the ODE solutions suggest – and yours is just a modifier of that classic model. It jumps from community to community and forms clusters. Social distancing keeps the clusters small and manageable.
Fantastic work, Sue.
I used to do point-of-sale and some financial programming way back, so I was able to follow your well-written report, though I was brought up on BASIC rather than Fortran and then C++.
This amateurish bit of work has caused massive trouble time and again, in fact for at least 20 years.
On this basis perhaps the lockdown should be immediately lifted. Covid had spread across the country long ago, to the point that we may already be at herd immunity levels.
My hero: Prof Michael Levitt, and glad David King has shown up. Perhaps Boris will ignore Cummings and stop panicking like a headless chicken!
Pointless rubbish. The code is a detailed simulation intended to give an understanding of the nature of a phenomenon. It is not important to get identical results every time, because the underlying numerical assumptions are not measured in the first place; there is a great deal of uncertainty around each probability used in the model.
The assessment is made by someone used to testing software changes where deterministic behaviour is important to ensure that outputs vary only as expected. This is not the case in actual simulations. Also, the criticism that the software had not been rewritten for a long time but amended time and again is unfair. Microsoft peddled derivatives of the original MS-DOS for many years until the ground-up rewrite that became Windows NT (I may be wrong about this exactly, but not in principle).
I have worked for decades in both scientific computing and financial systems. There is a difference in approach to ‘accuracy of programs’ in the two worlds, and rightly so. In the scientific world there is much reuse and adaptation of old code, because funding is limited and “accuracy” is not important in the same way. In a banking system, a transaction must behave exactly the same way every time, and a whole collection of the same transactions in an online banking system must behave in exactly the same way. So testing is done using a logfile of actual transactions, anonymised, replayed through each phase of software changes. And despite all this, you still end up with errors in live operation, because somehow the same transactions arriving in the same sequence behave differently: the operating system, with its multiple processors, does not sequence them in an identical way as before. The environment within the system is never the same, because if nothing else the system clock’s date and time are different.
What is important in these situations is to publish the assumptions and the general nature of the algorithm. What you are looking for is the change in predicted outcomes on the basis of changes in assumptions. In Monte Carlo simulations of physical processes the probabilities are measurable and verifiable. In this simulation, most of the probabilities simply are not measurable.
Much of this simulation is about predicting the nature of the phenomenon and the possible effects of the changes it is possible to make. It is not about accuracy of numbers at all, so it is pointless talking about software being free from bugs. Producing bug-free software, except in the simplest of cases, is still a major challenge despite all the effort that has gone into it.
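One way to make the Monte Carlo workflow described above concrete – fix the seed so the ensemble is comparable run to run, then move an assumption and watch the predicted outcome move – is a toy branching-process outbreak. Every number here is an illustrative assumption; nothing is taken from the Imperial model:

```python
import random

# Toy seeded Monte Carlo ensemble: each run is a simple branching
# process (one case exposes `contacts` people, each infected with
# probability `p_transmit`). The seed makes the ensemble repeatable,
# so changes in the output reflect changes in the assumptions.
def mean_outbreak_size(p_transmit, contacts=10, generations=8,
                       runs=100, seed=42):
    """Average final case count over a seeded ensemble of runs."""
    rng = random.Random(seed)
    total_cases = 0
    for _ in range(runs):
        infected, cases = 1, 1
        for _ in range(generations):
            new = sum(1 for _ in range(infected * contacts)
                      if rng.random() < p_transmit)
            infected = new
            cases += new
            if infected == 0:
                break
        total_cases += cases
    return total_cases / runs

base = mean_outbreak_size(p_transmit=0.15)   # roughly R = 1.5 per case
worse = mean_outbreak_size(p_transmit=0.25)  # roughly R = 2.5 per case
```

Re-running with the same seed reproduces the ensemble average exactly, while raising the assumed transmission probability visibly inflates it, which is exactly the “change the assumptions, compare the outcomes” exercise the comment describes.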
The article is aimed at lockdown sceptics who have loved it, that’s about it.
Wow, just wow – that is one of the most complacent and ignorant comments about anything that I have ever read, anywhere!
I’m going to hazard a guess that a link to this page is doing the rounds at Imperial, and probably by extension at Imperial’s major funders…
Why? Because there are sensible, thought-out comments critiquing this critique, instead of the usual echo-chamber of agreement?
(David Davis linked this article in his twitter)
Why? Because people are quite sensibly critiquing this critique?
The article was linked on a popular twitter account
No, it was linked from an MP’s twitter account.
And what’s wrong with critiquing the critique? Or do sceptics prefer to stay in their own echo chamber of safety?
Proper sceptics don’t operate in an echo chamber. We prefer to keep the evidence under review, and adjust our point of view if the evidence justifies doing so.
The initial predictions from the model were probably reasonable, given the lack of data at the time. Now, we have loads of data, so the model should be back-cast and adjusted if necessary, e.g. with updated IFR, susceptibility, homogeneity etc. Let’s see that done.
“What is important in these situation is to publish the assumptions and the general nature of the algorithm. And what you are looking for is the changes in predicted outcomes on the basis of changes in assumptions.”
From what I have seen of the code so far, many assumptions are baked in, such as ignoring internal flights.
“It is not about accuracy of numbers at all,”
AT ALL? Surely even you, with your generous assessment of the code, must have an upper limit for how much of a factor the end results can be out by and still regard this as in any way useful code or modelling.
I agree with you, but see my & dr_t arguments below where we actually address her concerns more precisely. The final predicted result depends on the ensemble – and the ensemble result is not affected at all by the minor issue that she discusses. Full explanations given below….
Good domain experts, in this case epidemiologists, make bad systems engineers. Good systems engineers make bad domain experts. Good data scientists who are expert in both data analysis and systems engineering, and can bridge the gap, are few and far between. The results of multiple postgraduates compounding a single problem codebase over decades of varying research requirements can be difficult to explain. But hey, who else has done the research?
I believe most of us knew there was something not quite right about this whole pandemic situation and welcome a transparent public enquiry.
Fascinating and shocking at the same time. I don’t have a problem with taking short cuts or maybe not applying recognised good practice if circumstances justify it AS LONG AS the risks are identified, some attempt has been made to quantify the risks and, most importantly, everybody recognises and acknowledges what is being done and why.
To just carry on and try to convince everybody that “everything is fine”, as seems to be the case here, is totally and utterly unacceptable.
These “secret” pieces of work need to be released for peer review.
This discussion seems to me part of the problem. Regardless of how this model was put together, it is the WRONG MODEL. We don’t need to know the gross number of deaths. What we need to know is the burden of disease from potential C19 deaths compared to the burden of disease, from other causes, created by the anti-C19 measures. At the moment we could be in a situation where the amount of life being lost because routine healthcare is suspended – cancers not being spotted, etc. – exceeds the life being lost to C19. And one life is not the same as another: the remaining life of an 85-year-old is not the same thing as the remaining life of a 25-year-old. It seems to me that this model was used because it was what was available off the shelf, and it has become the subject of fierce debate on various levels, but the fact is that it is the wrong model and it does not help at all in decision-making, e.g. at what point we should ease lockdown. For that you need to estimate the costs of a lockdown, in healthcare outcomes and financial terms in aggregate, compared to its benefits.
A poor review of the code, with non-sequitur, extreme, agenda-driven conclusions. The reviewer seems not qualified to talk about modelling, but might have done a bit of coding – not the same thing.
Minor update: it looks like the request for the original code you reference is no longer being ignored.
It has been closed. Sigh.
Justin’s issue has lasted 9 hours thus far; whilst it has at least attracted a couple of amusing comments, I imagine he’ll be silenced similarly.
Your background as a database programmer has allowed you to point out immaterial problems with the code, but you show zero knowledge of health/population modelling, statistics, microbiology, virology, epidemiology, health-outcomes modelling, bio-mathematics…
Non-deterministic outputs should be expected, as this model is stochastic in nature, and the issue occurs with binary network files, which are platform-dependent. You should expect different results when re-running with identical inputs. You also failed to add that this ‘issue’ (which forms the basis of most of your critique) has been fixed – https://github.com/mrc-ide/covid-sim/pull/121
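One caveat on “you should expect different results”: stochastic need not mean irreproducible. A model driven by a seeded pseudo-random generator should give identical output whenever it is re-run with the same seed, which is exactly what a regression test can check. A toy sketch of the distinction (stdlib only, not covid-sim’s code):

```python
import random

# Toy stochastic model: a seeded random walk. Stochastic in nature,
# yet bit-for-bit reproducible given the same seed -- which is what
# a regression test on such a model verifies.
def stochastic_trajectory(seed, steps=1000):
    rng = random.Random(seed)
    x, path = 0, []
    for _ in range(steps):
        x += rng.choice((-1, 1))   # random step up or down
        path.append(x)
    return path

# Same seed, same trajectory, every time it is re-run:
repeatable = stochastic_trajectory(7) == stochastic_trajectory(7)
```

Different seeds legitimately give different trajectories; the dispute in the article is about runs with the *same* seed and inputs diverging.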
And I would much rather epidemiological research stays with peer-reviewed academic sector, rather than a sector designed purely to create profit.
I think a lot of this is reasonable critique, especially around testing, and I am no fan of “Academic” coding standards – but an objective critique without the clear bias would have been nearly as devastating and more widely accepted.
In particular, I appreciate that multiple threads can improve performance, but in a stochastic simulation it’s quite simplistic to assume you can make full use of parallelisation and retain 100% determinism from your starting seed across all types of hardware – thread scheduling is inherently non-deterministic, after all.
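That said, there is a standard technique for keeping parallel stochastic runs deterministic: give every worker its own generator derived from the master seed and its worker id, so its draws depend only on those two values and never on scheduling order. A minimal sketch of the general idea (my illustration, not covid-sim’s actual fix):

```python
import random

# Per-worker RNG streams: each worker's generator is seeded from
# (master seed, worker id), so its draws are the same no matter in
# what order the workers happen to run.
def worker_draws(master_seed, worker_id, n=5):
    rng = random.Random(f"{master_seed}:{worker_id}")
    return [rng.random() for _ in range(n)]

def parallel_sim(master_seed, n_workers=4, schedule=None):
    """Process workers in any 'schedule' order (standing in for
    arbitrary thread scheduling); return results in canonical order."""
    order = schedule if schedule is not None else list(range(n_workers))
    results = {w: worker_draws(master_seed, w) for w in order}
    return [results[w] for w in range(n_workers)]

# Identical output regardless of the (simulated) scheduling order:
same = parallel_sim(123) == parallel_sim(123, schedule=[3, 1, 0, 2])
```

Determinism then only requires that per-worker results be *combined* in a fixed order, which is an engineering problem, not an impossibility.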
The idea that insurance companies are the only way to look at this kind of problem… riiight, we appear to have strayed well beyond the bounds of objective critique, into politics.
Stick to balanced criticism of the genuinely objectionable aspects of the code and you could have made largely the same point without alienating people who didn’t start with your agenda…
It is not even worth going into so much detail; the truth was there right from the start, on TV for all to see.
The UK in reality tried to go for “herd immunity”, failed, and paid a high price for the risk taken. Two charlatans, unable to tell their elbow from other parts of their anatomy where immunology is concerned, were in charge alongside the PM himself.
They presented a bell-shaped curve for which AT LEAST ALL the DATA on the right-hand side was missing!
Johnson was saying that a lot of people were going to die, etc. etc. It had been decided: “herd immunity”.
Johnson must come clean and sack the “self-proclaimed scientists”.
Nothing was done to help England! Plane after plane from China and Italy landed, making the virus pool wider and deeper! No measures, no prophylaxis, nothing!
The only real experts who deserved the money were in Japan, Taiwan etc.
If any university gave the charlatans-in-chief a degree that even suggests immunology, we should make sure the civil service never employs anybody else from those universities, which would seem to be giving “anybody” a degree in science, and they should both be sacked forthwith!
I agree that we must live with the virus until there is a cure; “herd immunity” should not be forced, it should come as a consequence. At the very least we should stop the external contribution that started it all in the first place – we did nothing to stop it! We should take steps internally to prevent and to trace.
We need to take precautions to reduce contagion. If the infection rate is manageable then we can afford to work and play; a new normal will be developed with as much care as possible, while testing cures and vaccines on a voluntary basis ASAP. The way we started off and the way we are going now are both wrong and unsustainable.
Ferguson was wrong on BSE and Foot & Mouth (I am an ex-dairy farmer), wrong on SARS and Swine Flu – so why did three PMs believe him this time? You have shown the flaws in a detailed way. Unfortunately the media will not pick up on this, and whilst I am a Tory supporter and ex-politician, the Govt will not admit it has egg on its face.
One did not have to be a modeller to see that the numbers were wrong from the outset. One only has to go back to the ONS figures from the 1950s – and to have caught Asian Flu, Hong Kong Flu and a nasty one in 1988 – to see that the excess deaths then were far greater than now, including in 2014/15, when they were 50% greater than Covid-19 deaths.
Why was no risk assessment factored in, as has been the case in the past? Indeed, herd immunity has always worked before, and even Ferguson believes in it, having been caught out and proffered it as his reasoning for flouting the lockdown.
What has this nation turned into – glad I am over 72!!
Spot on – HK Flu in 1968-70 was worse than this (I can’t even remember it happening) but life carried on regardless. No social media crap back then, of course. Amazing to think of the damage done by this fiasco.
The man was a rabid remainer, likely a socialist. I’d say his model worked perfectly.
Good to see you deploying your epidemiological and modelling expertise so effectively. Thank you.
I note in Model.h that an airport state data structure is created, but the comments indicate it is not used, as the authors do not consider internal flights significant for the UK (as opposed to, say, the US).
This seems a mistake, as surely flights would have been a key factor in the early stages of the spread, when the serendipity of the infected encountering the non-infected mattered far more – even for internal flights, which in turn service people landing in the UK from abroad, possibly infected, who then suddenly move hundreds of miles across the country.
They appear to include airports in their ‘places’ data structure, but I’m not clear so far whether this means they pay any attention to the continuing influx of potentially infected people on international flights. Has anyone else had a deeper look at this?
““Stochastic” is just a scientific-sounding word for “random”.”
Do you know the library TensorFlow? Have you tried to produce the same results from a seed?
I don’t have to tell you where it’s being used…
Much like the expose of the HARRY_READ_ME file in the Climategate email leak (not a hack), this sort of article will be ignored and denied by the politicos because it conflicts with their desired narrative.
As a molecular geneticist, what has puzzled me most is the assumption that 100% of the population is capable of being infected. Why assume that everyone has cell surface receptors which the virus can bind to? It makes a huge difference to calculations whether 20% or 50% or 80% are susceptible. It looks as though many or most small children are not. That should ring warning bells.
The USS Theodore Roosevelt, a couple of submarines and a Dutch naval vessel, all with thousands of crew packed together for weeks with coronavirus going through, sleeping in bunks stacked three high – all personnel tested at various points and the figures given are 15-30% infected. Similar story with the Diamond Princess. Only one French case gives a higher figure, about 50%. Perhaps testing errors? None of these “natural experiments” suggested that all adults are susceptible to the virus.
If they cannot catch it, they also cannot make antibodies to it, so current testing plans will not give information about this.
It is a commonplace that there may be a number of genetic variants of a gene, giving rise to small differences in the protein coded for without altering its binding activity for its normal substrate. That is not to say that it would still bind virus, which is not quite the same as its normal substrate. There are other post-transcriptional ways in which changes can be introduced.
Is anyone looking at this? Should be straightforward to identify any of these differences.
Yes, someone is looking at it:
https://www.medrxiv.org/content/10.1101/2020.04.17.20061440v1
I’d happily bet anyone a tenner that susceptibility is around 60% (remainder have prior immunity or not susceptible) and so the epidemic starts to subside after around 20%-30% infected/seropositive.
There are examples of higher rates but these could be “herd immunity overshoot” if they all occurred in one location at the same time. Even in Bergamo I think only 61% seropositive.
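The arithmetic behind that tenner can be sketched. Assuming an illustrative R0 of 2.5 (my assumption, not the commenter’s), the epidemic turns over once the susceptible fraction falls below 1/R0; if only 60% were susceptible to begin with, that happens after roughly 20% of the whole population has been infected:

```python
# Toy herd-immunity threshold arithmetic. R0 = 2.5 is an assumed,
# illustrative value; 60% susceptibility is the commenter's guess.
def herd_immunity_threshold(r0, susceptible_frac=1.0):
    """Fraction of the WHOLE population infected before the effective
    reproduction number R = r0 * (susceptible fraction) falls below 1,
    i.e. before the epidemic starts to subside."""
    return max(0.0, susceptible_frac - 1.0 / r0)

full = herd_immunity_threshold(2.5)          # classic 1 - 1/R0 = 0.6
partial = herd_immunity_threshold(2.5, 0.6)  # 0.6 - 0.4 = 0.2
```

So under these assumptions the 20%–30% figure in the bet is internally consistent; the contested inputs are the susceptibility and R0 values themselves.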
Good link!
Now I know who to blame for Gmail and maps
This sounds similar to the problems in climate models: non-deterministic to the point that they resort to using “ensemble” models, i.e. many runs and averaging. Total dross.
Ferguson’s. As Guido Fawkes describes him: “lockdown, pants down, steps down” Ferguson.
My letter to my MP…
Dear Sir,
Ever since the publication of the Imperial College computer model, for which Neil Ferguson was lead until his resignation yesterday, I and many others have been sceptical of the output predictions that drive the lockdown policy.
It has recently become clearly apparent that the numbers were so far out – by an order of magnitude – that serious concern over the veracity of the model, and over its fitness for driving policy, must have been raised at the core of government, principally by those who are our representatives there, i.e. yourself. Whilst it may well be argued that the model numbers ‘could’ have been true ‘if’ the lockdown hadn’t happened, there is no evidence of this whatsoever. The difference in numbers, and the characteristic rise and fall of virus incidence, can be more rationally explained by either (a) social distancing that did not require a draconian lockdown, or (b) the natural progression of respiratory viruses. It is also noteworthy that the rise-and-fall characteristic, as measured by, among other things, daily deaths, has been consistent across most countries regardless of each country’s response.
The nail in the coffin for the Imperial model, and for Ferguson himself, however, must be yesterday’s publication of an independent code review by a senior and greatly experienced software engineering professional, Sue Denim, available here: https://dailysceptic.org/code-review-of-fergusons-model/. I do not comment on his extra-marital activities, or even his apparent belief that, being above us, he could violate the lockdown. No, it is his absurdly bad computer code, which cannot produce a reliable, repeatable, verifiable or trustworthy result, that is the grave concern.
I have maintained on social media for some time that the whole response to this pandemic has failed to utilise the skill set of one significant and expert section of the population, i.e. PROFESSIONAL ENGINEERS, who in the medical profession are represented by practitioners. The continual “we’re following the *science*” is laughable to this professional discipline, not least because ‘science’ isn’t a thing or a statement but a discovery process, a tool. ‘Science’ does not state anything; it merely gives us hypotheses to test and, when they are tested correctly, trustworthy observed data to analyse and work into the final product, which is what engineers do. These products, which in the political sphere are policies, have to take on board many other considerations: side effects and unintended consequences, human safety, economics, practicality (including human psychology and the ability to produce and execute), and operating with a positive cost-benefit. It is clear that the Imperial model, Ferguson himself, and its acceptance by government to create the lockdown policy were devoid of all this. There were no checks and balances – no ENGINEERS!
The code review of the Ferguson model is totally damning, going so far as to state that any and all published papers that cite or rely on this model must be rescinded. If that is the conclusion on the quality of this model, then the resulting lockdown policy that relied on it must also be rescinded. The lockdown must end before further economic and health damage is done, which will be far greater than that from the virus itself.
Whilst the government opposition is the normal body to hold Government to account, this is the time when ALL MPs, including those of the governing party must do this. A Government’s first duty is to defend its citizens, so Parliament’s first duty must be to ensure the Government does that. This is your duty! I trust you will act.
Yours sincerely
Can you add my comment that many also disagree with you and are very sceptical indeed about Ms Denim’s critique of the code, focusing as it does on a bug that cannot materially affect the results of the calculations?
GIGO, Garbage In, Garbage Out, it already well known, but what we have here is Garbage Into a Garbage System resulting in Garbage² Out.
The government keep saying they are “following the science”.
It has been known for seventeen years that recovered victims of SARS-type coronaviruses DO NOT GET LONG-TERM IMMUNITY. It is likely that immunity will last only a few months at best (ask Professor Sarah Gilbert – the Prof leading the vaccine team in Oxford).
So which scientist on SAGE initially advised the government to follow a “herd immunity” strategy for a virus which does not give its recovered victims long-term immunity? This smacks of schoolboy science from politicians who did not understand the above issue, or who wished to ignore the problem until it became obvious that a “herd immunity” strategy would kill many thousands. This dithering and change of strategy caused rampant infection, and the delayed lockdown has now led to the deaths of over 30,000 people in the UK and probably wrecked the economy.
SO WHICH SCIENTIST ADVISED THE GOVERNMENT TO FOLLOW A HERD IMMUNITY STRATEGY – OR WAS IT FORCED ON SAGE?
Now we find out the “scientific predictions” from the epidemiologists were based on unreliable programming. Are we really surprised?
Supply-line and military science says that in emergencies you keep supply lines short and use local resources. The frontline staff don’t get PPE. The PPE delivered isn’t of adequate quality – but no one checked before loading it onto a plane in Turkey.
Matt Hancock still insists on having a central NHS purchasing and distribution strategy and competing for PPE internationally rather than shortening supply lines by letting local NHS Trusts order PPE from local suppliers where quality issues can be dealt with quickly and deliveries can be made daily – without involving military aircraft. So much for following science.
The hunt for the guilty accelerates – where are those SAGE transcripts?
“It is likely that immunity will only last a few months at best (ask Professor Sarah Gilbert – the Prof leading the vaccine team in Oxford”
There again, she would say that, wouldn’t she?
Unlike most of those who comment on the code for the Imperial College model, I can say that I have been there, done that and got the t-shirt, i.e. I have created models for academic work that have become the subject of intense political controversy. The comments by Sue Denim are based on a substantial amount of hindsight and expectations that are unrealistic for academic teams who do not have access to the resources necessary to meet the best coding standards and are often under extreme pressure to generate results quickly. I have little doubt that every model that I have produced could have been coded better, but that is really not the point with 99% of models. We should remember the aphorism that “all models are wrong, but some of them are useful”.
Nonetheless, there are strange features of the Imperial College model. No-one that I know would have coded a model of this kind in C++ at any point in the last three decades. Most academics would use Matlab, Python or a large variety of high level packages/languages – according to taste and age. Using C++ (or, for the older of us, Fortran) is an open invitation to bugs, memory leaks, buffer overwriting, etc. which lead to the “random” results highlighted by Sue Denim. Of course, the model may also have been deliberately stochastic – i.e. it may rely on random number generators to derive a distribution of outcomes – but there has been little attention paid to the stochastic features of the results and in any case there are much better ways of doing this than writing C++ code.
What this review highlights is the complete failure of bureaucrats and politicians to go through a reasonable system to test the results of such models when and if they rely on them. There are many other epidemiological models around and the big failure seems to have been to rely heavily on one set of results without trying, even in a short period of time, to develop a consensus about broad conclusions rather than detailed numbers. Errors are not especially important if there is broad agreement across modelling groups.
The real problem is that no such consensus exists; this decision was made on political grounds and in a panic. Personally, I think Neil Ferguson was foolish to allow himself to become the focus of the supposed “scientific” advice underlying a political decision. Intense media attention is both seductive and fickle. The lesson to learn now is that future policies must be based on a broader discussion of both epidemiology and policy options. There is, now, a huge amount of evidence from around the world that is largely being ignored by those who seem more concerned to defend what was done and rather less to work out reasonable trade-offs between health and economic outcomes.
Actually, I still code in C++. In fact, anyone writing core algorithms that would otherwise be slow still writes them in C++ for efficiency reasons. If you are using Python to call machine-learning algorithms, for example, you are still calling some C/C++ code under the hood for the basic algorithm itself.
Whilst everything Anonymous has said about C/C++ is true, and whilst there are much better languages out there for this sort of thing, I also agree with earthflattener – the only times I’ve ever used C++ have been when the code had to be efficient, and in this case that was probably a key requirement. Running Ferguson’s model on my reasonably fast modern PC in multi-threaded mode takes about 20 minutes. Fifteen years ago, if you wanted to do multiple runs of this model (which you’d have to whilst tweaking it), writing it in Python was probably a non-starter. Also, you’d run out of memory.
On the one hand, this is a thoughtful review of Ferguson’s code. On the other hand, how will defunding academic epidemiology pay for managers? Wtf is wrong with you? Has it occurred to you that this code is so bad because there’s no money in epi modeling? You can’t insure for things like pandemics because the risk is highly correlated so any insurer would fail if the event happened — so there’s no insurance money in infectious disease research, which means that if the government doesn’t fund it, nobody will.
For context, I’m finishing up a Ph.D. at MIT and I looked at jobs in England. Their academic salaries are so low, at first I thought I was reading the job postings wrong. Should we be surprised that decades of underfunding a discipline that’s mostly a public good means that now, it’s not very good?
That said, I’m totally on board with the point about credentialism. I’d love to see more industry-academy-government collaboration. I’d love to see more and better open-source efforts. Insurance, though? Really?
There’s a discussion of private sector epidemic models in the natcat insurance industry here:
https://www.reddit.com/r/LockdownSkepticism/comments/gesrvr/code_review_of_fergusons_model/fprd6da/
Epidemics are frequently local (FMD 2001, Ebola, SARS etc) so I’m not sure why they’d be uninsurable compared to other types of natural disaster, especially as they don’t tend to destroy physical assets.
Insurance companies sell life and health insurance, and those do cover illness from epidemics, so they have a strong profit motive to get their predictions right. Right now, just for cash flow predictions, a life insurance company should be trying to figure out how many excess deaths there will be in each month of 2020 and 2021. I’m sure they’re at work on that now. I don’t know whether they’d be willing to let other insurance companies see their results, though— it’s valuable proprietary information.
The “Long March Through the Institutions” continues – why listen to people who know what they’re doing, can do it, and are the ones actually doing it, when instead you can stuff the media, academia, the civil service and public services with your lefty mates who know nothing beyond being good Party commissars?
And so we end up where we are: the media “holding government to account” with left-wing political activists time after time, without any transparency about their affiliations of course, and a civil service who will keep hiring the likes of Ferguson simply because he is an academic and a Remainer, the latter being the important part.
And so on. Our entire public sector is broken, destroyed from within, whether that’s Ferguson directing policy, or Diversity Managers trousering all the cash we throw at the NHS, or policemen arresting you for the “hate crime” of criticising grooming gangs going around raping children.
It is necessary to change the incentives. All publicly funded science should be published in full, or the public gets its money back.
Public policy must be based entirely on fully published science- no code or no data and the “science” gets rejected.
Scientists will then employ the experts necessary to make their work credible.
A few points:
Issue 116 has been fixed. See here: https://github.com/mrc-ide/covid-sim/pull/121. It’s not much of a gotcha when code is thrown open to the wider community, bugs are spotted, and fixes are found. That’s how Github is supposed to work. It does call Ms Denim’s good faith into question that she doesn’t mention that the error she devotes so much of her post to was found and repaired to the evident satisfaction of the Edinburgh red team (read the thread I linked to).
“This sort of work is best done by the insurance sector” has to be some kind of joke, right? That would be the same insurance sector which hasn’t modelled this risk and wouldn’t sell contracts for it? Or is there some other insurance sector that Ms Denim has in mind? If there’s an actuary out there with a better model, why didn’t they offer it to the government? Where are they now?
Finally, it’s worth putting this in a global context. Germany didn’t shut down because of the Imperial model, nor did France, or Italy or (fill in the blank). At most you could say that the Imperial model gave our government cover to do what everyone else was already doing without losing too much face.
This is an excellent analysis and illustrates problems that are endemic throughout academic computational modelling. A research group will have a code that has been worked on over a number of years. Maybe the group leader/principal investigator even worked on it to start with, but now rarely touches it. Instead the PI needs funding to support the group, so they write a grant application that says we’re going to use our code to investigate x, y, z. They get funded and employ a postdoc – someone with a background in the relevant area of science but not a professional engineer. The postdoc spends a couple of years hacking the features they need into the code base so they can generate results that look plausible for the research grant topic. Then they write one or more papers describing the results.
The process is repeated and the code base grows, with different PhDs/post docs adding features here and there, but with no overarching architectural considerations of how the software is put together, and no good engineering practices – code reviews, rigorous testing etc.
No doubt the funding body and the research group will pay lip service to following good engineering methodologies, but no one looks too closely. The only people they have to convince – paper and grant application peer reviewers – are playing the same game too. These days a journal might make authors submit their code with their paper, but it’s unlikely anyone will investigate it, and even if they did it’s unlikely they could tell what the code was supposed to be doing, because it is so badly put together.
The people involved have no incentive to produce robust code; the only requirement is to generate results that can get published and support the next grant application/REF. So the quality of the code is an afterthought as long as the results look reasonable. No one has an interest in rocking the boat – this is how academic software is developed and how academic careers are built.
Right, because an academic group could easily pay a software developer (when did people start calling programmers engineers? As if making a website to sell people pants is comparable to designing supersonic bombers. Please.) if they wanted to. No – their salaries are garbage, and the PhD students have to hustle to write a thesis and can’t waste time cleaning code.
If governments want better results in this sector, they need to be willing to pay for it, not blame people they put in the Kafkaesque nightmare hell scape that is modern academia.
That’s not really the point. Engineers aren’t paid to develop the software because no one involved has a motivation to produce high-quality software. Everyone involved is in a constant cycle of chasing publications/grant funding. It’s not a problem for government to solve either – academia needs to take software development much more seriously (rather than just seeing software as a means to quickly producing publishable results).
Now Microsoft is refactoring the code – yet another Gates-affiliated organisation that can exert influence over global politics to his agenda. Enough is enough. https://github.com/mrc-ide/covid-sim/issues/163#issuecomment-625158780
Microsoft… Gates
Of course! Never spotted that.
Did my point just get taken down? Can I know why? Was it cause I questioned the conspiracy type argument above? Or that I mentioned Breitbart?
I’ve checked the database and we don’t have any comment mentioning Breitbart apart from the one I’m now replying to. I don’t know what happened but you’re welcome to re-post it.
You mean some sort of conspiracy to silence you?
I am not scientifically educated, but I used to read source code for a living in software houses and so I took a look at this code when it was first released.
Some thoughts.
It is no surprise that software produced in a university is amateurish. Even in a small software house, things like documentation (my speciality) and testing are separate operations from actually writing the code. University departments do not have the resources to create professional software.
I would certainly have missed the issues about threading, seeding and reproducibility. Few people have what it takes to get to the bottom of a piece of software and fully grasp the underlying algorithms. The art is to identify the people who know what they are talking about and listen to them.
My rule is this. Always listen to those who want to understand the world, not those who want to change it. Look for the people who want to be right and want to talk about their views. People who “would love to discuss it some time” and “obviously can’t go into the details now” generally do not know what they are talking about, and are often found in administrative and advisory posts.
There is no surer sign of the “we doctors” attitude than references to “The Science” as a corpus of knowledge rather than a body of problems to be investigated. People who talk about The Science want you to suppose that scientists never disagree and to treat them as an authority. They do not want to be understood, they want to be obeyed.
However, the fact that Neil Ferguson is a second rate programmer does not mean that the crisis is a scare. Many scientifically knowledgeable people think it is very serious, and the more scientific the more so – Matt Ridley, Christopher Monckton and Greg Cochran are all people who want to be right and think that the lockdown is the right response. I do not think we are out of the woods yet.
Isn’t mentioning Google employment a form of credentialism?
It’s a pile of dreck, and everything bad you have to say about it is probably understated.
I’m guessing that they are casting byte arrays into integers, which works on some architectures and doesn’t work on others.
The architecture issue is described in the Github comments. No mention of casting bytes to ints
I refer to the problem of adding new code to old code in hopes of pasting over ongoing problems “code wadding”, like creating a wad of chewing gum by adding new bits to it from time to time. Eventually, you have a big wad of code that nobody understands, doesn’t necessarily do what it’s supposed to do, but it has become so fragile that any changes can render it all unstable.
About this comment:
“Clearly, the documentation wants us to think that, given a starting seed, the model will always produce the same results”
This is a bad conclusion: the whole point is that, since the model produces variations on each run, you have to run it multiple times and average the results. This is precisely what Monte Carlo is for… It is the reason why, for example, photorealistic renders take so long – each pixel needs to be calculated over and over again, producing a different value each time, and averaging the results converges to the correct value.
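A minimal sketch of the averaging argument above, using a toy stand-in for a stochastic run (the model and numbers here are invented for illustration, not taken from the Imperial code):

```python
import random

def noisy_estimate(seed):
    """One stochastic 'run': a noisy estimate of a true value of 10.0."""
    rng = random.Random(seed)
    return 10.0 + rng.gauss(0.0, 2.0)

def monte_carlo_mean(n_runs, base_seed=0):
    """Average n_runs independent runs; the ensemble mean converges on 10.0."""
    results = [noisy_estimate(base_seed + i) for i in range(n_runs)]
    return sum(results) / n_runs

# Individual runs scatter with standard deviation 2, but the ensemble
# mean tightens as 2/sqrt(n): more runs, less noise.
print(abs(monte_carlo_mean(10_000) - 10.0))
```

This is exactly the sense in which run-to-run variation is a feature, not a bug, of a Monte Carlo method – provided (and this is the sceptics’ point) the variation comes from intended randomness rather than race conditions.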
I’ve read a lot of the comments below. I think we seem to have lost sight of the main issue.
Given his track record – previous simulations that were wildly inaccurate, to say the least – who made the judgement that he should have input this time?
There are several very interesting contributions on this matter on Unheard and Quillette. When I get back to my laptop I will try to edit in some links.
Here are the links I mentioned.
https://quillette.com/2020/04/29/podcast-88-jonathan-kay-on-covid-superspreaders/
https://unherd.com/thepost/nobel-prize-winning-scientist-the-covid-19-epidemic-was-never-exponential/
https://unherd.com/thepost/coming-up-epidemiologist-prof-johan-giesecke-shares-lessons-from-sweden/
The infamous Imperial College worst case projection of 2.2 million U.S. deaths was simply the arithmetic of assuming 81% get infected and the infected fatality rate is 0.9%. But neither assumed datum is remotely credible. Only 28% were infected during the 1918-19 Spanish flu, for example. The 2.2 million death figure did not just assume governments “do nothing.” It assumed people make no effort to protect themselves or others – they keep doing just as many hand shakes, hugs and swing dances; there would be no more hand washing, no less subway traffic, no purchases of hand sanitizer, Clorox wipes or face masks. Despite all that, the U.S. media immediately took the model’s typically sensational estimate very seriously. https://www.cato.org/blog/how-one-model-simulated-22-million-us-deaths-covid-19
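The arithmetic the comment describes can be reproduced in a couple of lines (the population figure below is a rough approximation added for illustration; the report’s exact inputs differ slightly):

```python
# Back-of-envelope reconstruction of the headline projection:
# deaths = population x assumed attack rate x assumed infection fatality rate.
us_population = 330_000_000   # rough 2020 US population (approximation)
attack_rate = 0.81            # assumed fraction of the population infected
ifr = 0.009                   # assumed infection fatality rate
deaths = us_population * attack_rate * ifr
print(f"{deaths:,.0f} projected deaths")  # on the order of the ~2.2M headline
```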
This sounds similar to East Anglia’s crude and sloppy global warming coding that was revealed in Climategate. A comparison post might be helpful.
In the financial services industry the benchmark standard for model governance is a set of guidelines published by the Fed in 2011, known as SR 11-7 (https://www.federalreserve.gov/supervisionreg/srletters/sr1107a1.pdf#page16)
Other regulators have similar expectations.
The Bank of England and PRA would probably immediately instruct a bank attempting to use a model with these deficiencies to cease using it and to remedy all the deficiencies, potentially with penalties for the failure to adhere to model governance expectations.
If the standards for banks (where the risks of failure are not comparable) are so high, it is hard to understand how the standards for a model used to make fundamental national decisions are so low.
Incidentally, I strongly suspect that exactly the same is true of models predicting the climate change impact of CO2.
The IPCC agrees with you
IPCC 2001,AR3 Paragraph 5 section 14.2.2.2:
“In climate research and modelling we need to recognise that we are dealing with a non-linear and chaotic coupled system and therefore long-term predictions of future climate states are impossible… “
I’m not sure I understand the nature of your criticism about a quantity that is both input and output. A model needs initial conditions; when these are not completely known, a suite of simulations using a range of realistic initial conditions have to be performed. Now if small changes in the initial state of the system produce large changes in the end state, then it’s a problem that can either be a numerical instability in the code, or that the model itself is chaotic. Is that what you’re concerned about or just the general way initial conditions are implemented? Btw, I haven’t looked at the code.
Concerning issue #30: surely this is a compiler problem?
Utterly ridonculous.
Without regression tests, you can’t know that the new code you’ve introduced hasn’t broken something else. So you run the risk, even when you’ve made the most minor of changes and have added a unit test for the new piece of code, of breaking some other untested portion of the code base. And if it still runs without noticeable error, you may never know what you broke until much later.
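The regression-test idea above can be sketched in a few lines. The `attack_rate` function here is a made-up toy formula, not anything from the Imperial code; the point is the pattern of pinning known outputs so that later edits can’t silently change them:

```python
def attack_rate(r0, susceptible_fraction):
    """Toy final-size-style formula whose output we want to pin down."""
    x = r0 * susceptible_fraction
    return 0.0 if x <= 1.0 else 1.0 - 1.0 / x

def test_below_threshold_no_epidemic():
    # If the effective reproduction number is below 1, the toy model
    # must predict no epidemic at all.
    assert attack_rate(0.9, 1.0) == 0.0

def test_golden_value():
    # "Golden" output recorded from a trusted earlier run; any future
    # refactor that changes this number is flagged immediately.
    assert abs(attack_rate(2.0, 1.0) - 0.5) < 1e-12

test_below_threshold_no_epidemic()
test_golden_value()
print("regression tests passed")
```

Run before and after every change: if a previously recorded value shifts, you know at once that the change did more than you intended.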
15k lines in a single file? I highly doubt you could code any sort of reasonable model intended to predict the spread of a virus in fewer than 500k lines of code. The number of things that would have to be accounted for is mind-boggling, including the climate for a particular region and the season. What about virus characteristics – its virulence, means of transmission, etc.? My guess is that the game ‘Pandemic’ has a working, tested and infinitely more robust engine than this obvious abortion, one that could be used as the basis for something useful, assuming they’d be willing to share their copyrighted code. But I bet that the open source community could create an engine for this sort of use that would tell its users with far more confidence the possible effects of any pandemic. Just the variables are an entire project on their own.
That’s what you have to start with on a project like this. What are the variables and how do we then use those data points to plug into the actions that would test their effects.
This is a guy coding in his basement who has little to no inputs and data points with a bunch of code spaghetti making assumptions that came out of his addled brain.
This just goes to show you that the epithet “science denier” should be applied to the vast majority of politically-employed scientists.
By the way, this was a brilliant analysis. It kept to the most glaring and obvious problems that exposes Ferguson’s proprietary software’s complete uselessness. Was Bill Gates involved even in an indirect way in trying to rehabilitate this piece of code? It sounds a lot like Microsoft’s early software…..
It’s useless to talk about non-regression testing, people who do that kind of code simply don’t have the real-world experience to do tests. It’s academia code, and it’s very widespread across many fields. It is not usually needed in any other context than academia. Who wants a production-ready epidemiology prediction code in normal circumstances anyway… You should read the comment I wrote (when it will be moderated) ^^^
What you’re saying is that academia has no interest in producing working code. Any piece of code that doesn’t have regression tests cannot be worked on by other people and it can’t be trusted to be properly designed either. The fact that it spits out numbers is good enough. Without regression tests, you can’t even be sure that there is a proper expectation that it does what it was designed to do.
I want to make another point about academic code. These days many people are using Python for code development – especially in machine learning. This is used pretty much everywhere: commercial, academic and even critical code. Scikit-learn is one of the most commonly used libraries, and the same goes for R machine learning code. Big chunks of our modern life run on code calling these and similar libraries. Most of it is written by academics, with the deep code being in C, C++ or Cython.
The criticism of the code made here applies in full to pretty much any code which calls these types of libraries – i.e. pretty much everything written these days. I have looked deeply at some R code for some of the machine learning algos. It is not pretty, but it works just fine.
Again, I’ll note that the main bug that Sue refers to is not of any consequence for a user of the code, because it is an ensemble method (see my main points below for detailed analysis)
“Undocumented equations. Much of the code consists of formulas for which no purpose is given. John Carmack (a legendary video-game programmer) surmised that some of the code might have been automatically translated from FORTRAN some years ago.”
FORTRAN? Respectfully, the last time I coded in FORTRAN was in the 80s. Is it still used?
See my comment up there ^^^
Modern Fortran is really rather cool, though admittedly not much used. However, good working code remains good working code. NASA and others still rely on chunks of Fortran code written in the 70s. My company still has parts of its fluid flow simulation code in Fortran. If an algorithm is simple, clean, bug-free and not likely to need revisiting, why risk a change? Fortran is still very fast.
That said, I haven’t used it since the 80s either.
Haha, I’ve been downvoted by someone on this. What? Are there a bunch of Fortran haters lurking in the shadows? “I hate that language and spit upon anyone who uses it”
So, in one sentence, can you state what this means to the average person in the street (or not in the street, given the isolation rules)?
I’m a numerical analyst, and in addition to the coding issues, the fact you get different results on different computers suggests to me that the algorithms that are being used for the calculations are ill-conditioned, with the result that rounding error is being turned into “data”. Another example of amateur work.
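The classic textbook illustration of the numerical analyst’s point – an ill-conditioned formulation turning rounding error into “data” – is catastrophic cancellation. The example below is generic, not drawn from the Imperial code:

```python
import math

# Two algebraically identical ways of computing sqrt(x+1) - sqrt(x).
# For large x the naive form subtracts two nearly equal numbers, so
# almost every significant digit cancels and what remains is mostly
# rounding error; the rearranged form avoids the subtraction entirely.

def naive(x):
    return math.sqrt(x + 1) - math.sqrt(x)

def stable(x):
    # sqrt(x+1) - sqrt(x) == 1 / (sqrt(x+1) + sqrt(x)) exactly.
    return 1.0 / (math.sqrt(x + 1) + math.sqrt(x))

x = 1e15
print(naive(x))   # only a digit or two of this is trustworthy
print(stable(x))  # correct to full double precision, about 1.58e-8
```

When an entire simulation is built from formulations like `naive`, different compilers and architectures round differently and the outputs diverge – which is one plausible reading of the platform-dependent results reported on GitHub.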
The problem with all “academic” work is that it is never properly tested at the detail level. Academics get promoted and rewarded according to the numbers of papers and citations they get. Publishing “interesting” results is what is important. Normally this doesn’t matter much because it has no impact on the real world, where what someone does has to work, and produce income.
The last people who should be let near the real world are such people – they are always sure they are right, and emotionally and overly narrowly committed to what they do, considering it supremely important, and their opinions matter far more than those of people in the real world.
Over 30 years ago, I used to be a scientist providing, testing and reviewing the work of academics for possible use by HMG, and a lot of it on inspection was as bad as Ferguson’s. But the civil service then weeded that stuff out, and it got nowhere near the top-level decision-makers.
I believe you have probably done ‘a course’ in numerical analysis…not so sure about all the rest.
If it hasn’t been mentioned yet (heavily commented thread!)
Sue Denim’s issue: https://github.com/mrc-ide/covid-sim/issues/144
was closed as I commented earlier, with “mrc-ide locked and limited conversation to collaborators”
Justin’s issue: https://github.com/mrc-ide/covid-sim/issues/165
lasted a mere 12 hours before being closed as “mrc-ide locked as too heated and limited conversation to collaborators”
A mere observation, but in contrast to every other issue, one notices these two seem somewhat unique in eliciting the result of being closed/locked/limited to collaborators.
A request to re-open issue 144 was made in issue 179 https://github.com/mrc-ide/covid-sim/issues/179
Actually, as a correction to my wording:
issue 144 is the issue *referenced* in “Sue Denim’s” review, not “Sue Denim’s” own issue. Not that it matters all that much for “Sue Denim”.
What a useless post.
What kind of Google engineer doesn’t know how stochastic simulation works?
And then to make a public post showing off your ignorance. Pathetic.
What was the point of this useless post? To show off your ignorance?
Looks like it.
Pathetic.
This reminds me a lot of the kind of work that is done by grad students in other fields that use C++ even though they don’t have any experience with it. It’s a common thing in academia, in fields other than software engineering.
How it usually works is that some older professor starts the code in Fortran in the 80s–90s, so that it’s one big dump of code.
Then some grad students eventually translate it into C/C++. Since they are not familiar with it, they don’t know about versioning – they don’t use SVN or Git, and no one stays long enough on the code to ever bother setting it up anyway. They write papers and theses based on stuff they added to the code (which is not in a repo, remember) and it gets published easily because of the professor’s reputation and that of the original code.
One example I have is medical physics. Grad students come straight out of their B.Sc., don’t know C++, but are still asked to come up with radiation dose computation algorithms. I agree that it’s not usually a problem, because over time they learn, their code gets reviewed eventually, and it takes a few years before anything is actually used in a clinical setting.
It makes me think that the first version was legit, but then other grads (with no experience) came in and used it to add features so that they can write their Thesis. And then this code, in as shitty condition as it is, is urgently needed to compute predictions about an ongoing pandemic… and then everything you just said ensued.
Oh, and yeah, think about it… under normal circumstances who needs an epidemiology algorithm that is production-ready? No one in his right mind thought it would someday be urgently needed. None of the computer-engineering-y guys here would have ever wanted to do a cleanup of this mess anyway. And that’s the reality of things. People are focused on doing web-based server-client Ruby fast-delivery HTML5 software-engineering bullshit… And this is coming from an actual scientific software developer who does automated defect recognition in x-ray inspection of cast aluminum parts – and the medical physics example is me.
It gets me really pissed to see all the comments that say use this or that production code technique. This kind of code never leaves the 8th basement of the university, where it’s kept by some post-doc dude who has too much to do in the actual lab to take the time to make all those improvements. And then suddenly the world needs it. I think some people here need to take a good hard look at themselves and ask why this is happening. Welp, because theoretical fields like this are under-funded, that’s why.
And by under-funded I mean 15-20k per year in research grants to grad students… those who get them. I’ve been there, I know how it is. Having to do 40h a week of stupidly hard research work and still wondering if you’ll be able to pay rent the next month. People that make that kind of money don’t give two shits about the code. While all those undergrad computer science guys get full salary paid internships. And if they get their Ph.D. there’s no job for them in the market anyway… so they’ll end up being in the academia limbo forever. I suggest you look at the movie made by Ph.D. comics… it shows a bit what it’s like. Because it’s not that funny in real life. http://phdcomics.com/
Partly true…well definitely true that it’s underfunded. But the point is, if you look at the criticisms of the code that are made in the article, they don’t hold water. The bug she talks so much about doesn’t affect the final result (it’s an ensemble algorithm!). As to the rest, sure it’s probably spaghetti code, but that is pretty common. If you lift the hood and look at the ML algorithms in R or scikit-learn, they are not exactly state of the art (e.g Random Forest in R), but they are very solid code.
Curiously, I did something a bit similar to you as an application of the algos I was developing for my PhD – looking at segregation in steel, but at a relatively small scale, using electron microscopy back in the late 80s (in Fortran!). The ropy code that I wrote was still used in the French steel industry 15 years later!
That’s exactly what I’m talking about! This kind of stuff is everywhere in academia, and suddenly people start to throw shit at one piece of it because it “led” to the lockdown. I am personally used to using Geant4 for radiation physics and it’s almost the same thing! It has been going for yyeeeaaarrsss, and was eventually cleaned up because someday some company decided to use it. https://geant4.web.cern.ch/
This thing is being widely used by people who build nuclear power plants! I don’t see anyone having seizures on the floor because of it…
Your first sentence might possibly be an excuse, but Ferguson has used his models to produce wild over-estimates of the impact of epidemics before. Hence no excuse whatever.
My name is Gary McInally, and I have spent more than 20 years developing models within the insurance industry. I’m part of a C-19 initiative led by members of the Institute and Faculty of Actuaries and would welcome the opportunity to talk to the author in confidence. To be clear, I’m not a lockdown sceptic, but I am keen to improve understanding of the extent to which given epidemiological models are fit for purpose, and the communication of their limitations.
Why don’t you just put your thoughts here? It might help educate the rest of us.
Sue Denim states that they are a computer programmer by profession. I am an astrophysicist. Neither of us is an epidemiologist, but I thought I would share my thoughts regarding their dislike of the Imperial code and their apparent distaste for academia in general (‘academic epidemiology [should] be defunded’).
We (academics) *know* that we write bad code, or at least code that would probably not be up to the standards of Google, etc. Admittedly I am not trying to predict death rates from a pandemic, but there is plenty of code that I have written over the years that would be pretty incomprehensible if released to the public.
Why is this? Time and money. The author themselves stated that insurers have managers and professional software engineers to ensure model software is properly tested and understandable, which academic efforts don’t. Academics would love to be able to employ a professional software engineer to work with them and make sure their code is up to scratch. Occasionally someone does manage to scrounge together the funds to do so, but most groups simply do not have the money to hire a professional software engineer. Academia is a constant game of trying to spread the resources you have as far as they will go.
Many academics are in fact aware of how to write better code, but because they are constantly under pressure to write the next paper or the next grant proposal, they often don’t have the time and just get something written quickly for the specific task at hand. Then when they find themselves in the same situation later they remember they had written something similar previously and use that older code as the basis for a new one. This continues – often through many different graduate students or postdoctoral researchers on short-term contracts – until you end up with a Heath Robinson monstrosity that everyone is too afraid to try and re-write from scratch because of the massive amount of time (with no big publication to your credit) it would take.
By all means encourage government to increase science funding and require that a large coding project employs a professional software developer, but if you just gave the money that academic epidemiologists have used to do their work to the insurance industry and asked them to do the same job, but producing better code, they would laugh at you.
The exact same point I made, I have a bachelor’s in physics also
I’m a stochastic modeller, working in industry with high-quality professional programmers on my team. It is true that they take my spaghetti code and turn it into something good… but, and it’s a very important but, my code still remains the numerical gold standard. If their version gives a different answer, it is presumed wrong. Not to say they don’t find the very occasional bug in my code :( – but 99.5% of the time they will have made the mistake, because they don’t understand the maths. So for code that is used only by experts, there is little need to go through the whole professional software route, which is costly and slow. It is critical for commercial code that needs long-term maintenance, platform independence etc., but that is a very different animal.
Incidentally, this code was 10,000 lines long. Big, but hardly a giant. A grad student could have re-written it in a couple of months – something we have done in academic departments in my discipline, where some people do PhDs that are a sort of crossover between computer science and domain science.
I do not believe this writer has the knowledge or capability to properly criticise Imperial’s work.
This article does not read as if it is written by a genuine high level programmer. Instead, this article undoubtedly serves a particular political agenda.
Funny how often it is that people shouting about their scepticism show no scepticism at all over something which purports to support their own prejudices.
All the complaints about the seed read like someone not at all familiar with the law of large numbers or the central limit theorem. What they did is poor form but not statistically incorrect if they run the model many times – it just means that they didn’t set a seed. But the results from one person’s population of random seeds will converge in distribution to those from another person’s, which was the exercise the authors were running.
Exactly right. I’ve been making this point earlier on the thread.
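The seed-population point can be sketched with a toy stand-in model (everything below is invented for illustration – it is not the Imperial simulation):

```python
import random
import statistics

def run_model(seed):
    """Stand-in for one stochastic simulation run (hypothetical toy model)."""
    rng = random.Random(seed)
    return sum(rng.random() for _ in range(100))  # expected value 50

def ensemble_summary(seeds):
    """The quantity a modeller actually reports: the ensemble mean."""
    return statistics.mean(run_model(s) for s in seeds)

# Two analysts with completely different seed populations get summaries
# that agree closely, even though no individual run is reproducible
# across the two sets:
a = ensemble_summary(range(0, 5_000))
b = ensemble_summary(range(10_000, 15_000))
print(abs(a - b))  # small: the ensembles agree in distribution
```

Whether that defence applies to the Imperial code depends, of course, on the variation being intended randomness rather than threading bugs – which is precisely what the GitHub issues dispute.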
URL Link for “the model attempts to simulate the impact of digital contact tracing apps” is wrong. Please fix it. Thanks for your effort. This helps me so much.
This is a pretty naive critique, although I agree with it in part. The implicit and explicit conclusions are pretty stupid, however.
Firstly, the root of the matter is that epidemiological forecasting is a speculative and incomplete area. Nobody knows how to do it. We don’t hire epidemiological modellers to forecast pandemics, we hire them to figure out how to forecast pandemics. The fundamental issue is that this discipline is unplugged from reality as there are no controlled experiments, and the experiments that we do get (such as COVID-19) occur very infrequently. I suspect that the researchers panicked because they knew, qualitatively, that the window of time to prevent catastrophe was very limited. If COVID-19 had been far more deadly, the author of this post would probably be condemning them for their silence, if this puerile treatise is anything to go by.
So, the scientific literature in the area does not consist of any Newton’s laws of epidemiology; instead it consists of many totally speculative, spaghetti on the wall attempts at figuring out epidemiological modelling. The code is not a completed product that produces truth, because it has never been tested against reality until now.
As there is no known way to forecast epidemics, industry would not really be able to do a significantly better job; insurers are wiped out all the time when freak events such as this one occur. Industry relies upon science to develop the fundamentals, then works on professional implementations.
Academics have very thin resources and work in a pretty insane environment. Only the lucky ones get to hire permanent staff programmers; the rest of them rely upon trainees (PhD students) who leave the group after three years. They spend all day writing grants. Ferguson may have been put in a situation where the incentives to keep his grants incoming made him make judgment calls with limited resources, and he probably had to gamble that his spaghetti code (which likely started its life 20-30 years ago) worked well enough that he would be better off devoting three years of a grad student’s life to adding features. If a grad student spent three years fixing the code, they would not be able to obtain a thesis.
Now, Ferguson seems like a jackass, and may have moronically hustled his models beyond their actual value. If I worked in that field, I would probably have avoided publishing on COVID; he looks like a bandwagon academic. But we need to bear in mind that the system is set up to produce bandwagon academics and to reward them.
Finally, scientists studying epidemiology are clearly important, and the idea of de-funding all of them is ludicrously stupid. The issue is that the system is dysfunctional and littered with perverse incentives. Moreover, the public does not seem to understand what the papers represent in a field such as this one which has no real experiments. If you spend millions or billions of dollars hiring people to do something impossible and unplugged from reality, you’re going to get bad results.
Don’t throw the baby out with the bathwater.
I had time to look at the code… conclusion from a real scientific developer: it’s a chaotic system, just like weather prediction. A rounding error in one of these algorithms can drastically change the outcome; that’s why you run multiple simulations, in order to get a window of outcomes. I’ll copy what I wrote on the GitHub so that you guys (and the author of this) know what I’m talking about.
No one understands the code I write – even senior architects and developers, because they’re not nuclear physicists – and that’s what’s happening here. I have already had, on multiple occasions, to explain to the computer science guys who integrate my work why it might not always be 100% repeatable. Do you know anything about chaos theory or non-linear algebra? When there are so many variables in the code, even the slightest rounding error in a double can drastically change the outcome of the simulation. I’ll leave this here for you guys – take a good look at it. It’s why meteorological predictions are bad. https://en.wikipedia.org/wiki/Chaos_theory
and especially this one: https://en.wikipedia.org/wiki/Butterfly_effect
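To make the point concrete, here is a toy Python sketch (nothing to do with the Imperial code itself) using the logistic map, the textbook chaotic system: two runs that differ by a single rounding error in the starting value end up completely different.

```python
# The logistic map x_{n+1} = r * x_n * (1 - x_n) is chaotic at r = 4.
# Start two trajectories one rounding error apart and watch them diverge.
def logistic_trajectory(x0, r=4.0, steps=80):
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

a = logistic_trajectory(0.2)
b = logistic_trajectory(0.2 + 1e-12)  # perturbation at double-rounding scale
gap = max(abs(x - y) for x, y in zip(a, b))
print(gap)  # large: the two trajectories have completely decorrelated
```

The perturbation of 1e-12 is at the scale of double-precision rounding; after a few dozen iterations the trajectories bear no resemblance to each other. Whether that, rather than bugs, is the cause of the reported non-determinism is of course the question in dispute.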
They say it takes 20GB of RAM to run… that’s a pretty good hint at that… so yeah, so much for your Google experience… if you don’t have a clue what you’re talking about, you ask experts… gosh, is it that hard to find someone who knows that?
They need to start from scratch and completely rewrite the thing with people who can use modern C++. This code has nothing to do with that. It really is only C code. If they don’t have such people, make it open source and hope you can attract some interest.
nice one
On Monday I got so angry that I created a change.org petition on this very subject.
https://www.change.org/p/never-again-the-uk-s-response-to-covid-19
Ferguson is typical of the “Do as I say, not as I do” political class sh******. His projections on SARS, MERS, CJD, Covid, etc., were less “scientific” than Mystic Meg. A Remainer & Soros admirer.
I am not scientifically educated, but I used to read source code for a living in software houses and so I took a look at this code when it was first released.
Some thoughts.
It is no surprise that software produced in a university is amateurish. Even in a small software house, things like documentation (my speciality) and testing are separate operations from actually writing the code. University departments do not have the resources to create professional software.
I would certainly have missed the issues about threading, seeding and reproducibility. Few people have what it takes to get to the bottom of a piece of software and fully grasp the underlying algorithms. The art is to identify the people who know what they are talking about and listen to them.
My rule is this. Always listen to those who want to understand the world, not those who want to change it. Look for the people who want to be right and want to talk about their views. People who “would love to discuss it some time” and “obviously can’t go into the details now” generally do not know what they are talking about, and are often found in administrative and advisory posts.
There is no surer sign of the “we doctors” attitude than references to “The Science” as a corpus of knowledge rather than a body of problems to be investigated. People who talk about The Science want you to suppose that scientists never disagree and to treat them as an authority. They do not want to be understood, they want to be obeyed.
However, the fact that Neil Ferguson is a second-rate programmer does not mean that the crisis is a scare. Many scientifically knowledgeable people think it is very serious, and the more scientific the more so – Matt Ridley, Christopher Monckton and Greg Cochran are all people who want to be right and think that the lockdown is the right response. I do not think we are out of the woods yet.
These arrogant “scientists” have forgotten that in any risk assessment you present the best-case scenario AND the worst case, NOT just one. We have seen various economic models from before 2016 turn out to be wrong! “Project Fear” by the Bank of England, IMF, EU, World Bank, Soros, Branson & similar one-world idiots. Also on the climate, where the figures were fiddled in 2009 & continuing, with 6,000 remote weather stations closed.
Isn’t mentioning Google employment a form of credentialism? Your comments about the code are ones that an enwp.org Person_having_ordinary_skill_in_the_art would be able to see.
It’s a pile of dreck, and everything bad you have to say about it is probably understated.
I’m guessing that they are casting byte arrays into integers, which works on some architectures and doesn’t work on others.
Having read a lot of the comments, pros and cons, my support of Ms. Denim’s claims still stands. Even a model should produce the same result given the same input. The variation can legitimately only be in the input data or in some of the parameters or coefficients. Otherwise you can never tell a bug from a feature. How could you possibly assess the sensitivity of certain model parameters if the output is non-deterministic?
I invite you all to review President Dwight Eisenhower’s farewell address. It is readily available to watch, listen, or read. Pay particular attention to section 4. Everyone always fixates on his admonition regarding the military-industrial complex, but they ignore his second of only two warnings. Indeed, it is this second warning that may yet prove more prescient and important than the first.
He tells us that “The prospect of domination of the nation’s scholars by Federal employment, project allocations, and the power of money is ever present, and is gravely to be regarded.” He later elaborates “we must be alert to… the danger that public policy could itself become the captive of a scientific-technological elite.”
Ike had this problem nailed sixty years ago, and his second warning now appears both more prescient and more important in its long-term pertinence than the first. It is more relevant today than it was when he penned and spoke the words.
“Imperial’s modelling efforts should be reset with a new team that isn’t under Professor Ferguson, and which has a commitment to replicable results with published code from day one.”
These problems are being found in this code because it happens to be being scrutinised. The vast majority of academic software is like this. If Ferguson (or whoever) had not acted in this way he would have failed to be appointed as a professor, and someone willing to do so would have been appointed in his place.
The number one reason this happens is that academics selected for producing research results are much cheaper than professional software developers. Everyone in academia and the government funding agencies is aware of it and accepts it as normal.
If the government wants quality software for COVID it should contract for it with a company and pay market rate to specialists. The company can hire the professor as a contractor to provide the equations rather than the professor hiring cowboy programmers as contractors to provide code.
Regarding the use of C++, back in the 1990s there was a joke:
Q. What’s faster, C++ or Visual Basic?
A. Visual Basic, by 6 months…
The point being that as VB6 used the C++ compiler, you could crank out *sufficiently* OO natively compiled code that was easily readable, without spending all day tracing null pointers because one string was too long, once.
But at least in those days you could use regression tools like SQA Teamtest, but these days all the work I am asked to do is to make quick changes and fixes to Excel and Access front ends to SQL Server databases, and even Microsoft doesn’t have any regression testing tools available for Office products.
Who knows how much of our civic life is based on Excel taken beyond its comfortable limits?
I read the article 3 times; I could not believe what I read.
Surely from the beginning someone would have created a requirement, with milestones you needed to meet and test against, eliminating any bugs before going forward.
It seems like suck-it-and-see methodology. If, when I programmed, I had written anything as unstable as this I would have been shown the door.
We had standards to follow: from a design idea, to a document, to charts, to code, with documentation in the code and for the user. All the stages had sign-offs that needed to prove the concept, all managed through project schedules.
Then adding features would be a no-no until the software was reliable.
Worst code is “not available” ….
I’d like to thank the person who wrote this document for bringing this to the attention of the people. I looked at some bits of the code… I’ll leave it at that.
You bring up valuable facts but misunderstand the underlying problem. The real problem is that pandemic planning has had such a low priority in the UK that the country has relied on research-grade models to make policy decisions. It’s like flying prototype planes designed at Cranfield to fight a war and then berating the scientists because the planes aren’t good enough.
The way academia is evaluated and funded makes it incredibly hard for academic labs to create and maintain robust industry-level software. This is a well-known problem. Yes, academics still need to try to go beyond thousands of lines of undocumented code without tests, but they have very limited time, funding for academic software developers is limited, and the best people either end up in industry or drift towards more research-based roles – as opposed to doing production work within an academic lab.
Nonetheless, there is immense value in Ferguson’s and his colleagues’ innovation and expertise. However, their role is not to build a production line for war planes. Their careers are built around designing new methodologies, implementing them in prototype models and conceptualising their results. An engineer can build far better software, but that does not make them an epidemiology expert.
Although I’ve been critical of the ICL modelling (in particular the lack of sensitivity analysis of the model results, and the stupidity of choosing the date of the first recorded case as the starting point for the SEIR model), your critique should not include the claim that the same inputs ought to predictably generate the same outputs.
Consider if an input is a parameter to a probability distribution that generates transition probabilities (say, of going from ‘Exposed’ to ‘Infected’ in a SEIR model).
For example, if α and β from a Beta (or Gamma) are inputs, and the transition probabilities are being drawn by simulation from the distribution at each point in time. This is a useful way to get a plausible average transition probability, while having random transition probabilities for representative sub-groups.
In such a situation, every model run would generate different transition matrices.
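As a sketch of what that means in code (made-up parameter values, not the actual ICL inputs): drawing per-group transition probabilities from a Beta distribution gives different draws every run, and determinism is recovered only by pinning the RNG seed.

```python
import random

# If the Exposed -> Infected transition probability is itself drawn from a
# Beta(alpha, beta) distribution for each sub-group, two runs with identical
# (alpha, beta) inputs legitimately produce different transition probabilities,
# unless the RNG seed is pinned. (Parameter values here are made up.)
def draw_transition_probs(alpha, beta, n_groups, rng):
    # one random transition probability per representative sub-group
    return [rng.betavariate(alpha, beta) for _ in range(n_groups)]

pinned_a = draw_transition_probs(2.0, 5.0, 10, random.Random(42))
pinned_b = draw_transition_probs(2.0, 5.0, 10, random.Random(42))
fresh = draw_transition_probs(2.0, 5.0, 10, random.Random())  # system entropy

print(pinned_a == pinned_b)  # True: same seed, same "random" draws
```

So "different outputs from identical inputs" is by design here; whether the released code's non-determinism is of this benign kind is exactly what the critique disputes.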
If I were you I would get rid of that part of the critique, because it will be seized upon to discredit the entire critique.
Note that drawing random samples ‘in model’ makes for a poor ‘one-shot’ model (where the model is run once and the results declared as ‘the’ model output). That is beyond dispute.
However ‘one shot’ modelling is almost entirely stupid – and intellectually indefensible – because of
• uncertainty in parameters and exogenous variables (the latter is glossed over);
• the fact that even linear models are generally not bijective and so may not be mean (or mode) preserving. So if you feed the model the mean (or mode) of the parameters and exogenous variables, the model will not generally return the mean (or mode) of the endogenous variables.
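The second bullet can be demonstrated in a few lines: for a nonlinear model, a single run at the mean of the inputs does not recover the mean of the outputs. A toy model, purely for illustration:

```python
import random

# A toy nonlinear "model": y = x**2. Feed it the mean input and you get ~0;
# the mean of the model's outputs over the input distribution is ~1 (= Var X).
# A 'one-shot' run at the mean input therefore misrepresents the outcomes.
def model(x):
    return x * x

rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(100_000)]

output_of_mean = model(sum(xs) / len(xs))              # ~ model(E[X]) = 0
mean_of_outputs = sum(model(x) for x in xs) / len(xs)  # ~ E[model(X)] = 1
```

This is why distributions of runs, not a single "central" run, are the defensible way to report such models.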
My own view is that the core problem with the ICL model is that the people running it – Ferguson especially – are catastrophists who want to tell a story, and they ‘know’ their model.
Any quant who genuinely understands their model, knows how to change the parameters (and exogenous variables) to produce whatever path they want – without moving the parameters outside of an envelope that passes statistical scrutiny (e.g., an ML test would say that the parameter set as a whole was not statistically different from its ML values).
FWIW: the non-bijectivity of even linear models was a key motivation for my PhD research – which was specifically focused on stochastic sensitivity analysis in very large-scale dynamic computable general equilibrium (CGE) economic models.
The problem of being able to get whatever results you want (including changing the sign of key outputs) was a key preliminary finding of, and primary contributor to my decision to abandon, my PhD.
This ‘tweaking’ is significantly easier in smaller-scale models (e.g., macroeconometric models – which tend to have a few dozen equations). By the standards of CGE models, the ICL model is tiny.
To convince myself such tweaks were easy to find and implement, I built a ‘tweaker’ in which the target values of the (normally endogenous) variable of interest were ‘swapped out’ (i.e., given as inputs), and the model was set to find values of parameters and other (normally-) exogenous variables that would hit the target, while keeping the entire exogenous set within a hypercube that passed tests.
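In miniature, such a “tweaker” is just root-finding: treat the target output as given and solve for the input that produces it. A deliberately toy sketch (a one-parameter exponential “model”, nothing like a real CGE system):

```python
import math

# A 'closure swap' in miniature: rather than running model(beta) -> cases,
# fix a target headline number and solve for the beta that produces it.
# Toy exponential-growth "model" -- purely illustrative.
def model(beta, days=30, seed_cases=100.0):
    return seed_cases * math.exp(beta * days)

def tweak(target, lo=0.0, hi=1.0, tol=1e-12):
    # bisection on the monotone model: find beta with model(beta) ~= target
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if model(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

beta = tweak(500_000.0)  # choose the headline you want...
print(model(beta))       # ...and the "model" now predicts it
```

A real tweaker additionally constrains the solved-for inputs to stay inside a statistically defensible envelope, which is what makes the practice hard to detect from the outside.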
That sort of “closure swap” is trivial to do: it is therefore no surprise that models run by charlatans (e.g., climate models; ICL’s Chicken Little epidemiological models) reliably produce “The Sky Is Falling!!” on demand: nobody ever got more funding for saying “There is no crisis”.
To this day I know the precise date on which it became clear that the modelling paradigm would devolve into ‘guns for hire’, telling government what it wanted to hear: Friday, 13 August 1999. I discontinued my PhD candidature a month later; if I was going to be a charlatan who used numerical models to add an undeserved veneer to my own priors, I might as well do it in a relatively highly-paid field (quant finance) rather than a low-paid one (academic research).
You got this completely a*rse about face… the models start from the results they want and work backwards, just as government statistics start with the results required to justify government policy and work backwards.
If Pulitzers were worth a damn we would be looking at one here. Epic piece of citizen journalism. Ferguson is listed as an epidemiologist – how an individual calling the ball on a virus can lack massive science credentials on viruses is beyond me. He has a Master of Arts degree in Physics. Why an MA and not an MS? Because an MA focuses on humanities, political science (use “science” loosely), linguistics, diplomacy, and government administration. You can just hear the postmodern babble ringing. As with the climate activists – I mean “scientists” – models are about a politicized result. Anyone familiar with the hacked East Anglia Climategate emails has seen this in action. Ferguson did go on to get a PhD in physics, but he did not follow the path of banging atoms together in colliders or chasing after Dark Energy; perhaps that math was a little too deep for him. During the 2009 Swine Flu outbreak Ferguson’s models predicted that 65,000 people could die from the virus in the UK, but no more than 500 died. I am reminded of the legendary Richard Feynman’s takedown of NASA’s models projecting that just 1 in 10,000 shuttle launches would involve catastrophic failure.
But this is good enough for government work. Agenda, agenda, agenda. Were it not for this hypocritical tryst with the married lover (another Kool-Aid drinker) he would still be carrying on for the next 20 years like Dr. Fauci who scared the bejezus out of my generation in the 80s projecting widespread AIDS outbreaks among heterosexuals via French Kissing. Must. Increase. Funding. By the way, imagine if this social distancing hysteria had been applied to the individuals contracting what was then called Gay Men’s Health Crisis. Never happen.
But we live with this partisan, lying doomsaying daily. EU and UK citizens pay insane gas prices including global warming taxes, entirely founded on outrageously loaded models. The watershed moment for this nonsense was “Population Bomb” by an idiot academician 50 years ago. Ferguson’s shoddy model is like SimCity without the graphics, I would go a bit further and say all these models are like the Reichstag Fire without the fire.
There is no MSc or BSc degree in Oxford, they are all called BA, and after a while MA.
The biggest problem with the code seems to be:
1) It’s legacy code which was poorly written, so difficult to debug and modify.
2) The equations are not documented, so you have no idea what you’re solving.
There are some well-written codes out there, but I don’t think that this is one of them.
So… Billions of lives put in disarray with the “help” of worthless spaghetti code that’s been politically massaged in order to give cover for a bogus emergency? Got it!
But nowhere in his paper does Ferguson advocate for full lockdown. It recommends school closure and case isolation, plus social distancing. No limits on work or domestic travel or exercise are outlined. So yanno. There’s that.
More accurately, I should have said the team that included Ferguson did not advocate a full lockdown in that paper.
From the paper:
“suppression will minimally require a combination of social distancing of the entire population, home isolation of cases and household quarantine of their family members. This may need to be supplemented by school and university closures, though it should be recognised that such closures may have negative impacts on health systems due to increased absenteeism. The major challenge of suppression is that this type of intensive intervention package – or something equivalently effective at reducing transmission – will need to be maintained until a vaccine becomes available (potentially 18 months or more) – given that we predict that transmission will quickly rebound if interventions are relaxed. We show that intermittent social distancing – triggered by trends in disease surveillance – may allow interventions to be relaxed temporarily in relative short time windows, but measures will need to be reintroduced if or when case numbers rebound. Last, while experience in China and now South Korea show that suppression is possible in the short term, it remains to be seen whether it is possible long-term, and whether the social and economic costs of the interventions adopted thus far can be reduced.”
These are the MINIMUM recommendations. And, they reference China where the minimums were greatly exceeded, so let’s not pretend the expectations were for a minimum response.
But more revealing is their recommendation of 18 months of suppression. This only makes sense if the spread of the virus is essentially stopped during those 18 months. But, despite having exceeded the minimums, the virus still spreads, which means that not only was their model bogus, but so were their recommendations on how to deal with the virus.
They were very wrong on both counts.
Whoops.
Is he supposed to advocate or not?
Anyway, the decisions taken aren’t his: it’s for those with the responsibility to take decisions that must take them (and to be judged upon them) ie the Government. It’s they that chose what to do, based on evidence from those they chose to advise them.
It’s up to the Government to choose its advisors; it’s up to the Government to choose what advice it accepts; and it’s up to the Government to determine what policy is pursued. That is what Governments are for. Advisors only provide advice.
So, if Ferguson’s work is so faulty (as claimed here) and yet nevertheless was determining UK policy, whose fault is that?
@thelastnameleft Yes, that is one of the most important questions here. We can surely fault our government leaders and anyone else for taking seriously these self-described scientists. But, unfortunately, most people don’t make a distinction between physical science and social science and we are all paying for it now.
Physical science employs rigorous testing and peer review to conclude that a theory is the best available explanation for an observed phenomenon in the natural world. It produces an ability to very accurately predict outcomes. We create all kinds of products based on this work and trust our lives to it on a daily basis, whether it’s cars, airplanes, medicine, etc. The work of physical scientists have created a well-earned reputation for science.
On the other hand, social science uses math to analyze observations of how people behave and claims this is also science. At best, they can take the same observations and repeat the outcomes, but these don’t scale well for predicting future events because it’s currently impossible to parametrize everything a human uses to make a decision.
For example, in physical science, if just one event does not behave as expected, the theory is no longer acceptable. In social science, the best they can hope for is a high degree of accuracy in their predictions. As such, ‘accuracy’ in the social science world is dangerously subjective.
For decades, universities have been reclassifying Humanities programs as Social Science. It just sounds better and since they sometimes use some complicated math, who is to say it isn’t real science.
Well, now we know the difference. The great sham of social science masquerading as actual science has revealed itself on the international stage with tragic results.
Why do we need such extraordinarily complicated ways of assessing a situation? Surely life’s experience tells us that when some nasty bug is doing the rounds we don’t deliberately put ourselves in its way. This virus should never have created the panic it has, because all the other bugs of recent years have come and gone without this lock-down fiasco. The 2009 H1N1 epidemic was going to kill thousands according to the experts, and yet only 398 succumbed, with 26,000 needing hospital treatment.
It was right that the NHS was readied to deal with many more customers, but to shut down the whole of industry and commerce will prove suicidal for so many – even for this government, which has been spooked whilst still following “advice” rather than “leading” from the front.
Still the big questions go unanswered. Is this virus something new, or was it man-made and has it been around since last October, making the lock-down in March totally unnecessary?
Agree with the code part… my only challenge is that I can’t stomach funds being given to insurance morons to develop models… haven’t you studied the 2008 housing market bubble and all the stochastic models pulled by Alan Moronspan from his arse, and all the other banking MFs? Get real, mate!
This is a good article but you are a hypocrite: you bemoan “rampant credentialism” but are yourself guilty of an appeal to authority, claiming expertise proven by your employment at Google, where you were a senior surveillance-capitalism engineer creating products that undermine society. This is not relevant; your analysis stands or falls on its merits alone.
It isn’t a good article, not least for the reason you mention. What is the substance of any of the “criticism”?
Taking one, the author claims
—–” a key part of the scientific method is the ability to replicate results.”
Consider a particle passing through a slit and lighting up a screen. If it hits a different part of the screen in one experiment than in another, is it not science because it has failed to replicate? Nonsense.
If an assay of virus grows slightly differently in one dish versus another, is that un-scientific? Obviously not.
The author of this piece is obviously not in possession of a very sophisticated sense of ‘science’.
Another angry genius that feels the world has overlooked them. Yet, even with benefit of hindsight can’t put their name on a piece of work. I’m all ears!
Sue Denim doesn’t seem to understand ensemble modelling.
Right
Or how to write a criticism rather than a polemic.
I don’t understand how a program can produce different output when run with the same input. I have worked as a programmer for 30 years, and have done my fair share of debugging. One thing we could rely on was the fact that you take a copy of the input data that you are trying to debug for, and the output will always be the same. My background is with COBOL on administration type applications.
Thread synchronization issues or wrong initialization of PRNGs. (The issue points out that they were not using the same seeds when processing the same data from two different sources, so I expect them to have the same issues with seeds not properly tracked across all their code.)
If you’ve failed to correctly initialise a variable before use, for example.
The problem with this software is not primarily its code elegance, structure, comment style, whatever, it’s that it was used to create the numbers that pushed the UK government from voluntary social distancing to compulsory, with all the concomitant costs. The government had no choice faced with the numbers from a seemingly credible source coming up through SAGE. Now we are absolutely sure that the source of those numbers was not credible, even if it were correct.
I think it should be a criminal matter, or at least such a dereliction of professionalism that some sort of ejection of the perpetrators happens.
Cummings pressured SAGE scientists to accept lockdown.
https://www.ukcolumn.org/article/covid–19-big-pharma-players-behind-uk-government-lockdown
I am a retired academic who has spent his career using large-scale simulation and specialising in parallel computing (I was co-author of the first book in this field). If a code is threaded or has multiple processes, i.e. is able to run on more than one processor, then it is absolutely unforgivable not to investigate and correct the code if it is not giving deterministic results. Even running on a single processor there is no guarantee of determinism. Non-deterministic results come from a lack of synchronisation, i.e. one process/thread reading from another at the wrong time. In effect, variables can be read before they have been correctly set. In an iterative solution there may be some sanity to the results, but qualitatively they are meaningless. I.e. although the form of the results may look OK, there is no guarantee that they have any meaning whatsoever. In the worst case they are just so much garbage.
It should be noted that there are algorithms (simple iterative ones) that can give correct results without synchronisation but you need a mathematical proof of convergence before these can be trusted. Judging by the code this would be next to impossible in this case, so my conclusion is that the results are meaningless, even when averaged. If you average garbage, you get garbage!
On another matter, it is entirely possible to have academic teams that write clear, correct and well-documented code. It requires good funding, which Ferguson had, and it requires a mixture of professional software engineers or well-educated software engineering graduates together with domain-specific scientists. I know this because I have led such teams.
Writing ‘thread-safe’ code is critical in multi-thread or concurrent process systems. In the late 80s, I did a solution for a nuclear power station running under Concurrent-DOS, and had to implement application-level critical section semaphores to guarantee processes didn’t incorrectly interact. With Ferguson’s quality of coding, the station couldn’t have operated. Bad coding is dangerous.
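The application-level critical sections described here correspond, in a modern threaded program, to a mutex guarding the shared state. A minimal sketch:

```python
import threading

# An application-level critical section: the lock guarantees that only one
# thread at a time performs the read-modify-write on the shared counter,
# so the result is fully deterministic.
def synchronized_total(n_increments=100_000, n_threads=2):
    counter = [0]
    lock = threading.Lock()
    def work():
        for _ in range(n_increments):
            with lock:           # enter critical section
                counter[0] += 1  # protected update
    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter[0]

total = synchronized_total()
print(total)  # always 200000: determinism restored by the critical section
```

The cost is serialization around the protected state; the benefit is that the same inputs reliably give the same answer, which is the property being disputed in this thread.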
Have you seen where it does not give deterministic results? With all your undoubted credibility on the subject, you should take a bit of time to look precisely at what was said… and then come back to see if the ‘seed’ issue that was raised is actually a problem or not.
I didn’t see anyone talk about non-determinism, except incorrectly. It looks like the word was bandied around rather cynically. What I saw was a bug in how seeds were stored. Very different animals.
Most of us know now that non-determinism comes from lack of synchronization. It would be nice if you addressed the actual issue raised though.
To paraphrase the relevant comments in the text above: this issue was raised by Edinburgh; Imperial said to run it on one core; Edinburgh further said the results were not deterministic even on one core. Hence my comments above.
Yes the issue of the seed is another problem but probably more easily rectified.
Ah, but if you read deeper on it, the reason that it was ‘non-deterministic’ on a single core was down to a bug in saved states. They saved the program state, and when it was reloaded people thought it should produce the same answer as the previous run – and assumed that its not doing so was non-determinism. In fact all that was wrong was that the seed was stored incorrectly in a recent modification (I think for the Microsoft update). So it was this stored seed that caused it to give a different answer (if you ran it again, it gave the same answer as the second run). That’s not non-determinism; it’s just an error in the way the seed was stored.
Moreover it doesn’t affect the final answer, as the answers are statistics from an ensemble. All you need to generate the ensemble is many independent runs – which was never compromised by the seed/‘non-determinism’ bug.
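That workflow can be sketched in a few lines (a toy stochastic model, purely illustrative): each ensemble member gets its own seed, and the reported statistic is reproducible as long as the seed list is recorded.

```python
import random
import statistics

# Ensemble modelling in miniature: a toy stochastic "outbreak size" model.
# Each ensemble member gets its own seed; the reported answer is a statistic
# over the ensemble, reproducible as long as the seed list is recorded.
def one_run(seed, days=20):
    rng = random.Random(seed)
    size = 100.0
    for _ in range(days):
        size *= rng.uniform(0.9, 1.3)  # noisy daily growth factor
    return size

def ensemble_mean(seeds):
    return statistics.mean(one_run(s) for s in seeds)

seeds = list(range(100))
print(ensemble_mean(seeds) == ensemble_mean(seeds))  # True: fully reproducible
```

Individual runs differ (that is the point of the ensemble), but the published statistic is deterministic given the seeds, which is the replication property critics are asking for.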
And when informed that it wasn’t deterministic in SP mode, the model team said “oh, it is a bug!”. As it turned out, the way the development team ran it for testing and validation in SP mode, the bug didn’t express itself: it was deterministic. Hence their first reaction, in which the respondent clearly thought they were probably running it in MP mode.
And of course the bug doesn’t exist in the model used for the actual production projections given to government. It was a new feature added to the refactored C++ model.
As far as determinism in models run in MP mode: without full serialization (which in essence causes the MP run to mimic an SP environment, with all the execution efficiency issues this implies), determinism isn’t fully attainable. Certain levels of determinism are achievable and often cheap (i.e. in this case, make sure the network that will be operated on is loaded before running the parallel simulation code); other levels aren’t. Since you’re an expert in Monte Carlo simulations, you should know this.
Just a note that I did get an FOI request in for the original code before the Section 22 exemption could be used on it, so I will get it eventually.
https://www.whatdotheyknow.com/request/software_code_used_for_the_covid
I thought academics writing code had learnt their lesson after the UEA climate models, but it seems the Imperial College team didn’t get the memo.
One of the immediate outcomes of academics publishing code so people can see it, is that they’ll hear if they are using outdated, obsolete, and crappy methods. I am a programmer too, and I think coding is a craft, a skill that you master, like designing and installing plumbing. For example, if the professor bodged the plumbing and was able to hide it behind the plasterboard, you know it would be leaky and terrible. But if this work was on show he’d probably choose to hire a professional plumber, and it would be all right. There is no shame in a top professor not being able to do a decent job of plumbing, just as there should be no shame if he doesn’t know how to write and maintain code.
However, I take issue about this being a problem with the public sector and research. For example, the physicists seem perfectly capable of hiring genius programmers to write supercomputer software to simulate matter falling into black holes. In many areas of public science the software is amazing, given the time and skills of programmers who hold a vision of what should be possible. But here there’s not even a fantasy road map of what they’d want to do given the best programmers and an unlimited budget — the sort of thing you’d talk about in the pub with a stranger. This sorry state of affairs has come about because the powers that be have taken every measure they can to maintain their ignorance.
You assume what you claim to be seeking to answer (the ‘quality’ of the programming).
Fascinating insight into the modelling world. I do not claim to understand everything in the article and have used http://www.worldometers.info/coronavirus daily which provides me with detailed data which may well have incorrect information but at least shows trends.
Worldometers website details that on April 11 the UK had 344 recoveries and sadly 9,875 deaths.
May 8 records show the UK having 30,615 deaths, 206,715 positive cases and 175,756 active cases. Using these figures, the number of recoveries is still 344, which would mean circa 90 deaths for a single recovery. This is surely incorrect, and the reason appears to be that throughout the period between April 11 and May 8 the UK recovery figure has shown N/A. As a result the Worldometers website has used 344 to compute the number of active cases, which are therefore also incorrect.
The more successful the care, the lower the death toll. I would welcome comments regarding the use and importance of recovery statistics in COVID-19 modelling.
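For what it's worth, the arithmetic behind the stale recovery figure can be checked directly. A quick Python sketch using only the numbers quoted above (the variable names are my own labels, not Worldometers'):

```python
# UK figures for May 8 as quoted above, with recoveries stuck at the
# April 11 value of 344 (reported as "N/A" in between).
confirmed = 206_715
deaths = 30_615
recovered_reported = 344

# Worldometers-style active cases: confirmed minus deaths minus recovered.
active = confirmed - deaths - recovered_reported
print(active)                              # 175756 -- matches the quoted figure

# The implied deaths-per-recovery oddity ("circa 90 deaths per recovery"):
print(round(deaths / recovered_reported))  # 89
```

If the true recovery count were higher, the active-case figure would fall by exactly the same amount, which is precisely the commenter's point.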
To be honest, it’s not much of an insight into the modelling world. It’s an insight into the mind of a database programmer’s comments on programming style. A bit like a grammar teacher raising an eyebrow at the lack of punctuation in the poet laureate’s latest work.
The number of recovered does matter for epidemiological models, of course, as it affects the number of people susceptible to illness. However, for some short-term predictions used to manage current resources, I would imagine you can get away with a simple non-dynamic version (note: I'm a stochastic modeller, but NOT an epidemiologist). Moreover, people who are actively modelling for the NHS will have better access to data; they will not be dependent on Worldometers like the rest of us. Nothing unusual about that. If one were to really dig into the technical papers on this and start to come up with credible ideas, one would find more data to test those models on. Access to current NHS data is no doubt subject to data protection, which makes sense: you don't want just any random person to have access to everyone's medical and personal records.
As to why Worldometers is so slow in updating, no idea. But it does take a long time to be cleared of this virus – have a look at the records for China or South Korea, for example, to see how long it took for the 'recovered' figures to catch up.
“The number of recovered does matter for the epidemiological models of course – as it affects the number of people susceptible to illness. ”
Hmmm, in a standard SIR or SEIR model, I could swear that it is the rate of infection that drives the reduction of the susceptible pool. Under the assumption of perfect immunity, once a person reaches the R pool (which can be R or D – the meaning of D left for the reader to figure out), they never get fed back into the S pool.
But you are absolutely right that it takes a long time to recover from this disease, with recovery meaning "not shedding viruses", i.e. no longer part of the infected (and infectious) pool (the I pool). This is generally determined a couple of weeks after symptoms, or after a couple of negative tests once symptoms end, or the like.
The dead are easier to quantify.
You are quite right for a SEIR model under the assumption of perfect immunity. I didn't have the parameters or structure of SEIR in mind, so I was being a bit sloppy: I was equating R with recovered but not assuming perfect immunity, so that, as in the real world, there is (possible) feedback from R to S. I don't know for a fact, but I'd be surprised if the Imperial model didn't allow for imperfect immunity, even if that option is not used.
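To make the bookkeeping in this exchange concrete, here is a minimal forward-Euler SIR sketch with an optional R→S feedback term for imperfect (waning) immunity. All parameter values are illustrative only, not calibrated to COVID-19 or to the Imperial model:

```python
def run_sir(beta=0.3, gamma=0.1, waning=0.0, days=200, dt=0.1):
    """Forward-Euler SIR(S) on population fractions; returns final (S, I, R)."""
    s, i, r = 0.99, 0.01, 0.0
    for _ in range(int(days / dt)):
        new_inf = beta * s * i   # infection is what drains the S pool
        new_rec = gamma * i      # recovery feeds the R pool
        back = waning * r        # imperfect immunity: R leaks back to S
        s += dt * (back - new_inf)
        i += dt * (new_inf - new_rec)
        r += dt * (new_rec - back)
    return s, i, r

# With perfect immunity (waning=0), S only ever shrinks; with waning > 0
# there is R -> S feedback, and more of the population ends up susceptible.
s_perfect, _, _ = run_sir(waning=0.0)
s_waning, _, _ = run_sir(waning=0.01)
assert s_waning > s_perfect
```

With waning set to zero this reduces to the textbook SIR case both commenters describe: infection alone drives S down, and R is absorbing.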
The number of comments here, and the sheer variety of strongly held views, must surely indicate a complete lack of any widely held consensus regarding models and their utility for forecasting – so pretty much what so many think about weather forecasting: better to stick your head out the door, and better for you too.
The indictment is that the government bet the country on that level of (un) certainty, and did that without having modelled the likely consequences.
“but the damage from over a decade of amateur hobby programming is so extensive that even Microsoft were unable to make it run right”
Talking about decades of amateur hobby programming, there’s “Windows”.
The start to finish argument is quite entertaining: “The code hasn’t been published so it can’t be independently checked, but it is full of errors. All independent research into epidemics must be stopped and replaced with the work of private industry using models and code that are secret and cannot be checked”
Isn’t the thought more that academics don’t in practice check each other’s work because mutually verified credentials are valuable and time spent checking each other doesn’t bring any career reward – whilst private sector firms betting money their models are right DO perform rigorous internal checks, even if they don’t publicise their models for academics to look at?
In other words, this is less about secrecy and more about incentives.
Some truth in that, but the reality is more complex. The numerical models used are generally the academic ones in any case. Commercial developers tend to use packages like R or Python libraries (scikit-learn) which are written by academics. If they are selling their code commercially, then the main point is to sell it – and for that, brand, GUI and ease of use all outweigh the technical component in my experience. After all, you are generally selling to people who are a bit less expert than the original developers, and they are usually not in a position to make the call on the deep technical side of things.
Yes, the main point for commercial code is to sell it. Now, ease of use is certainly a factor, but if commercial *modeling* software did not give consistent results given the same input, then such software would be worthless. Under free enterprise, companies using such software would either purchase other software that’s (more) accurate or go out of business.
This is not to say that we live under free enterprise (we don’t). On the other hand, the concern about “credentialism” – that is, blindly trusting credentials – is that it all too easily leads to corruption. Processes like peer review and the scientific method itself are supposed to provide a check on credentialism, but they require transparency in order to function. It’s no surprise, then, that transparency is suppressed first and foremost.
For one thing, you seem to be attacking something of a strawman. You’re not accounting for the distinction between the original source code, which has not been made public, and the revised version that was made public. Obviously, since the revised version was made public, it can be independently checked for errors, and it has been. What cannot be independently checked yet is the relationship between the original version and the revised version.
Furthermore, the author has been consistent in arguing that, since the development of the software was essentially paid for by the public, it belongs in the public domain. Software developed by private companies, using their own funds, is under no such obligation to be made public. Private companies are also supposed to be subject to profit and loss, which would mean (all other things being equal) that they have a strong financial incentive to use models and modeling software that are as accurate as possible.
Yes. Who can take it seriously? It doesn’t read like the work of a software engineer to me. Certainly not a bright one.
I’m a software engineer with 15 years of experience in the industry. This article reads completely like the work of a software engineer to me. But you’re welcome to disregard this. After all, since you can’t verify my background, I’m sure I look to you like just another random person commenting on the internet.
I’ve been a software engineer too.
This article doesn't contain anything technical in a granular sense – so it fails to detail any technical complaint (though it might look like it does to those unfamiliar with the field(s), i.e. the vast majority of people).
But then, nor does it do a competent job with the general nature of its complaints. For instance, the claim that science must be naively replicable. Why? Mix milk into a cup of coffee and it will be different every time.
Add the author's claims that "all papers based on this code should be retracted immediately" and that "all academic epidemiology be defunded", and it really doesn't look anything like professional, disinterested comment.
The author even labours thus:
“I’ve chosen to remain anonymous partly because of the intense fighting that surrounds lockdown”
and
“This situation has come about due to rampant credentialism and I’m tired of it. ”
Self-professed Expert criticises credentialism, shock!
If the author was concerned about the 'intense fighting' over lockdown, why are they contributing pieces laden with obvious though wholly extraneous political opinion that can do nothing except add to the 'intense fighting that surrounds lockdown'?
Pull the other one.
I’m not sure what level of granularity you’re expecting here. Do you expect the author to do a line-by-line code review? Also, since you’ve apparently been a software engineer (I’ll give you the benefit of the doubt here), you’re welcome to review the code yourself to corroborate the author’s findings.
Technically you’re correct that the author does not *prove* their points using lines/sections of code. While I don’t know for sure, I suspect that the author didn’t do that because it would be lost on much (if not most) of the audience for their article. Nonetheless, I maintain that this article does sound like the work of a software engineer. One doesn’t need to post literal lines/sections of code in order to show that they know what they’re talking about regarding software.
I don’t see where the author claimed that science must be naively replicable. Your statement about mixing milk into a cup of coffee seems like an instance of the continuum fallacy, but I could be wrong. As I understand it, all stochastic modeling software relies on at least one input that’s random (if only in the sense of being mathematically free to vary). There can still be some variation in the output *given the same input* due to numerical integration, which is never completely precise. That variation should be “small” (this can only be relative to the model(s) being used). In other words, stochastic modeling software should give at least *similar* results when the same input is used. So the output should be essentially deterministic within the constraints of the input.
Are you saying that it’s impossible for “all papers based on this code should be retracted immediately” and “all academic epidemiology be defunded” to be professional, disinterested comments? More generally, are you saying that it’s impossible to make policy recommendations in a professional and disinterested manner? The logical consequence of that would be that all policy recommendations must be made by *non-professionals* and professionals must never question policy (as questioning policy is a kind of policy recommendation). This transforms professionals into glorified workhorses yoked to the whims of non-professionals, regardless of how absurd and counter-productive those whims may be. It would take situations like the following out of the realm of parody: https://www.youtube.com/watch?v=BKorP55Aqvg
You’re correct that, if the author wished to avoid controversy, then it does not follow for them to contribute anything to the controversy around “lockdown”. However, I don’t think that’s what the author meant by being concerned about “the intense fighting that surrounds lockdown”. I think the author chose to be anonymous in an effort to avoid *retaliation.* There’s clearly a difference between that and avoiding controversy altogether.
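As a toy illustration of "essentially deterministic within the constraints of the input" – entirely hypothetical code, nothing to do with the Imperial model:

```python
import random
import statistics

def noisy_estimate(seed, n=10_000):
    """Toy stochastic model: Monte Carlo estimate of a mean under noise."""
    rng = random.Random(seed)
    return statistics.fmean(rng.gauss(1.0, 0.5) for _ in range(n))

# Same seed and same input: bit-identical output (replicability).
assert noisy_estimate(42) == noisy_estimate(42)

# Different seeds: different but *similar* outputs (small spread).
spread = [noisy_estimate(s) for s in range(5)]
assert max(spread) - min(spread) < 0.05
```

That is the standard expected for stochastic modelling software: randomness lives in the seed, and given the same seed the output should replicate exactly.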
This is a junk article. If the best you can do is say they had bad code style, and that they are "rushing" to fix the model based on new information like "contact tracing", then you do indeed sound like a Google engineer and not a scientist. Even the attempt to discredit the group for using a stochastic model falls flat (yes, you must be SUPER SMART for knowing that means random): averaging model outputs across runs is a sound way to get a more deterministic result. Please come back with some real criticisms; this could have been an interesting piece.
I think he addressed the lack-of-determinism issue more thoroughly than you are giving credit for. He stated that the same seed values would give disparate results, and this shouldn't be the case. I agree that with different seeds, Monte Carlo would indeed give disparate results on each iteration, and that long-term averaging would be an appropriate way to process the output data.
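A sketch of the averaging point, with made-up numbers rather than anything from the Imperial model: individual stochastic runs vary widely, but the mean over many runs is far more stable:

```python
import random
import statistics

def one_run(rng):
    """One stochastic simulation 'run': a noisy scalar outcome."""
    return 100.0 + rng.gauss(0.0, 10.0)

def averaged(seed, n_runs=400):
    """Monte Carlo estimate: average the outcome over many runs."""
    rng = random.Random(seed)
    return statistics.fmean(one_run(rng) for _ in range(n_runs))

singles = [one_run(random.Random(s)) for s in range(50)]
means = [averaged(s) for s in range(50)]

# Averaging over runs shrinks the run-to-run scatter (roughly by 1/sqrt(N)).
assert statistics.stdev(means) < statistics.stdev(singles)
```

None of which excuses irreproducibility under a fixed seed, which is a separate property from run-to-run Monte Carlo variation.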
“He stated that the same seed values would give disparate results, and this shouldn’t be the case”
Only if the model was run in single-processor mode in a particular way (generating a network, saving it, then running from that saved network), which happens not to be how the team working on this iteration of the model was running it when they tested it. The model was deterministic in their usual single-processing testing workflow, so they simply missed the bug.
Regardless, the bug wasn’t in the model used to run the projections used by the government. That particular feature was added to the C++ version in github.
The fact that it is nondeterministic in MP mode is a red herring, and indication the author of this blog doesn’t really understand how monte carlo simulation works.
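For what it's worth, here is the standard mechanism by which multi-processor runs can differ even with identical seeds: floating-point addition is not associative, so a reduction whose order depends on thread scheduling need not be bit-reproducible. (A generic illustration, not a claim about the Imperial code's internals.)

```python
# Floating-point addition is not associative: summing the same four
# numbers in two different orders gives two different totals.
vals = [1e16, 1.0, -1e16, 1.0]

left_to_right = ((vals[0] + vals[1]) + vals[2]) + vals[3]  # one 1.0 is absorbed
reordered = (vals[0] + vals[2]) + (vals[1] + vals[3])      # cancellation first

assert left_to_right == 1.0
assert reordered == 2.0
```

A parallel sum whose partial results arrive in scheduler-dependent order can therefore wobble in the low bits, which is why bit-reproducibility in multi-threaded code usually has to be engineered deliberately.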
Based on the article, it does not seem accurate to say that the non-determinism only happens “if the model was run in single processor mode in a particular way”. Clearly the author claimed that the code is non-deterministic *in general* and provided a few examples to support that claim. They even said “I’ll illustrate with a few bugs.”
I suppose I will have to take a deep dive into the model and the code to see for myself what the appropriate implementation should be. These things can of course become quite complicated on a case-by-case basis, and as I stated, I had not yet taken the time to dive into the GitHub repository. My comment was merely a response to the author's own comments – so at this point I guess all I can do is check the model itself, but my understanding was that the author understood Monte Carlo (he did mention it, after all).
Disclaimer: I couldn’t write a line of code to save my life.
That being said, having found the above eye-opening article yesterday, I wish to touch upon an aspect of Mr. Ferguson's code that no one seems to have entertained yet.
Contrary to said code being inept, it is actually expert at helping to achieve its intended aim:
Cooked COVID-19 death stats https://swprs.org/a-swiss-doctor-on-covid-19/#latest ->fear-driven national economy-destroying lock-downs->this:
@JamesGRickards 2014
The Death of Money: Project Prophecy 2.0 – YouTube
https://www.youtube.com/watch?v=9lcNrJVIgb8&t=28m38s
@JamesGRickards now
25.03.2020
Get Ready for World Money – The Daily Reckoning
https://dailyreckoning.com/get-ready-for-world-money/
About James Rickards
https://www.amazon.com/James-Rickards/e/B0058M3XL8?ref_=dbs_p_pbk_r00_abau_000000
27.03.2020
Central Bank digital currencies gain allure amid a dollar spike – CGTN
https://news.cgtn.com/news/2020-03-27/Central-Bank-digital-currencies-gain-allure-amid-a-dollar-spike–PcdpziFE0E/index.html
What @scientificecon (Richard Werner) says:
https://twitter.com/scientificecon/status/1250039813090283522
Professor Werner – Official website
https://professorwerner.org/
07.05.2020
BoE warns UK set to enter worst recession for 300 years | Financial Times
https://www.ft.com/content/734e604b-93d9-43a6-a6ec-19e8b22dad3c
For more info, check out my pinned thread:
https://twitter.com/VI_XII/status/1250458523089154048
John
Bill Gates funds Ferguson directly and indirectly.
We don’t need one of Bill Gates’ dodgy vaccines because our immune systems learned a trick or two over the past 100 million years or so.
Indirectly for sure:
“$79,006,570”
March 2020
Imperial College London – Bill & Melinda Gates Foundation
https://www.gatesfoundation.org/How-We-Work/Quick-Links/Grants-Database/Grants/2020/03/OPP1210755
Germane to any discussion of Mr. Ferguson's COVID-19 modelling software is that, aside from its contentious use as the first component of an applied Hegelian dialectic – justifying a near-global lockdown, leading to the destruction of national economies and a global recession that primarily affects the world's urban population as supply chains, support services and law and order disintegrate in just-in-time, food-and-fuel-delivery-dependent cities – ...
“Today, 55% of the world’s population lives in urban areas …”
16.05.2018
68% of the world population projected to live in urban areas by 2050, says UN | UN DESA | United Nations Department of Economic and Social Affairs
https://www.un.org/development/desa/en/news/population/2018-revision-of-world-urbanization-prospects.html
“… An economic breakdown is more than just economic. It leads quickly to a social breakdown that involves looting, random violence, fraud and decadent behavior.
The Roaring ’20s in the U.S. (with Al Capone and Champagne baths) and Weimar Germany (with riots and cabaret) are good examples.
Looting, burglary and violence in the midst of a state of emergency are the shape of things to come.
The veneer of civilization is paper-thin and easily torn. Most people don’t realize how fragile it is. But they’re going to learn that lesson, I’m afraid.
Expect social disorder to get worse long before it gets better.”
14.04.2020 (James Rickards)
Worst Recession in 150 Years – The Daily Reckoning
https://dailyreckoning.com/worst-recession-in-150-years/
…there’s also a private-public sector funded rush to fast-track the production of a now 20-year-old yet still unproven vaccine technology.
From Forbes:
“… When the genomic sequence of the virus was released online by Chinese scientists on January 11, 2020, the Cambridge, Massachusetts-based Moderna team had a vaccine design ready within 48 hours. It shipped a batch of its first vaccine candidate to the National Institutes of Health for a phase one study just 42 days after that. In early March, Moderna’s mRNA vaccine, which represents an entirely new way to provide immunity to disease, was injected into humans for the first time.
That’s lightning fast. Vaccines typically take years (or in some cases, decades) to develop,…
… The speed is made possible by a new technology: mRNA vaccines, … mRNA vaccines work kind of like a computer program: After the mRNA “code” is injected into the body, it instructs the machinery in your cells to produce particular proteins. Your body then becomes a vaccine factory, producing parts of the virus that trigger the immune system. In theory, this makes them safer and quicker to develop and manufacture, …
… The prospect that Moderna may have the technology to compress years into a few months and take on a virus that has crippled the global economy has investors salivating. …
… “If it works, we might have the best vaccine technology in the world,” Bancel says.
But that’s a big “if.” No mRNA vaccine currently exists on the market, and nobody knows for sure if the technology will work, much less against this virus. To date, nobody’s been able to make a vaccine that works against a human coronavirus. …
… Bancel isn’t the only optimist. In the past 20 years, there’s been an explosion of companies developing mRNA vaccines for a large swathe of diseases, and many have turned their attention towards the COVID-19 pandemic. German company BioNTech is working with Pfizer to develop an mRNA vaccine. Human trials have already begun. Another German company, CureVac, is backed by the Gates Foundation and is expected to begin vaccine trials this summer. Lexington, Massachusetts-based Translate Bio has partnered with French pharmaceutical giant Sanofi to develop its mRNA vaccine, with human trials expected to start later this year. …
… But it is still all theoretical—there aren’t any mRNA vaccines on the market for any diseases yet. When asked how we know mRNA vaccines will work, Drew Weissman, a researcher at the University of Pennsylvania School of Medicine who has spent 13 years studying the technology, answered bluntly: “We don’t.” There have been only a handful of human trials for any mRNA infectious disease vaccine, all of which have been focused on safety. There’s yet to be a trial showing mRNA vaccines are effective and long-lasting at preventing an infectious disease.
Scientists also don’t know how fast this coronavirus will mutate, which could affect how often a new vaccine will need to be created. If the virus mutates quickly, Weissman says, “We might have to make a new coronavirus vaccine every year or every couple of years.” …
… Nevertheless, the federal government is backing mRNA vaccines with serious cash. It has pledged to give nearly $500 million to Moderna alone for its COVID-19 vaccine. To speed development, the FDA has authorized both Moderna and BioNTech to begin vaccine trials in humans before safety-testing in animals was finished. …”
08.05.2020
Fueled By $500 Million In Federal Cash, Moderna Races To Make 1 Billion Doses Of An Unproven Cure
https://www.forbes.com/sites/leahrosenbaum/2020/05/08/fueled-by-500-million-in-federal-cash-moderna-races-to-make-1-billion-doses-of-an-unproven-cure/
I, for one, will decline the offer of any such vaccine and, if necessary, will resist its legislated imposition on my person to the point of imprisonment, confident that when under-informed members of the public who naively agree to be afflicted by such quackery begin suffering ill effects in numbers too large to be ignored, I will have grounds for an appeal.
That mRNA and other related initiatives are funded by Bill Gates supplies further pause as to the intent involved:
“Taking their cue from Gates they agreed that overpopulation was a priority,”
May 26, 2009
Billionaires Try to Shrink World’s Population, Report Says – The Wealth Report – WSJ
https://blogs.wsj.com/wealth/2009/05/26/billionaires-try-to-shrink-worlds-population-report-says/
The now-broken Times of London link in the above Wall Street Journal article:
May 24, 2009
Billionaire club in bid to curb overpopulation – Times Online
https://web.archive.org/web/20110223015213/http://www.timesonline.co.uk/tol/news/world/us_and_americas/article6350303.ece
At 3:57 (cued) Bill Gates posits that carbon emissions can be curbed in part via a population reduction approach involving vaccines etc. (Note the audience response.)
29:32
Feb 20, 2010
TED
Innovating to zero! | Bill Gates – YouTube
https://www.youtube.com/watch?v=JaF-fq2Zn7I&t=3m57s
Engineering a major change:
The following brief, edited excerpt from G. Edward Griffin’s 1984 interview with Soviet KGB defector Yuri Bezmenov offers a succinct explanation of why so many people can’t grasp the true meaning of provable facts:
2:02
wimpb
Yuri Bezmenov was right – YouTube
https://www.youtube.com/watch?v=4MzQSmwkFNc
Former Wall Street investment banker Catherine Austin Fitts relates an occurrence from her time as Assistant Secretary of Housing and Federal Housing Commissioner at the United States Department of Housing and Urban Development that explains how some elected officials and business leaders can be induced to support any policy:
8:03
The Solari Report
Published on Jul 3, 2013
Solari Stories – Scandals, Control Files, and Blackmail – YouTube
https://www.youtube.com/watch?v=ITtvISpL0dY
Let’s use The Sims 4: Vampires for the next model. The vampires go out at night for adulterous encounters, feeding on the blood of others along the way.
IMHO, academic science has become really poor in recent years, driven by competition between researchers who cheat if necessary, and, as Sue says, the awful tendency to protect colleagues who should be exposed. This ludicrous horror is the culmination of years of neglect. Back in 1975 someone I know worked on a large international collaboration and used a Monte Carlo model which was seeded from the date and time. She only had the binary program, so debugging it was out of the question. She found that it would produce a hard fail due to a reference outside memory. She was told to just run it again and all would be well!
A friend of mine – an Air Force veteran, MP and programmer – told me a saying: "Good enough for government work!"
As someone currently working in academia, the state of the original code base doesn't surprise me in the least, and I think you touched on the main reasons why that is.
I do wonder about the problem of determinism in this particular case – bear in mind I haven’t had the time to review the model myself. I am wondering if the issue is an inherent sensitive dependence on initial conditions in the model. These epidemiological models are sometimes highly nonlinear, and can exhibit chaotic behaviour. I’m wondering if the conversion of code to C++ caused this issue because the people who converted the code didn’t consider that certain parameters may be sensitive down to 10^-15 or lower. Maybe they used the wrong datatype which rounded off somewhere “good enough”, but didn’t realize that after a few iterations this can cause divergence of the result. This would explain why running with the same seed number can cause different outcomes. For anyone interested, this was actually how the idea of chaos was discovered back in the early 1960s:
https://www.latimes.com/archives/la-xpm-2008-apr-18-me-lorenz18-story.html
That is just my 2 cents. As for the result being different on different machines, I’m not too sure about that one. Possibly a compiler issue, due to different OS versions? I have nowhere near the software expertise that many of you have, but I have studied complex systems a bit so I figured I could offer that side up.
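The sensitive-dependence idea is easy to demonstrate with the simplest chaotic system going, the logistic map at r = 4 (a textbook toy, not a claim about the Imperial model):

```python
# Two trajectories of x -> r*x*(1-x) at r = 4 (chaotic regime), started a
# mere 1e-12 apart, diverge to macroscopic separation within ~40 iterations.
def max_divergence(x0, eps=1e-12, r=4.0, steps=60):
    x, y = x0, x0 + eps
    worst = 0.0
    for _ in range(steps):
        x = r * x * (1.0 - x)
        y = r * y * (1.0 - y)
        worst = max(worst, abs(x - y))
    return worst

assert max_divergence(0.3) > 0.1            # tiny perturbation, huge divergence
assert max_divergence(0.3, eps=0.0) == 0.0  # identical start: identical orbit
```

This is exactly the mechanism by which a datatype change that rounds at "good enough" precision could, a few iterations later, produce a visibly different trajectory.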
Oopsies. Sorry about destroying millions of livelihoods and kicking off a global depression. But hey, those are the Experts. To question them only identifies you as one of the undesirable dissidents. Comply or go to jail. This is for your own good.
Antibody testing in NYC leads to a calculated infection fatality rate of about 0.8%; in Geneva, Switzerland, about 0.5%–1.2%. This fits well with the original WHO estimate from February of an IFR of about 0.3%–1%.
R0, the basic reproductive rate, looks to be about 3, which means about 66% of the population must be infected for "herd immunity" to knock it out. Of course this assumes that being infected and recovering makes you immune for a lengthy period of time (the dead, of course, won't be reinfected).
So a basic back-of-the-envelope calculation for the US yields:
326,000,000 x 0.66 x 0.008 = 1.7M dead (using the NYC data).
1.7M dead if no action (voluntary or government mandated) is taken to lower it.
So you don’t need a fancy model to get a seat-of-the-pants estimate that makes the need for action quite clear.
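The arithmetic above, spelled out; every input is the commenter's assumption, not an established value:

```python
population = 326_000_000   # assumed US population
herd_fraction = 0.66       # roughly 1 - 1/R0 for an assumed R0 of about 3
ifr = 0.008                # assumed 0.8% infection fatality rate (NYC data)

deaths_if_no_action = population * herd_fraction * ifr
print(f"{deaths_if_no_action / 1e6:.1f}M")   # 1.7M
```

Note how sensitive the headline number is to the assumed IFR: the same calculation at the low end of the quoted WHO range (0.3%) gives roughly a third of the total.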
Even assuming arguendo that those numbers are accurate, there’s the question of what action is appropriate. Although I could be wrong, you seem to be implying a false dilemma by implicitly equating the highly vague phrase “need for action” with a highly specific policy (so-called “lockdown”). This is disingenuous, to say the least.
There have been so many advances in machine learning and tooling over just the last few years. Why are there not more developers creating models and testing them against what we now have as real world data? Why is it so many people rely on one model that has not been known to be accurate? There should be a Kaggle bounty for this problem if there isn’t already.
I've been programming for 35 years and, believe it or not, software can be very complicated and hard to understand – imagine that! Neither of those characteristics necessarily means something is inherently wrong with the code. And guess what? Most code is very poorly documented.
I’m pretty sure 95% of the folks on here don’t know what the hell they are talking about and are looking for reasons to deny the scale of the COVID-19 threat.
Pretty normal for academia. I used to work on "state-private" projects – I was on the private side – and every time I raised an issue, their reaction was: "How dare you say a PROFESSOR is wrong? Do you realize how high he ranks academically?"
No wonder they produce crap.
I’ve been a professional software engineer for 24 years. This advertises itself as a code review but there is no evidence that the author actually reviewed the code or is capable of doing so. It consists in large part of inaccurate spin on github comments.
This isn’t to say the code isn’t crappy or that the development processes don’t leave much to be desired but that is par for the course in both public and private sector for a codebase of this age. I see no evidence that the code isn’t, within known limitations, fit for purpose. We go from one bug, quickly resolved, to the claim that the codebase is “unsalvageable” to the opinion that “all academic epidemiology be defunded”. Never mind that the author derives all his criticism not from his own work but from work done by … academics.
Frankly, I doubt the author has written a line of code in the recent past or has the expertise to properly evaluate the imperial codebase. The reasons given for remaining anonymous make no sense. I’m a “programmer that I know and trust” and I think the author is anonymous because he is trying to con you.
The most sensible comment I’ve read here. I’m sure the Imperial team will happily accept any help from volunteers in improving the model. The general principles upon which it is made are sound. Any other approach would be subject of much debate.
The comment about hotels being excluded was based on my own quick review of the code; however, the problems found by others during the closed period are sufficient to render the model non-replicable.
If you’d like more bugs then by all means read the followup part 2 analysis here:
https://dailysceptic.org/second-analysis-of-fergusons-model/
If you disagree that any of them are actually bugs, post why in the comments.
The code is “filled with bugs”, “deeply riddled with similar bugs”, “unsalvageable”. But the only thing you can come up with isn’t even a bug! It’s just a bit of poor documentation.
But worse than that you completely misrepresent the nature of the bug discovered by the red team at Edinburgh. It didn’t actually demonstrate non-determinism and was irrelevant to the runs which produced the actual predictions.
Your second article is even more full of BS. Changing the RNG seeds leads to different results! If you run the program with different input you get different output! One of the programmers can’t spell household!
One day you’ll look back on this global pandemic knowing that your contribution was to try to hoodwink people and to smear the scientists doing their best to understand the epidemiology. That makes me feel a bit sorry for you. But not much.
It’s clear from the article that the author was simply providing examples. They even say “I’ll illustrate with a few bugs.”
Of course, you’re able to review the code yourself and to (in)validate the author’s statements against it. Along with that, I’d invite you to support your own claims with reasoning and/or evidence.
The author gave examples of bugs that are easily demonstrated to have no effect on the result. Yet by posting on this site, the idea was clearly to undermine the results obtained – it is after all ‘Lockdown Sceptics’. IF the bugs don’t actually affect the result, then all that is left is a criticism of programming style.
Fine, we can all criticize style, especially when it is an older style. After all, I'm truly shocked that Turing developed a machine that was conceptually single-threaded AND used a strip of conceptual tape. What was he thinking? All those theorems he produced MUST be wrong.
That, I'm afraid, is about the level of it. It does not behoove Andrew Melmoth to review the code. It does behoove the author to find a real criticism before she starts whipping the politically naive into a frenzy.
Thank you thank you thank you
Hey, could you please elaborate on how this code/model relates to the lunatic curve presented by Imperial College London? How influential was the ICL curve for decision-making in the UK and Europe? Is there any public info/link proving that the ICL curve is based on the code you criticize? Thanks.
Superb
The model will be nonlinear and therefore not numerically stable, hence even tiny differences (as with the supposedly identical data in a different format) can result in potentially very different outputs. This sensitivity is a feature of such models and one of the reasons you have to do multiple simulations and understand the range of behaviour. You need to understand mathematical modelling, numerical analysis and software engineering to judge whether these are genuine issues or not. The hotel bug is the only one that sounds like a genuine issue with the model to me, but all it does is add to the model assumptions that hotels are not considered – and hotels are likely to increase the rate of infection anyway. Find an expert in scientific software to examine the code, and then publish that.
You don’t seem to understand the issue. Putting the same initial parameters in and controlling the random seeds should result in the same outputs each time.
Although I’m certainly sceptical about the reliability of this model’s implementation, it is not necessarily true that “Putting the same initial parameters in and controlling the random seeds should result in the same outputs each time”.
Chaos theory tells us that deterministic models can produce different outcomes from minutely different initial conditions. Okay, “but you set the seed so it should be exactly the same”. Well, no: errors in floating point arithmetic could compound and provide the minutely different conditions necessary for divergence under a chaotic system.
Again, I’m not defending this model’s implementation or its reliability. I’m only seeking to critique the argument that “Putting the same initial parameters in and controlling the random seeds should result in the same outputs each time”.
Academics have their place. But they do not live in the real world. The models are fundamentally flawed from the outset because they assume constant S (susceptibility). All disease is host response, i.e. immune competence and co-morbidities.
I am not sure the models can predict hospitalizations or fatalities. Only R0 or rate of transmission. Transmission is not virulence.
I wish I understood the differential equations better. I just know there are too many serious assumptions that are fatally flawed. In the end you pick the low, medium or serious outcome (scenario). That is a subjective and political calculation.
You’re correct to question modelling assumptions. However, I would also point you to the words of the renowned statistician George Box:
“All models are wrong, but some are useful”
15k lines is a tiny program. It isn’t large enough to be doing anything significant. In short, it is a defective toy.
I’d hate to hear your opinions on quicksort
Why does the model have to be deterministic and free of bugs if variance among predictions is much less than the error in the world it is simulating?
While the article expands a lot on the naming of the variables, the single-file code, the mystique of the equations, and the non-reproducibility of the simulation in a parallelized context, it sounds like someone is trying to read too much into this type of simulation tool.
And somewhere within the anonymous article, we skipped the link between the undesired contribution of the imperfect implementation to the variability of the results, and the validity or invalidity of the conclusions drawn from the usage of the imperfect tool.
This is a simulation tool that expresses the dangers of the epidemic and illustrates the characteristics of exponential phenomena; it is not a crystal ball from which you will read the number of deaths, nor even their magnitude. In an exponential process, a few days more or a few days less is already an order of magnitude!
And there is no implementation spec anyhow for a phenomenological model. The infrastructure for non-regression testing is not there. Shocking! The phenomenological model evolves because the specialist decided so. A new implementation therefore starts to produce new numbers. Is it a regression or a correction?
The only value of these tools is to illustrate all the aspects of the dynamics of the problem, in a model containing all the refinements that an epidemiologist can think of, for better and for worse, and to illustrate the peculiarities of the phenomenon to the politicians who take the decisions:
=> that when you delay the public measures by three days, you double the load on your health system,
=> that when you implement social distancing such that the contagion ratio is reduced enough, then eventually the problem stops by itself.
When this process is understood, well, the character font used for coding may be obsolete, and the random generator seeding and coherency across parallel threads may itself be random; does it really matter, in the end?
Seems to me like going to seek the advice of an absent-minded professor who works in a shambolic office, yellowing papers everywhere, mouldy bits of gunk in an overflowing waste bin. Does that affect the validity of his advice? Not necessarily – but it doesn’t fill you with confidence and might lead you to go and talk instead to someone who works in a more hygienic and less shambolic office. Or it might even lead you to wonder why you are seeking the advice of someone whose previous offices have had to be fumigated.
As you say, @Musica 2014, ‘does it really matter?’ As @Caswell Bligh has said elsewhere, a model is just the rendering of assumptions: a tool that expresses the parameters and relationships that you have formulated. The ‘rendering’ bit is and isn’t important. It is important in the sense that the code must faithfully turn those assumptions and relationships into model outputs. To put it loosely, ‘the numbers should add up’. Obviously. But is that where the focus should be? Obviously not.
As a PhD with publications employing dynamic stochastic optimal control as well as reams of path dependent dynamic numerical simulations, I can honestly say that this is shocking. Shocking. Shocking for several reasons
1. Where is the model written down that the computer programs are meant to codify? Research discipline dictates that there must be a paper that sets out the logical structure of any model that is being investigated
2. Where is the verification that the programs replicate a model with a closed form solution?
3. I can tell you that these epidemiological models are actually quite simple within the class of stochastic dynamic models that can be considered. There are maybe one or two vectors of key starting point state variables, a transition probability matrix with small dimensions and a parameter list that can be easily calibrated from basic data. A corollary to my assertion that these are simple models is that there is no reason the code should run to more than 300 lines in a simulation package such as Matlab, if that!
4. Items 1,2 and 3 imply that there should be NO circumstance where the same starting point gives a different result. It is customary to set the random number seed as a fixed parameter in the control program. Once this is specified the same random numbers will be drawn every single time the program is executed. There may be billions of random draws in a big simulation, but given the seed they will be exactly the same no matter what the program does.
5. If the programs do indeed provide different results depending on the time of day or the computer being used or some rubbish explanation then they have been written by rank amateurs and should not be relied upon for anything. The program has failed its basic role, that is, quantifying a theoretical model. Computers do not do anything other than compute so the same inputs must produce the same output.
6. That the output might differ by significant quantities of deaths from run to run suggests there is a convergence issue in the model (that hasn’t been articulated). This means that the programs are computationally inefficient. There are tricks that can be used to speed up the solution.
7. It was once said that the most dangerous research is carried out by a monkey with a statistics package. This seems to be worse. I would hire some professionals to start over
You’re making the mistake of taking what the author says at face value. The program didn’t produce different outputs from the same inputs. It produced different outputs when run in two different modes.
The papers were published in Nature in 2005 and 2006.
As Andrew says below, this critique is bull. With your background, you should know to make an effort to do the research on it before making comments like this.
Read some of the comments by those who have looked at it here as well. Then go and actually read the github comments that she points to – but actually follow them through. Read the lot. You will see that this is all spin
Good work.
I agree with many of the sentiments expressed in this blog.
However, the suggestion that “all academic epidemiology be defunded”, is an extreme conclusion to arrive at from a code review. I’d be interested as to whether any other readers feel this position could be justified from the arguments presented in this article alone.
It was also sad to see the article begin with a fallacious appeal to authority:
“I worked at Google between 2006 and 2014, where I was a senior software engineer working on Maps, Gmail and account security. I spent the last five years at a US/UK firm where I designed the company’s database product, amongst other jobs and projects. I was also an independent consultant for a couple of years. “,
only to end with a critique of appeals to authority:
“This situation has come about due to rampant credentialism and I’m tired of it”.
Finally, I’m sure we all agree that policy advised by simulation should rely on robust, reliable and well-tested code. However, even though this code has been shown to display none of these qualities, it does not necessarily imply that its conclusions are “wrong”.
Thanks so much, Sue. I used this as my main source in a far more venomous write-up, linked to below, which I can get away with because I don’t expect anybody to quote me as an authority so I don’t need to pretend to be impartial or neutral in any way. I strongly suspect anybody who liked this post will enjoy mine as well:
https://medium.com/@allenfarrington/simple-truths-and-complex-nonsense-2e1c28ae6f29
I enjoyed it a lot, except for the bit at the end where you said you were grateful to Scottish politicians. That they are saving lives..? Did I understand that correctly? Were you being ironic and I didn’t get it..?
After a thoroughly enjoyable, vituperative piece, it left a bit of a saccharine aftertaste.
I wouldn’t read too much into that bit, I just found it funny, mainly because Sturgeon has indeed been trolling Johnson nonstop. The politicians aren’t “saving lives” so much as encouraging the government as a whole to act less like totalitarian thugs abusing power in the lockdown. I am very grateful for that because (as this website is testament to) England seems godawful from afar. Glad you enjoyed the rest
“Credentialism”? Dunning-Kruger much? This is just another “Why don’t they just…” letter to a tabloid, by someone who thinks 2y at Google 15y ago backs up their ramblings. FWIW _I’ve_ worked at a senior engineering level for a well-known US-based global megabank, and I think this stuff’s bollocks.
This is missing a key source: please can you share where the code that was released by Imperial has been published, so that we can see it for ourselves?
“sue denim” reveals a questionable agenda in her penultimate paragraph .. “defund all academic epidemiology”. This begs the question: is he/she a coder, a corporate biologist or simply some anti-lockdown activist.
The latter… She is probably a programmer as well, but she is absolutely refusing to engage with anyone who shows her comments to be just so much fluff.
The right thing to do is to provide the model (not the code, the underlying model) to some good programmers and have them implement it. Then the results can be checked. That would also allow the model to be argued over by people who don’t understand the code.
The model has always been available in a pair of papers from Nature in 2005 and 2006 (if I remember correctly)
MAJOR BUG
The pandemic modelling code has very poor file error checking.
The file-read function fread_big is called 10 times in SetupModel.cpp without any error checking.
Example, line 46: fread_big(&(P.BinFileLen), sizeof(unsigned int), 1, dat);
The return value is never checked, so the call could fail and the program would continue after reading none or only part of the file.
Reading only part of the data file would affect any use of the data.
The in-house function fread_big is in binio.cpp. It does not check ferror or feof, so none or only part of the file could be read.
https://www.youtube.com/watch?v=GVpCWSx8a6I&feature=em-lbcastemail
Interesting take
Am now waiting for the climate change deniers/ Brexiters/ Y2k deniers to pile in …
Are you kidding me? The thread is full of them… particularly the ones who come in and downvote anything that challenges the author’s continual exaggeration about the fairly harmless bug.
The more that modellers respond failing to understand the importance of repeatability given the same seed, the less confidence I have in modellers. Having had my fixes to other people’s DEs published, having fixed iterative models, and having been a software engineer, it is obvious that people who do not understand things shouldn’t be let loose with them.

There is no random number generator in a typical PC; there is a PRBS generator. This was invented, along with game theory, stochastic models, quantum mechanics and indeed the CPU architecture we use, by John von Neumann.

A program, however it is written (unreadable usually indicates lack of understanding by the author; seemingly simple is evidence of a master), if given a fixed set of inputs and not dependent on the state of any external variable, should give the same results. If not, it has horrendous bugs. It’s a matter of correctness of the model. You cannot assume that averaging multiple runs will correct rather than compound these errors.
The model is either correct or it isn’t. I think the model needs to be corrected and all work based on it republished. There is a chance that the same recommendations will arise but you cannot assume that. Fix it and rerun all analyses as a matter of urgency. Especially now that horrendous bugs are apparent, all of this has to be public for confidence building.
You could probably estimate the results based on aggregating a series of gompertz functions.
Actually, modelers are responding saying it WAS repeatable at the time it was run. The bug came later and is only associated with restart states using the random number splitter. This was not used by the imperial team as far as the documentation shows in the github discussions.
Secondly, the new answers seem to give the same results as the old answers….
The more comments I read like this, the less confidence I have that people make an effort to do their research before jumping to conclusions that are dangerous or that fit some political preconceptions
This article misses the point of implementing and running these kinds of models. It’s not about accurately trying to predict the future!! Things are far too complex and chaotic.
You don’t need a computer program to tell you that a lockdown was required.
It’s highly infectious (~10x more than ’flu), has a long incubation period, is asymptomatic in the young and kills a decent percentage of the old and middle-aged. It’s a perfect storm and would overwhelm our healthcare system, leading to a far higher percentage of deaths among the middle-aged. Given enough time without major social distancing and lockdown, we would all get it. This was about saving the people who have chronic health conditions but could still live for another 30 years if they can be supported while their bodies fight the virus.
If we hadn’t closed the schools then this would have infected everyone pretty quickly. This sort of model allows you to probe what effect something like keeping the kids at home does to the spread.
I was linked to this forum from Spectator.
While I am not a great programmer (albeit one who started with the PDP11), I spent many years working in network security.
What puzzles me, from a programming perspective, is the total lack of interest by anyone in government and the civil service, including Security, in evaluating Prof Ferguson’s work.
Even hardware installed below Top Secret level was, in my day, assessed by the relevant departments.
We might have had issues with processes like Common Criteria etc., but at least there was some attempt to analyse the code.
Here we have code which is considered poorly written even by the people providing explanations as to why it is so, being used to justify measures costing the country hundreds of billions economically, without anyone even checking it properly.
We have people who claim to have written over 2,000 lines of serious code in 24 hours saying that evaluating it would take too long, when the whole programme is 15,000 lines long.
However, my main concern is not so much the code but the modelling and the conclusions derived from it.
Many people mention previous attempts by Prof Ferguson to model epidemics, with the actual outcomes being completely different from the model predictions.
What about the Swedish university using Prof Ferguson’s model to predict 40k deaths by the 1st of May (actual outcome: 2,700)?
How can this guy be considered credible by anyone with an ounce of common sense?
To all those people who claim we are heading for disaster unless this lockdown is maintained:
Have you seen the actual death distribution by age against the population distribution on the ONS website (or the same data for Sweden)?
Basically, only a few hundred people below the age of 45 have died WITH corona.
Even including people below the age of 65, we are talking about a few thousand deaths.
How can anyone in his right mind describe this as a pandemic?
As you guessed from my “PDP11” bit, I am of retirement age.
I have no children but still think that destroying the future of young people to save the lives of old people, many of them obese, diabetic and with existing medical conditions who would have died soon anyway, is total madness.
I am not usually given to conspiracy theories, but I just hope for the sake of my sanity that there is something more to it than the off-the-record response:
“It had to be done, because if Prof Ferguson was even partially right in his predictions, the Tory party would be finished for ever.”
So, 200 billion (and counting) bet on a bit of software with a very dubious accuracy record.
Well done, drinks all round (especially in Beijing)…
I agree that more checks and balances need to be put in place in academia, but this implies MORE funding, not less. As it stands, academics are so poorly funded that they are expected to be all things to everybody: programmers, managers, secretaries, grant writers, etc., all the while retaining their integrity as scientists. Of course some of those roles are going to suffer; to expect an epidemiology professor to also be a perfect software engineer is tantamount to expecting you, a software engineer, to perfectly understand the science underlying the code. No one person can do everything, and I absolutely agree that we should strive to get software engineers and scientists working together to solve these problems.

But we should also strive to eliminate bias while doing so, and your proposal to defund academia will not achieve this. Insurance companies, by their very nature, are heavily biased by the profit motive. Insurance premiums effectively amount to a tax on society: the difference between the premiums and the true level of incidents ultimately becomes revenue, which allows them to pay employees (software engineers included) but also to pay profit to shareholders. If you defund academic epidemiology, who will audit the insurance companies to check they aren’t consorting to misrepresent the truth?

Yes, academia also acts like a tax on society, but one that comes with checks and balances to minimise bias and that ultimately aims to find the truth. In cases where the scientific process is shown to be imperfect, as in this one, the solution should be to improve the process rather than to throw the baby out with the bathwater. We should pay for publicly funded support staff (software engineers, managers) who can help those who know the most about the science (epidemiology professors) get on with what they do best: science.
@Cooper Smout: Well, who is auditing the academics??? No one, obviously, so your argument doesn’t hold up.
ANY institution that provides information and opinions to make public policy needs to be auditable. I don’t care if that’s the academy, the private sector, or magicians. No hiding behind the curtain.
I think that from now on, academics who publish research on the basis of modelling are going to have to show that the model works. I assume that there are standard benchmark tests to which a model can be subjected to show that it is basically competent? I have seen nothing in the discussion here about any such standard testing, but I assume that it exists…?
So … I’ve been writing trading / algo trading systems in investment banks for over 30 years. Here’s my take on the problem.
In the real world things happen in spacetime. That is to say, a virus has a lifecycle (i.e. created, spawning, death); the virus interacts with objects and with people inside different contexts, and it can only affect human health (i.e. a human lifecycle of healthy, mild illness, severe illness, death).
You can model a virus lifecycle by breaking the model down into contexts, eg a home, hospital, care home, car, train that contains a number of people and objects with different effects on the virus lifecycle.
You can have different modelling scales. So you would have three scales of days, hours and minutes.
You can get performance improvements by running contexts once with the same inputs. For example a person in a home with a specific set of values needs only be run once for all homes. Once the output had been gained it would not need to be run again. You can also run contexts in parallel for each timescale.
Every iteration would update the contexts, people, things and virus. Only contexts with living viruses would need to be run. For example in iteration 1 then 2 homes might be added to the run list and 1m homes not calculated. Contexts with the same set of characteristics would have a count on them. For example 100,000 homes with 2 children, 1 adult, mild-illness.
Deterministic, clear software.
I think you’ve misunderstood the IC team’s responses where they excuse the non-determinism bugs as non-issues.
Stochastic models are intended to be run many, many times and the results aggregated. So long as any non-determinism introduced due to platform, compiler, multi-threading, or pulling additional values from the RNG before beginning the simulation (as was the case for Issue 116) is itself sufficiently random, i.e. it doesn’t skew the results in one particular direction, then it’s more or less meaningless. Over many thousands of runs, the contours of the aggregated data should be the same.
You have no idea what you’re talking about. The program can be stochastic, yet deterministic – the pseudo random number generator can be/is seeded with the same number at the start of the runs in order to repeat the pseudo random sequence on demand.
But an uninitialised variable can be holding *any* value. So on the first run you might get 100 billion people dying. On the second 3000 billion. And on the third -3.06. Is that OK? Or maybe the uninitialised variable always gives you between 999 and 1000 billion dead.
How many runs do you think you’d have to do to average away the error? And how would you know you’d ‘averaged it away’? That isn’t how averaging and randomness works. Computer memory isn’t filled with ‘noise’ that results in values that average to zero – that would be an amazingly naive view. If it’s uninitialised you have no idea whether it’s random or biased or what it is.
But IC explicitly say that the issue in question causes variations within the normal range, and not differences of a factor of 30.
Prove it. Every time. On every computer. You’re proposing to observe a computer as though it is part of the natural world. It isn’t and you don’t have to. You merely have to initialise the variable! If you don’t, the result is anyone’s guess. For sure, depending where it is in the model, it may just result in a tiny error even at the extremes of its range. If you can work that out and demonstrate it then that’s good. But it would have been a lot easier not to have made the error. If there’s one such error, there are probably others.
It’s like a publisher saying to an author: “We use a printing company that’s very good value, but their printing press occasionally makes small errors. On average it produces about ten random characters in a book. Most of the time, people don’t even spot them. So is that OK with you?”
Sure, most of the time. Until the day it creates a libellous sentence or whatever.
Anyone who has done any real research into this pandemic, and particularly listened to those who have worked on the frontline…
Virologists, and doctors with A&E experience, with infectious diseases, and/or a great deal of surgical experience (not those who last worked on the frontline 30 years previously), will know:
A. This pandemic has been planned
B. It has been supported by biased, bought and paid for media.
C. That it is only a serious illness for those whose immune system is compromised,
D. That the end goal is totalitarian control of the whole planet.
Dr. Judy Mikovits, who worked on HIV and Ebola, and whose work kept Magic Johnson alive, is also a whistleblower, and was jailed for failing to hand over her research data.
Wuhan Covid-19? Was it a bio-weapon?
https://youtu.be/3bXWGxhd7ic
This clinical study seems to suggest it was…
https://files.catbox.moe/n36xny.pdf
http://Www.bitchute.com
Search for Dr. Judy.
Look and Learn…
I wrote up a blog post that puts the above into a larger philosophical context that should make the discovery more amenable to people with less technical experience. https://medium.com/@bblfish/open-source-and-covid-19-models-5e638f785514
I hate to roll a bomb into this discussion but I think the author’s conclusions are likely just as valid for the climate models.
The author’s conclusions are wrong, and also don’t apply to climate models. The one I work with has a team of more than 10 software engineers testing, validating and code reviewing it.
I understand you’re an experienced software engineer, and I know that scientists write terrible code and their ignorance of engineering practices is often frustrating. But the thing is, working on “Maps, Gmail and account security” and “the company’s database product” doesn’t give you any experience working on massive numerical simulations. This post, to those who understand such things, is just a long confession of ignorance.
There may be bugs in the code, but non-determinism isn’t a bug. Having replicability given a pseudo-random seed is very helpful for automated regression tests and for debugging, but is not what we mean by “scientific” replicability. All scientific experiments in the real world are non-replicable; you will never get the same results if you run them again. Scientific claims are statistical, based not on experiments being deterministic, but on measuring the variability and computing your confidence that the outcome is within some distance of the “true” average result.
Neither is there any problem with writing algorithms that use their own outputs as inputs, though I understand why you would think so. Most Western culture has been based on “foundationalist” epistemology, which claims that you must begin with some assumptions, intuitions, or divinely revealed truths before you can proceed to deduce things. This epistemological attitude is beaten into us by our religions, art, culture, and literature, and most Westerners imagine that you need to start with axioms or inputs, as in geometry. But empirical studies have always rivalled foundationalism, and we’ve proven conclusively in just the past 50 or so years that foundationalism just isn’t true. Revising values incrementally works great nearly all the time if you know what you’re doing and you choose an appropriate step size. I’ve used it myself many times, and validated the results as correct against empirical data. Look up “Gibbs sampling” on Wikipedia for probably the simplest (though still not simple) explanation of how and why it works.
(Those of you who’ve studied philosophy might object that what I’m saying implies that all Western philosophy back to and including Plato must now be thrown out with the trash. Well, not entirely, but–yes, mostly. Certainly Plato.)
I agree there /is/ a big problem with “trying to calculate R0”, which has been responsible for most of the hysteria. As you said, R0 is not a property of a virus. There are 2 problems with its use on covid-19: First, it varies wildly from place to place with this virus. Second–and this applies to all diseases–people calculate R0 by studying outbreaks, and then apply that value to the virus in other locations. But outbreaks have already been selected for having extremely high R0. Trying to estimate R for the virus in a randomly-chosen location by looking at estimates from outbreaks is exactly like estimating your chance of winning the lottery by interviewing lottery winners.
I’m sure the climate change models are as fully reliable as this one. The fundamental problem here is not that the model is crap from a programming-reliability perspective, but that it’s impossible to create a meaningful model for something with so many variables and such a vast amount of data. This is exactly the problem with climate change. We don’t have computers anywhere near capable of processing the information involved and producing a reasonable result. But, because of human hubris and the desire to make money off of giving answers, answers are given. It’s basically snake oil writ large.

Does this mean modelling doesn’t work? No, of course not: machine learning is excellent at solving relatively simple problems such as pattern recognition (though it has only recently taken huge strides). Someday we may have supercomputers that can handle the amount of data needed to model something like this (although acquiring the needed data may then be the problem), but we are nowhere near that today. Human reasoning and a phased, iterative approach to responding to a situation are actually a better way to avoid catastrophes like the one we’ve just visited on ourselves unnecessarily. What we really need is a way to analyze the feasibility of using models in a situation, based on compute power and variables, so we don’t use models where we shouldn’t.
The state of the code is one thing. The fact that it did not get a proper peer review another. But the real thing to consider is how much more could be done by building an open source platform to run such models, that would make it usable internationally, be able to build in local particularities that we have found to make such a difference in how the virus spread, and so have a lot of engineering talent able to participate in its development.
I develop this here a little more at length https://medium.com/@bblfish/i-agree-that-since-a-pandemic-in-this-case-covid-19-is-not-only-a-biological-phenomenon-233ddca3352a
‘Sue Denim’
C++ 30 years also.
Have you ever thought about Mensa?
https://www.youtube.com/watch?v=UvLQMMaVmzU
This review is by a software engineer who very clearly doesn’t understand large-scale stochastic models. They are not the same profession. The “review” is biased and misleading, but then you knew that because of the site name. The Monte Carlo approach of the model version referenced, and the confidence intervals provided, reflect the model uncertainty. It is common in large-scale simulation models to have one file representing the model’s equations; it’s not the email security software the author worked on. The equations don’t care what language they are written in: after the code is compiled, all code runs the same way.
I wonder if the compiler caused any further issues?
You can set compiler flags for various things, and I wouldn’t be surprised if some flags turn out to be critical in modelling situations, at the expense of speed.
This is both fascinating and appalling. I also have run models using random number generators, and am very familiar with the concept of using the same seed to produce the same results. Initially I skim read the article and couldn’t understand how even on a single core it gave different results with the same seed. I came to the conclusion that there must have been a separate stream of random numbers elsewhere in the program that wasn’t seeded (and presumably would have taken the seed from the clock time, which is the normal thing). Then I re-read and found that is exactly what you described!
One point I would make: I agree that a bug would not necessarily produce random errors, but might instead introduce systematic errors that would not average out over multiple runs. This example, however, with the accidental re-seeding, would of course introduce only random differences, which would indeed average out over multiple runs.
I guess the real issue is how far out the predictions are – and how much of the difference is due to programming errors, and how much to correct programming built on incorrect underlying assumptions. It is sobering to note that the same program predicted 96,000 deaths in Sweden under its current policy, when the actual total is 4,700. But if they fixed all the bugs in the code so that it was absolutely reproducible, would it still predict a huge number of deaths? I have a feeling it probably would, and that the reason is that the assumptions behind the model may be wrong, however correctly the model is programmed. Either way, they clearly got it vastly wrong in Sweden; scaled up to our population (roughly six and a half times Sweden’s), that error is comparable to the 500,000 UK deaths predicted with no lockdown.
Thank you for a very perceptive and useful analysis. For what it is worth, a few additional thoughts.

I am completely astounded at the lack of understanding of the acceptable use of models by persons in positions of authority. I hesitate to call them leaders. Equally astounding is how poor at modeling the entire field of epidemiology appears to be. Are our education systems really so poor that almost no-one appears to understand that a model is at best a hypothesis, and more often pure speculation, until it has been evaluated and tested using rigorous and valid logic? A major aspect of this testing is openness and public scrutiny.

So far as I can tell, none of the models have been purpose-built. Many appear to be modifications of standard textbook epidemiology. Most also seem to describe the dynamics of the virus–human system very inadequately.

What politicians and their advisers seem not to realize is that when a scientist uses the output of an untested, unvalidated model to make predictions, especially if that model has not been exposed to public scrutiny, he or she is doing nothing more than expressing an opinion. It is no more a scientific opinion than if the local parish priest expressed his thoughts on the same matter. Educated guesses, even when you use a sophisticated calculator, are not science. When politicians then use these opinions as a basis for decision-making, they are basing those decisions on logical fallacies – in this case, appeals to authority. The claim that the rationale for lockdowns is based on science, made by everyone from the WHO to the leadership of most countries, is therefore complete rubbish.
I’m a computational biologist, not an epidemiologist. I work in biotech now, but have done time in the academy. Prof. Ferguson has my sympathy, because very likely he has spent thousands of hours writing grants begging for funds to support robust code – a savage struggle for resources that the computational soldier loses every time, flashy robots and mass spectrometers always winning out over the added programmer FTE. I would be surprised if there were even one dedicated developer on that team. Hard to hold an agile scrum in an empty cubicle.
Maybe, maybe, but then the worthy professor should stop pretending that his house is in order.
“On a personal level, I’d go further and suggest that all academic epidemiology be defunded.”
Interesting. I came to the same conclusion (Peerless Reads, YouTube) based on the following:
Because of one type of model in one small corner of epidemiology, all epidemiology is worthless? Clearly you’ve not done half the research you think you’ve done.
I joined to respond to this statement because the rest of the analysis is very good.
To arrive at that conclusion from one model is extreme, and makes precisely the same mistake as SAGE, namely relying on a single model. The evidence in question is that this one example of an epidemiological model exhibits very poor coding standards; it is very wrong to extrapolate from that to all epidemiological models.
There are lots of very talented and experienced computer scientists and developers from commerce and industry who are now teaming up with epidemiologists and geneticists to exploit the vast computing power of relatively new, massively parallel computational architectures. To throw them all under the bus along with Prof Ferguson’s team is a terrible conclusion to an otherwise very good analysis.
So… based on a garbage model from a guy with a history of garbage models, a lockdown “protocol” developed by a teenager and adopted over the objections of experienced epidemiologists, and the rejection of HCQ treatment for early COVID symptoms because “OrangeManBad”, we have devastated tens of millions of lives and possibly killed tens of thousands of people.
We might just as well have employed spell casting and Voodoo dolls.
Never again.
All too late. We are where we are, and all that needs to happen is a change of Tory Govt leader, for him to DECLARE in no uncertain terms that the fake Covid pandemic scamdemic is well and truly over and the UK is now returning to the OLD NORMAL, and for ALL the Covid-19-secure plastic paraphernalia to be BINNED forever! Free at last, free at last! UK plc back in business, everyone back to OUR NORMAL LIVES! Ah, I can but dream!
Well-researched essay – thanks. Time for the Boris Govt to be held accountable for betraying and failing the country.
Thank you….unforgivable….pride….academic pride affecting us all
“This bit of code highlights an issue Caswell Bligh has discussed in your site’s comments: R0 isn’t a real characteristic of the virus. R0 is both an input to and an output of these models, and is routinely adjusted for different environments and situations. Models that consume their own outputs as inputs are a problem well known to the private sector – it can lead to rapid divergence and incorrect prediction. There’s a discussion of this problem in section 2.2 of the Google paper, “Machine Learning: The High-Interest Credit Card of Technical Debt”.”
Can somebody explain what they mean here? The Google paper is talking about real-life feedback loops for machine learning (something regularly taught at any university, I might add). This would only apply to this model if it didn’t correct for it (which it does, by the addition of new features for the new control measures, wise or not). If the input R0 values are priors, this isn’t even relevant (as long as sensitivity analysis is given). Have I misunderstood?
I only just found L/S from reading https://www.thetruthseeker.co.uk/?p=220897
The Future Shape of Things by Sebastian Friebel
I think the World is Fast Changing for the Better .
( Sorry about this Formatting , don’t know why … )
I don’t really understand Software, but I do know it is sometimes a real pain. When I switch on my ‘Smart Phone’ (only ever used for its Camera!) it takes forever to play annoying sounds and be ready to use…
Denim’s Article here displays expertise, not too ‘Heavy’, and gives simple Folk like me a better understanding of why idiots (?) such as Ferguson should spend time behind Bars…
(And I don’t mean pulling Pints!)
A Great Article, much obliged Denim!
It’s not good, of course, but none of these errors makes a policy difference.
Thanks Sue – geeks like this just tax my brain (not a coder, but I installed networks). I could only read halfway through it and wanted to hurt someone, lol. Frankly, these types should just stick to writing code for online sports games…
The Government have known that Imperial College modelling has been an utter scam since its inception in 2008. Every single model it has produced has been deliberately wrong. No MP has asked searching questions, and they must know the history of Ferguson’s modelling. I do not trust any current MP. Steve Baker should have done his research and called this scam out at the beginning. Why didn’t he? Rees-Mogg is an utter embarrassment to this country too. He knew what was going on and, as a hard-core Brexiteer, would understand what we all did. The bankers/globalists needed to stop Brexit and Trump in order to bring in the Great Reset.
If these politicians were intelligent or patriotic they would know what treachery is, and they would call it out. It seems to me they are no more than left-wing globalist shills waiting for crumbs from the Piper.
“written in C++”
Ah, I think I see the problem …