There follows a guest post by Daily Sceptic contributing editor Mike Hearn about the ongoing problem of apparently respectable scientific journals publishing computer-generated ‘research’ papers that are complete gibberish.
The publisher Springer Nature has released an “expression of concern” for more than four hundred papers they published in the Arabian Journal of Geosciences. All these papers supposedly passed through both peer review and editorial control, yet no expertise in geoscience is required to notice the problem:

The paper can’t decide if it’s about organic pollutants or the beauty of Latin dancing, and switches instantly from one to the other halfway through the abstract.
The publisher claims this went through about two months of review, during which time the editors proved their value by assigning it helpful keywords:

If you’re intrigued by this fusion of environmental science and fun hobbies, you’ll be overjoyed to learn that the full article will only cost you about £30. There are many more available if that one doesn’t take your fancy, e.g.:
- Distribution of earthquake activity in mountain area based on embedded system and physical fitness detection of basketball
- Detection of rare earth elements in groundwater based on SAR imaging algorithm and fatigue intervention of dance training
- Detection of PM2.5 in mountain air based on fuzzy multi-attribute and construction of folk sports activities
- Characteristics of heavy metal pollutants in groundwater based on fuzzy decision making and the effect of aerobic exercise on teenagers
Background
Peer-reviewed science is the type of evidence policymakers respect most. Nonetheless, a frequent topic on this site is scientific reports containing errors so basic that any layman can spot them immediately, leading to the question of whether anyone actually read the papers before publication. An example is the recent article by Imperial College London, published in Nature Scientific Reports, in which the first sentence was a factually false claim about public statistics.
Evidence is now accruing that it’s indeed possible for “peer reviewed” scientific papers to be published which have not only never been reviewed by anybody at all, but might not have even been written by anybody, and that these papers can be published by well known firms like Springer Nature and Elsevier. In August we wrote about the phenomenon of nonsensical “tortured phrases” that indicate the usage of thesaurus-driven paper rewriting programs, probably the work of professional science forging operations called “paper mills”. Thousands of papers have been spotted using this technique; the true extent of the problem is unknown. In July, I reported on the prevalence of Photoshopped images and Chinese paper-forging efforts in the medical literature. Papers are often found that are entirely unintelligible, for example this paper, or this one whose abstract ends by saying, “Clean the information for the preparation set for finding valuable highlights to speak to the information by relying upon the objective of the undertaking.” – a random stream of words that means nothing.
Where does this kind of text come from?
The most plausible explanation is that these papers are being auto-generated using something called a context-free grammar. The goal is probably to create the appearance of interest in the authors they cite. In academia, promotions are linked to publications and citations, creating a financial incentive to engage in this sort of metric gaming. The signs are all there: inexplicable topic switches halfway through sentences or paragraphs, rampant grammatical errors, the repetitive title structure, citations of real papers and so on. Another sign is the explanation the journal supplied for how it occurred: the editor claims that his email address was hacked.
In this case, something probably went wrong during the production process that caused different databases of pre-canned phrases to be mixed together incorrectly. The people generating these papers are doing it on an industrial scale, so they didn’t notice because they don’t bother reading their own output. The buyers didn’t notice – perhaps they can’t actually read English, or don’t exist. Then the journal didn’t notice because, apparently, it’s enough for just one person to get “hacked” for the journal to publish entire editions filled with nonsense. And finally none of the journal’s readers noticed either, leading to the suspicion that maybe there aren’t any.
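To make the idea concrete, here is a minimal Python sketch of how a context-free grammar can stamp out titles with the repetitive structure seen in the list above, and how plugging in the wrong phrase bank yields an abrupt topic switch. The grammar rules, the phrase lists and the `expand` function are all invented for illustration; this is an assumption about the general technique, not the code of any real paper generator.

```python
import random

# Toy grammar: each non-terminal maps to a list of alternative expansions.
# Every rule and phrase here is hypothetical, made up for this example.
GRAMMAR = {
    "TITLE": [["MEASURE", " of ", "TARGET", " in ", "PLACE",
               " based on ", "METHOD", " and ", "EXTRA"]],
    "MEASURE": [["Detection"], ["Distribution"], ["Characteristics"]],
    "TARGET": [["heavy metal pollutants"], ["rare earth elements"]],
    "PLACE": [["groundwater"], ["mountain air"]],
    "METHOD": [["fuzzy decision making"], ["an embedded system"]],
    "EXTRA": [["soil sampling"], ["remote sensing"]],
}

# A phrase bank from an unrelated topic (also hypothetical).
SPORTS_PHRASES = [["dance training"], ["basketball fitness testing"]]


def expand(symbol, grammar, rng):
    """Recursively expand a symbol; anything not in the grammar is literal text."""
    if symbol not in grammar:
        return symbol
    production = rng.choice(grammar[symbol])
    return "".join(expand(part, grammar, rng) for part in production)


# Simulate the production mistake: the wrong phrase bank is wired into one slot.
mixed = dict(GRAMMAR)
mixed["EXTRA"] = SPORTS_PHRASES

rng = random.Random(0)  # fixed seed so runs are repeatable
print(expand("TITLE", GRAMMAR, rng))  # an on-topic title
print(expand("TITLE", mixed, rng))    # a title that ends in a sports phrase
```

Nothing in the expansion step checks whether the pieces make sense together, so a generator like this will happily emit a geoscience title that finishes with dance training unless a human reads the output.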
The volunteers spotting these papers are uncovering an entire science-laundering ecosystem, hiding in plain sight.
We know randomly generated papers can get published because it’s happened hundreds of times before. Perhaps the most famous example is SCIgen, “a program that generates random Computer Science research papers, including graphs, figures, and citations” using context-free grammars. It was created in 2005 by MIT grad students as a joke, with the aim to “maximize amusement, rather than coherence”. SCIgen papers are buzzword salads that might be convincing to someone unfamiliar with computer science, albeit only if they aren’t paying attention.
Despite this origin, in 2014 over 120 SCIgen papers were withdrawn by leading publishers like the IEEE after outsiders noticed them. In 2020 two professors of computer science observed that the problem was still occurring and wrote an automatic SCIgen detector. Although it’s only about 80% reliable, it nonetheless spotted hundreds more. Their detector is now being run across a subset of new publications and finds new papers on a regular basis.
Root cause analysis
On its face, this phenomenon is extraordinary. Why can’t journals stop themselves publishing machine-generated gibberish? It’s impossible to imagine any normal newspaper or magazine publishing thousands of pages of literally random text and then blaming IT problems for it, yet this is happening repeatedly in the world of academic publishing.
The surface level problem is that many scientific journals appear to be almost or entirely automated, including journals that have been around for decades. Once papers are submitted, the reviewing, editorial and publishing process is handled by computers. If the system stops working properly, editors can seem oblivious – they routinely discover they published nonsense only because people who don’t even subscribe to their journal complained about it.
Strong evidence for this comes from the “fixes” journals present when put under pressure. As an explanation for why the 436 “expressions of concern” wouldn’t be repeated, the publisher said:
The dedicated Research Integrity team at Springer Nature is constantly searching for any irregularities in the publication process, supported by a range of tools, including an in-house-developed detection tool.
The same firm also proudly trumpeted in a press release that:
Springer announces the release of SciDetect, a new software program that automatically checks for fake scientific papers. The open source software discovers text that has been generated with the SCIgen computer program and other fake-paper generators like Mathgen and Physgen. Springer uses the software in its production workflow to provide additional, fail-safe checking.
A different journal proposed an even more ridiculous solution: ban people from submitting papers from webmail accounts. The more obvious solution of paying people to read the articles before they get published is apparently unthinkable – the problem of fake auto-generated papers is so prevalent, and the scientific peer review process so useless, that they are resorting to these feeble attempts to automate the editing process.
Diving below the surface, the problem may be that journals face functional irrelevance in the era of search engines. Clearly nobody can be reading the Arabian Journal of Geosciences, including its own editors, yet according to an interesting essay by Prof Igor Pak “publisher’s contracts with [university] libraries require them to deliver a certain number of pages each year”. What’s in those pages? The editors don’t care because the libraries pay regardless. The librarians don’t care because the universities pay. The universities don’t care because the students and granting bodies pay. The students and granting bodies don’t care because the government pays. The government doesn’t care because the citizens pay, and the citizens DO care – when they find out about this stuff – but generally can’t do anything about it because they’re forced to pay through taxes, student loan laws and a (socially engineered) culture in which people are told they must have a degree or else they won’t be able to get a professional job.
This seems to be zombie-fying scientific publishing. Non-top tier journals live on as third party proof that some work was done, which in a centrally planned economy has value for justifying funding requests to committees. But in any sort of actual market-based economy many of them would have disappeared a long time ago.
AI checking what are obviously AI produced documents, and we wonder why the world is in panic. I’m sure Toby has been banging on about incompetence being at the root of this, seems to make the conspiracy theories look less and less plausible the more evidence that people like Mike (the author) uncover.
Jeez. Follow The Science. And to think of the number of papers I’ve seen over the last 18 months that challenged the dominant narrative but weren’t seen as authoritative because they hadn’t been peer reviewed. Clearly we have to find a way of separating science and the influence of big business aerobic folk teenage dance exercise.
Maybe, our guys should try playing the same game. I know they won’t because they value integrity but in reality this is war. The cost is the same; many lives and the destruction of our infrastructure and wealth.
You are drawing completely false conclusions, the problem is not peer review here, it is the LACK of peer review disguised as the process having occurred.
It cuts both ways. Clearly, these papers were not reviewed by an actual disinterested scientific peer. They may have been ‘reviewed’ by a peer that was actually a co-worker of whoever generated the fake paper.
At that point things get a lot murkier because it throws into light the question of what a “peer” actually is when talking about scientific publishing. That matters! Even when we may suspect that review of some sort was genuinely done – e.g. with papers coming from well known university epidemiology departments in the west – it shows that the process just isn’t taken seriously by anyone. How do these so-called “peers” get selected? The process is enigmatic. Why do they appear to so frequently sign off on papers containing false or nonsensical claims? Investigations are rarely done, and when they do occur the result is merely a notice at the top of a paper saying “don’t rely on it anymore”, which is the weakest response possible. No deeper post-mortem occurs.
Fundamentally, although individual scientists may take an individual peer review seriously, at an institutional level the process is not viewed as important. If it was, people would get paid to review these articles just like they do in every other type of publishing. If it was, horrific QA failures like those described in the articles would be investigated properly and there’d be some sort of outcomes – like firings or journal closures – beyond simply labelling obviously auto-generated articles as a matter of “concern”. We’re learning that the QA standards in science are absolute min-viable and often much lower, which is a huge problem.
On-topic: what you describe is similar to medics signing medical certificates of the cause of death. Only taken “seriously” insofar as “CYA” is serious.
“Peer review” has always been c*ck.
Think of it as akin to solicitors or surveyors or whatever scratching each other’s backs.
To form a view of how useful a journal article is, read it and do precisely that: form your own opinion. Never mind what institution the author is at, who funded him, or what score someone somewhere has assigned to the journal.
Most academic articles even if they’re not retracted are full of obscurantist lingo that makes you wonder how the author would get on if they had to have an ordinary conversation at a bus stop.
You want “centralisation”? Well look at the TOP journals in each field – journals such as “Nature” or “Inventiones Mathematicae”. Big decisions on what whole areas are considered suitable for research and what areas will never receive much funding are all made centrally. Every non-zombified person working in academia who has a bit of observational skill and a shred of decency is aware of this.
It’s a pyramid, and big business works hand in hand with the state. Mostly you can’t put a cigarette paper between them, to use Mussolini’s metaphor.
I think the root cause is empirically obvious!
A combination of greed and insurrection followed by subjugation of the people without civil war.
I.e. I think it is more than lazy science, I think it is intentional political manipulation.
Never ascribe to evil what can be explained by stupidity.
Someone famous said that, or something like it.
@ThinkHarder – behind “science” and “politics” there is economics – or, as it is also known in this context, MONEY.
Aeons ago, I had a Basic program on my primitive computer that generated random phrases by combining elements from three different lists.
I didn’t realise that the program was still in use in high places. I could have used it to become a leading epidemiologist if only I hadn’t spent so much time playing Kong on the same computer.
What, no one is following the science? Shame on you!
Honestly, blaming the science for this is like blaming the firefighters for arsonists.
Really? Since when was firefighting an ideology?
Fascinating! Also gave me a good chuckle, I had never realised there was such a close relationship between Arabian geodes and Chinese Latin dance.
It’s called Cha-Cha-Chaos theory.
I expect Neil Ferguson was the author of the one featured – he’d probably had a tinker with the code he uses to try to do other stuff, as everyone now knows he’s a complete tosser when “predicting” the effects of infection – or maybe even he’s a bought crooked scumbag…
Has the world always been like this, and I accidentally swallowed a red pill? It just seems everything is built on BS and there is no integrity or considered thought for humanity’s improvement.
I was always aware that there were some more selfish and greedy than the norm but so many …
The purpose of this web site is to deliver such bias. It was already observed in the ’70s and called “mean world syndrome” back then.
Shows how the quality and integrity of academic and professional journals has dropped over the last 25 years, and especially in the last 5-10.
Remember all those deliberate fakes (many of which were rather dodgy original documents – including the infamous one by a certain former German Chancellor but with certain words replaced with modern/woke ones) submitted by those US academics – to show up said journals?
Ironically the same academics have been under threat of being ‘cancelled’ by their universities ever since. It appears those working for such Establishment outlets don’t like having their lack of professionalism or integrity exposed in such a way.
A few months ago I completed a study for my Greek students (mostly retired or semi-retired) on the Greek translation of Lamentations from the Septuagint. I found lots of errors in the Greek, and compiled my own translation from Hebrew into Greek for about 20 – 30 verses. Sometimes I could work out what went wrong and why the error was made, but at other times I was at a complete (perhaps I should say compleat) loss as to what was going on. Certainly some of the translated verses made little sense.
After doing all this I am glad to say that my conclusions were verified by a top-flight scholar called B. Albrekston in the edition of the Biblia Hebraica Quinta series.
But perhaps this unknown translator in the first century B.C. or A.D. had stumbled on some method of research that has only recently been rediscovered.
Just a thought.
And this is just the tip of the iceberg. How many academics are just doing their “job” to have some good times at conferences and meet pals and eat out in interesting exotic places? Sadly, for any really relevant and well-done paper, there are probably tens or hundreds of others that were produced just because some not-so-clever fellow had to reach their publication quota.
Instead of firing incompetent individuals, academia decided to implement non-working, corruptible policing systems (kind of like what we see now in the handling of the pandemic). But that’s simply what happens when the inmates are allowed to run the asylum, and there is no easy solution for this.
There is an inherent conflict of interest for mediocre people who are devoted to their salary and perks rather than science to fight their own “peers”.
The filter that comes closest to solving it is the journal reputation and rankings. But since they are also lately based on irrelevant mechanical metrics such as number of citations rather than actual content – and since the publishers want to make as much money as possible – these can be subverted too. Ultimately the reasons are greed and moral decay permeating the entire world, including academia.
Amazing. This gets through, while real, compelling and important studies either don’t get published or don’t get “peer reviewed.”
I’m looking forward to the day when it’s finally going to be uncovered that circa 70% of all SAGE statements were – in fact – generated by an ExpOpGen program generating spoof expert opinions someone wrote for fun and that Neil Ferguson doesn’t really exist at all.
I mean, 12–16 year olds ought to be vaccinated against some virus in order to improve their mental health? Does that look like a serious expert opinion? Or rather like word salad?
Peer-review of Isaac Newton’s one million words on alchemy: A substantial and wide-ranging contribution to our science.
Peer-review of Isaac Newton’s physics: This chap is totally wide of all standards of accepted science, this is utterly incomprehensible.
Scientist: (noun) A person engaged in pursuit of a grant, often from a State-funded or State organisation.
The “Death of the Scientific Method” was highlighted in the then Lockdown Sceptic some months ago, and the mantra of following the science is flawed. When the currency of academia is publications, it is no surprise that following an accepted understanding, be it pro-mask, pro-vaccination or pro-lockdown, is much easier than being critical. Just as with Mad Cow Disease, history will demonstrate overreaction based on misunderstanding and outlandish guesses from individuals (“Ferguson” et al.). The difference with Covid is that with mad cow disease (CJD) it was cows that were sacrificed.
‘Peer Review’ – Richard Horton, editor-in-chief of The Lancet
“Peer review to the public is portrayed as a quasi-sacred process that helps to make science our most objective truth teller, but we know that the system of peer review is biased, unjust, unaccountable, incomplete, easily fixed, often insulting, usually ignorant, occasionally foolish, and frequently wrong”
And strangely no university librarian or member of the university committee that signs off the library’s funding ever gets sacked (or surcharged, or even publicly exposed) for handing over public money in return for this garbage.
If there wasn’t a demand, there wouldn’t be a supply.
If there weren’t kickbacks, there wouldn’t be a “demand”.