2021-06-24 08:08:16 Scientist Uncovers Deleted Coronavirus Data From China
Thirteen genetic sequences were mysteriously deleted from an online database last year after being isolated from people with COVID-19 infections in the early days of the pandemic in China. They have now been recovered.
Jesse Bloom, a computational biologist and viral evolution specialist at Seattle’s Fred Hutchinson Cancer Research Center, discovered that the sequences had been removed from an online database at the request of scientists in Wuhan, China. However, with a little internet detective work, he was able to recover copies of the data stored on Google Cloud.
The sequences do not fundamentally alter scientists’ understanding of COVID-19’s origins, including the contentious issue of whether the coronavirus spread naturally from animals to humans or escaped in a laboratory accident. However, their removal adds to concerns that Chinese government secrecy has hampered international efforts to understand how COVID-19 emerged.
Bloom’s findings were published on Tuesday in a preprint paper that had not yet been peer-reviewed by other scientists. “I believe it is consistent with an attempt to conceal the sequences,” he told BuzzFeed News.
Bloom discovered the deleted data after reading a paper about some of the earliest genetic sequences of SARS-CoV-2 written by a team led by Carlos Farkas at the University of Manitoba in Canada. Farkas’ paper described sequences collected from hospital outpatients as part of a project by Wuhan researchers developing diagnostic tests for the virus. When Bloom attempted to download the sequences from the Sequence Read Archive, an online database run by the National Institutes of Health in the United States, he was met with error messages indicating that they had been removed.
Bloom discovered that copies of SRA data are also kept on Google servers, and he was able to piece together the URLs where the missing sequences could be found in the cloud. He was able to recover 13 genetic sequences in this manner, which may help answer questions about how the coronavirus evolved and where it came from.
Bloom found that the deleted sequences, like others collected at later dates outside the city, were more similar to bat coronaviruses, thought to be the ultimate ancestors of the virus that causes COVID-19, than to sequences linked to Wuhan’s Huanan Seafood Market. This adds to previous speculation that the seafood market was the site of an early outbreak rather than the place where the coronavirus first jumped from animals to humans.
“This is a very interesting study conducted by Dr. Bloom, and in my opinion, the analysis is completely correct,” Farkas wrote in an email to BuzzFeed News. Scott Gottlieb, the former head of the Food and Drug Administration, lauded the findings on Twitter as well.
Some scientists, however, were less impressed. “It really adds nothing to the origins debate,” said Robert Garry of Tulane University in New Orleans in an email to BuzzFeed News. Garry contended that COVID-19 could still have originated at the Huanan market or at other markets in Wuhan.
Bloom is one of 18 scientists who signed a letter in May criticizing the WHO and China’s investigation into the origins of SARS-CoV-2. The scientists claimed that the WHO–China report failed to give “balanced consideration” to competing theories that the coronavirus spread naturally from animals to humans or escaped from a lab — a theory that the report deemed “extremely unlikely.” Following the publication of the WHO–China report, the United States and 13 other governments complained that it “lacked access to complete, original data and samples.”
The deleted virus sequences were first uploaded to the SRA in early March 2020, around the same time that Wuhan University researchers led by Yan Li and Tiangang Liu published a preprint describing their work using genetic sequencing to diagnose COVID-19. Just a few days before, China’s State Council directed that all COVID-19-related documents be centrally approved.
The sequences were then removed from the SRA in June, around the time the paper’s final version was published in a scientific journal. The authors requested that the sequences be removed, according to the NIH. “The requestor indicated that the sequence information had been updated, that it was being submitted to another database, and that the data should be removed from SRA to avoid version control issues,” NIH spokesperson Amanda Fine told BuzzFeed News in an email.
It is unclear, however, whether the sequences have since been posted online in another database.
Bloom wrote in his preprint, “There is no plausible scientific reason for the deletion,” claiming the sequences were likely “deleted to obscure their existence.” That suggested a “less than wholehearted effort to trace the early spread of the epidemic,” he wrote.
Garry pointed out that although the sequences were deleted, the key genetic mutations they contained were still published in a table in the Wuhan team’s final paper. “Jesse Bloom discovered exactly nothing new that was not already part of the scientific literature,” Garry told BuzzFeed News, accusing Bloom of writing his preprint in an “inflammatory, unscientific, and unnecessary manner.”
Bloom emailed the Wuhan researchers, asking why the sequences had been removed, but received no response. Similarly, Li and Liu did not respond immediately to a BuzzFeed News inquiry.
This is not the first time scientists have expressed concern about the removal of data that could help answer questions about COVID-19’s origins. The main database of coronavirus sequences held by the Wuhan Institute of Virology, an institute that has been the subject of speculation about a possible “lab leak” of the virus, was taken offline in September 2019. When members of the WHO–China team that investigated the pandemic’s origins visited the institute in February, they were told that the database, which allegedly contained data on 22,000 coronavirus samples and sequence records, had been removed due to repeated hacking attempts.