
AI Algorithm Discovers More than 160,000 New RNA Viruses

October 29, 2024

By Minhoo Jeong



It is an exciting time to be alive: a wave of artificial intelligence is sweeping across the scientific landscape, from physics to chemistry, and AI-powered discoveries are reshaping our knowledge of the world we live in. This year, the Nobel Prizes in Physics and Chemistry were both awarded in part for work directly connected to AI. Another term that has become part of everyday life since the COVID-19 pandemic is "RNA virus," and AI is now being applied here as well. Unlike DNA viruses such as herpesviruses, which carry double-stranded DNA, RNA viruses carry their genetic material as RNA, most often in a single strand, and this poses particular challenges for scientists.


A new study in Cell by Chinese and Australian researchers, headed by Dr. Xin Hou, shows how AI and RNA virus research can go hand in hand to benefit humanity. Drawing on an enormous collection of genetic data from ecosystems all over the world, the team used LucaProt, an AI algorithm, to analyze more than 10,000 previously unexamined metagenomic samples. That unprecedented analysis revealed an astonishing 161,979 putative RNA virus species and 180 RNA virus "superclades." The researchers say these are only "a drop in the ocean" of the wider "virosphere," the theoretical total extent of the viral world.


According to University of Sydney virologist Eddie Holmes, who co-led the research, the study shows how much innovation has gone into both predicting protein structure and discovering such a diverse range of viruses. The algorithm is similar to AlphaFold, whose developers shared this year's Nobel Prize in Chemistry for protein structure prediction. Holmes and his coauthors describe how LucaProt makes it possible to sift through genetic "dark matter." In genomics, unlike in astrophysics, "dark matter" refers to genetic sequences that have no match in existing sequence databases. The method starts from metagenomic samples, which contain mixed genetic data from many sources, such as plants, animals, fungi, bacteria, and viruses.


What sets LucaProt apart is its ability to learn. In this case, it was trained to predict which sequences in the genetic dark matter belong to RNA viruses. In one striking case, metagenomic samples weighing as little as 50 grams, collected near an agricultural research site south of Sydney, yielded more than 1,600 new viruses. Above all, the scale of discovery underlines the huge potential of AI-powered exploration of the virosphere and the contribution AI can make to expanding the frontiers of biological research. This work marks an important advance in understanding RNA viruses in nature and carries broader implications for the role of AI in scientific discovery.

