Deep learning is solving biology’s deepest secrets at breathtaking speed.
Just a month ago, DeepMind cracked a 50-year-old grand challenge: protein folding. A week later, they produced a totally transformative database of more than 350,000 protein structures, including over 98 percent of known human proteins. Structure is at the heart of biological functions. The data dump, set to explode to 130 million structures by the end of the year, allows scientists to foray into what used to be the “dark matter” of the human body’s makeup: proteins unseen and untested.
The end result is nothing short of revolutionary. From basic life science research to developing new medications to fight our toughest disease foes like cancer, deep learning gave us a golden key to unlock new biological mechanisms—either natural or synthetic—that were previously unattainable.
Now, the AI darling is set to do the same for RNA.
As the middle child of the “DNA to RNA to protein” central dogma, RNA didn’t get much press until its Covid-19 vaccine contribution. But the molecule is a double hero: it carries genetic information and, depending on its structure, can also catalyze biological functions, regulate which genes are turned on, tweak your immune system, and, even crazier, potentially pass down “memories” through generations.
It’s also frustratingly difficult to understand.
Like proteins, RNA folds into complicated 3D structures. The difference, according to Drs. Rhiju Das and Ron Dror at Stanford University, is that we know comparatively little about these molecules. There are 30 times as many types of RNA as there are proteins, but the number of deciphered RNA structures is less than one percent of the number of deciphered protein structures.
The Stanford team decided to bridge that gap. In a paper published last week in Science, they described a deep learning algorithm called ARES (Atomic Rotationally Equivariant Scorer) that efficiently solves RNA structures, blowing previous attempts out of the water.
The authors “have achieved notable progress in a field that has proven recalcitrant to transformative advances,” said Dr. Kevin Weeks at the University of North Carolina, who was not involved in the study.
Even more impressive, ARES was trained on only 18 RNA structures, yet was able to extract substantial “building block” rules for RNA folding that’ll be further tested in experimental labs. ARES is also input agnostic, in that it isn’t specifically tailored to RNA.
“This approach is applicable to diverse problems in structural biology, chemistry, materials science, and beyond,” the authors said.
Meet RNA
The importance of this biomolecule for our everyday lives is probably best summarized as “Covid vaccine, mic drop.”
But it’s so much more.
RNA is transcribed from DNA. Like DNA, it has a four-letter alphabet (A, U, C, and G), with A grabbing U and C tethered to G. RNA is a whole family, with the best-known type being messenger RNA, or mRNA, which carries the genetic instructions to build proteins. But there’s also transfer RNA, or tRNA (I like to think of this as a transport drone), which grabs onto amino acids and shuttles them to the protein factory, microRNA that controls gene expression, and even stranger cousins that we understand little about.
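For the code-minded, here is a minimal sketch of that pairing rule in Python. It only illustrates the A-with-U, C-with-G matching described above and has nothing to do with how ARES works. The function names are made up for this example, and it deliberately ignores non-standard pairings (such as G with U) that real RNA also uses.

```python
# Illustrative only: the standard RNA pairing rule (A with U, C with G).
# The names below are hypothetical and are not taken from the ARES paper.
PAIRS = {"A": "U", "U": "A", "C": "G", "G": "C"}


def reverse_complement(rna: str) -> str:
    """Return the stretch that would pair with `rna`, read in the opposite direction."""
    return "".join(PAIRS[base] for base in reversed(rna.upper()))


def can_pair(left: str, right: str) -> bool:
    """True if `right` matches `left` base by base when the strand folds back on itself."""
    return right.upper() == reverse_complement(left)


print(reverse_complement("GGAC"))
print(can_pair("GGAC", "GUCC"))
```

Two stretches that pass `can_pair` are the kind of segments that can zip together when a single RNA string folds back on itself.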
Bottom line: RNA is both a powerful target and inspiration for genetic medicine or vaccines. One way to shut off a gene without actually touching it, for example, is to kill its RNA messenger. Compared to gene therapy, targeting RNA could have fewer unintended effects, all the while keeping our genetic blueprint intact.
In my head, RNA often resembles tangled headphones. It starts as a string, but subsequently tangles into a loop-de-loop—like twisting a rubber band. That twisty structure then twists again with surrounding loops, forming a tertiary structure.
Unlike those frustrating headphones, RNA twists in semi-predictable ways. It tends to settle into one of several structures. These are kind of like the shape your body contorts ...