Researchers have developed ChatNT, a multimodal conversational agent designed to interpret and analyze biological sequences such as DNA, RNA, and proteins. By integrating a DNA encoder with a large language model, the system lets users pose advanced genomics tasks as plain-English instructions instead of writing code. The model achieves state-of-the-art accuracy across dozens of tasks, including predicting promoter activity, splice sites, and protein stability. Its architecture uses an English-aware projection that extracts biological features from the sequence based on the user's question. The study also introduces a perplexity-based method for estimating the model's confidence in its answers, which supports more reliable predictions for scientific research. Ultimately, ChatNT makes sophisticated genomic analysis accessible to a broader range of scientists, bridging the gap between artificial intelligence and biological discovery.
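To make the perplexity idea concrete, here is a minimal sketch of how perplexity over candidate answers could be turned into a confidence score. It is not the authors' implementation. It assumes per-token log-probabilities are available for each candidate answer, and the function names, the inverse-perplexity normalization, and the example numbers are all hypothetical.

```python
import math


def perplexity(token_logprobs):
    """Perplexity = exp(-mean natural-log probability) over the answer tokens."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def answer_confidence(candidates):
    """Convert per-candidate perplexities into scores that sum to 1.

    candidates maps answer text to a list of token log-probabilities.
    Lower perplexity means the model finds the answer more likely, so each
    candidate is scored by 1 / perplexity and the scores are normalized.
    (This normalization is an illustrative assumption.)
    """
    inverse = {ans: 1.0 / perplexity(lps) for ans, lps in candidates.items()}
    total = sum(inverse.values())
    return {ans: value / total for ans, value in inverse.items()}


# Hypothetical log-probabilities for two candidate answers to a yes/no question.
candidates = {
    "yes": [-0.1, -0.2],
    "no": [-1.5, -2.0],
}
scores = answer_confidence(candidates)
best = max(scores, key=scores.get)  # candidate with the lowest perplexity
```

A score near 0.5 for a binary question would mark a prediction the model is unsure about, which could then be flagged for review rather than trusted outright.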
References:
- de Almeida B P, Richard G, Dalla-Torre H, et al. A multimodal conversational agent for DNA, RNA and protein tasks[J]. Nature Machine Intelligence, 2025: 1-14.