Researchers have developed DeepMet, a chemical language model designed to identify the "dark matter" of the metabolome by predicting the structures of
previously unrecognized metabolites. By training on textual representations of known chemical structures, the model learns
the implicit logic of metabolism and generates plausible new molecules that are missing from existing databases. The study demonstrates that
DeepMet can prioritize these hypothetical structures based on their generation frequency and successfully match them to
mass spectrometry data from biological samples. The approach was validated experimentally: comparison with synthetic standards confirmed
dozens of new metabolites in mouse tissues and human biofluids. Ultimately, the tool provides a systematic way to
fill gaps in metabolic maps and improve the annotation of complex chemical datasets.
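
The prioritization and matching steps described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the textual representation is SMILES, that sampled strings are already canonicalized, and that matching uses only the [M+H]+ m/z within a ppm tolerance. The function names, the toy samples, and the 10 ppm tolerance are all hypothetical.

```python
from collections import Counter

PROTON_MASS = 1.007276  # mass added for an [M+H]+ adduct, in Da

# Hypothetical model output: each string stands for one sampled structure.
# Assumption: strings are already canonical, so identical molecules match exactly.
sampled = [
    "CC(N)C(=O)O", "NCC(=O)O", "CC(N)C(=O)O",
    "NCCC(=O)O", "CC(N)C(=O)O", "NCC(=O)O",
]

# Monoisotopic masses (Da) for the toy structures above.
masses = {
    "NCC(=O)O": 75.03203,     # glycine, C2H5NO2
    "CC(N)C(=O)O": 89.04768,  # alanine, C3H7NO2
    "NCCC(=O)O": 89.04768,    # beta-alanine, an isomer of alanine
}


def rank_by_frequency(samples):
    """Count how often each structure was sampled, most frequent first."""
    return Counter(samples).most_common()


def match_by_mass(observed_mz, ranked, mass_table, ppm_tol=10.0):
    """Return (structure, frequency, ppm_error) for candidates whose
    theoretical [M+H]+ m/z lies within ppm_tol of the observed m/z.
    The input order (most frequent first) is preserved."""
    hits = []
    for smiles, freq in ranked:
        theoretical_mz = mass_table[smiles] + PROTON_MASS
        ppm_error = abs(observed_mz - theoretical_mz) / theoretical_mz * 1e6
        if ppm_error <= ppm_tol:
            hits.append((smiles, freq, ppm_error))
    return hits


ranked = rank_by_frequency(sampled)
hits = match_by_mass(90.0550, ranked, masses)
for smiles, freq, ppm in hits:
    print(f"{smiles}\tsampled {freq}x\t{ppm:.2f} ppm")
```

Both isomers fall within the mass tolerance here, so sampling frequency is what orders them. This is also why a mass match alone is insufficient and why the study confirmed candidates against synthetic standards.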
References:
- Qiang H, Wang F, Lu W, et al. Language model-guided anticipation and discovery of mammalian metabolites[J]. Nature, 2026: 1-10.