Paper Talk

201-C2S-Scale: Scaling Single-Cell Analysis with LLM



The paper is a preprint introducing C2S-Scale, a new family of Large Language Models (LLMs) designed for next-generation single-cell analysis. C2S-Scale builds on the Cell2Sentence (C2S) framework, which converts high-dimensional single-cell RNA sequencing (scRNA-seq) data into textual "cell sentences" that LLMs can process. This lets transcriptomic and textual biological data be integrated at unprecedented scale, with models of up to 27 billion parameters. The authors demonstrate that scaling these models leads to consistent performance improvements across diverse tasks, including cell type annotation, complex natural language interpretation, and perturbation response prediction, the last of which is evaluated with a novel metric called scFID. A key finding is C2S-Scale's ability to facilitate biological discovery, exemplified by a virtual screen that uncovered a cytokine-conditional amplifier of antigen presentation, which was subsequently validated experimentally.
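To make the "cell sentence" idea concrete, here is a minimal sketch of how one cell's expression vector might be turned into text. It assumes the rank-ordering approach commonly described for Cell2Sentence: gene names are listed from most to least expressed, and unexpressed genes are dropped. The function name `cell_to_sentence`, the `top_k` parameter, and the toy gene values are illustrative assumptions, not the authors' code or API.

```python
import numpy as np


def cell_to_sentence(expression, gene_names, top_k=None):
    """Turn one cell's expression vector into a space-separated gene list.

    Illustrative sketch only: genes are ordered by descending expression,
    genes with zero expression are dropped, and the result is optionally
    truncated to the top_k highest-expressed genes.
    """
    expression = np.asarray(expression, dtype=float)
    if expression.shape[0] != len(gene_names):
        raise ValueError("expression and gene_names must have the same length")

    # Stable sort on the negated values gives descending order and keeps
    # the original gene order for ties.
    order = np.argsort(-expression, kind="stable")

    # Keep only genes that are actually expressed in this cell.
    order = [int(i) for i in order if expression[i] > 0]

    if top_k is not None:
        order = order[:top_k]

    return " ".join(gene_names[i] for i in order)


# Toy example with made-up counts for four genes.
genes = ["GAPDH", "CD3E", "ACTB", "MS4A1"]
counts = [0, 5, 2, 9]
sentence = cell_to_sentence(counts, genes)
```

For the toy counts above, the sentence lists MS4A1 first, then CD3E, then ACTB, and omits GAPDH because its count is zero. A string like this can then be tokenized and fed to a language model like any other text.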

References:

  • Rizvi, Syed Asad, et al. "Scaling large language models for next-generation single-cell analysis." bioRxiv (2025): 2025-04.

Paper Talk, by 淼淼Elva