Paper Talk

764-X-Cell: Scaling Causal Perturbation Prediction


The paper introduces X-Atlas/Pisces, described as the most extensive genome-wide CRISPRi Perturb-seq dataset created to date, with 25.6 million single-cell transcriptomes across 16 biological contexts. Using this resource, the authors developed X-Cell, a diffusion language model that predicts how genetic interventions reshape gene expression. The model improves accuracy by integrating multi-modal biological priors, such as protein language models and interaction networks, through a specialized cross-attention architecture. By scaling the system to 4.9 billion parameters in X-Cell-Ultra, the authors show that perturbation prediction follows power-law scaling similar to that of large language models. The paper also reports that X-Cell achieves superior zero-shot generalization to unseen cell types and primary human cells, which the authors present as a tool for computational drug discovery and target identification.
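To make the power-law scaling claim concrete, here is a minimal sketch of how such a trend is commonly checked: fit loss ≈ a · N^(-b) by linear regression in log-log space. The function name `fit_power_law` and every number below are illustrative assumptions, not values or code from the paper.

```python
import numpy as np


def fit_power_law(n_params, losses):
    """Fit loss ~= a * N**(-b) with a least-squares line in log-log space.

    Returns (a, b). Hypothetical helper for illustration only.
    """
    slope, intercept = np.polyfit(np.log(n_params), np.log(losses), 1)
    return float(np.exp(intercept)), float(-slope)


# Synthetic points generated from a known power law (NOT results from the paper).
n_params = np.array([1e7, 1e8, 1e9, 4.9e9])
losses = 2.0 * n_params ** -0.1

a, b = fit_power_law(n_params, losses)
print(f"a={a:.3f}, b={b:.3f}")
```

A straight line on a log-log plot of loss against parameter count is the usual signature of this behavior; the fitted exponent b describes how quickly the error falls as the model grows.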

References:

  • Wang C, Karimzadeh M, Ravindra N G, et al. X-Cell: Scaling Causal Perturbation Prediction Across Diverse Cellular Contexts via Diffusion Language Models[J]. bioRxiv, 2026: 2026.03.18.712807.

Paper Talk, by 淼淼Elva