


This paper introduces EpiAgent, the first foundation model designed specifically for single-cell epigenomic data, with a focus on single-cell assay for transposase-accessible chromatin using sequencing (scATAC-seq). Developed by researchers primarily from Tsinghua University, EpiAgent is a transformer-based model pretrained on a large corpus of human scATAC-seq data to learn chromatin accessibility patterns. It addresses challenges such as data sparsity by encoding each cell's accessible chromatin regions as a concise "cell sentence", and it uses a bidirectional attention mechanism to capture cellular heterogeneity. The paper evaluates EpiAgent on a range of downstream tasks and reports superior performance in unsupervised feature extraction, supervised cell annotation, data imputation, and the prediction of cellular responses to both stimulated and genetic perturbations. EpiAgent is also shown to be useful for integrating reference data, mapping query data, and exploring in-silico treatment for cancer by simulating knockouts of key candidate cis-regulatory elements (cCREs).
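To make the "cell sentence" idea more concrete, here is a minimal sketch of how a sparse accessibility matrix could be turned into short token sequences. It is an illustration only, not EpiAgent's actual tokenizer: the function name `cell_sentences`, the `max_len` parameter, and the inverse-document-frequency style ranking are all assumptions introduced for this example.

```python
import numpy as np

def cell_sentences(counts, max_len=None):
    """Turn a cells x cCREs accessibility matrix into per-cell token lists.

    Hypothetical helper: each cell keeps only the indices of its accessible
    cCREs, ordered so that cCREs open in fewer cells come first.
    """
    counts = np.asarray(counts, dtype=float)
    accessible = counts > 0
    n_cells = counts.shape[0]

    # Assumed ranking: an IDF-like weight, larger for rarely accessible cCREs.
    cells_per_ccre = accessible.sum(axis=0)
    weight = np.log(1.0 + n_cells / (1.0 + cells_per_ccre))

    sentences = []
    for row in accessible:
        idx = np.flatnonzero(row)  # accessible cCREs only, so zeros are dropped
        ranked = idx[np.argsort(-weight[idx], kind="stable")]
        if max_len is not None:
            ranked = ranked[:max_len]  # optional truncation to a fixed length
        sentences.append(ranked.tolist())
    return sentences

# Toy example: 3 cells, 4 cCREs.
toy = [[1, 0, 1, 1],
       [1, 0, 0, 0],
       [1, 1, 0, 1]]
print(cell_sentences(toy))
```

Because only the non-zero entries are kept, the sequence length scales with the number of accessible regions per cell rather than with the total number of cCREs, which is the sense in which such an encoding sidesteps sparsity.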
References:
By 淼淼Elva