The paper introduces scArches, a deep learning strategy based on transfer learning (TL) and architectural surgery for efficiently mapping new (query) single-cell datasets onto existing large-scale reference atlases. The method addresses three challenges in single-cell genomics: batch effects between datasets, limited computational resources, and data-sharing restrictions. It does so by fine-tuning only a small subset of the network parameters (adaptors). The scArches pipeline enables decentralized, iterative building and updating of reference models without sharing raw data, and it is faster and more efficient than traditional full data integration. The paper also demonstrates that scArches preserves biological variation, including disease-specific cell states and rare cell types, while removing technical batch effects, and that it facilitates knowledge transfer for cell type annotation and imputation of missing data modalities.
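
The adaptor idea can be illustrated with a toy sketch. This is not the scArches implementation or its API. It assumes a single linear decoder layer, a latent representation `z` that is already given, and a plain mean-squared-error loss standing in for the model's actual training objective. The names `fine_tune_adaptor`, `W_frozen`, and `adaptor` are hypothetical. The point is only that the weights learned on the reference stay frozen while a small set of new parameters tied to the query batch is trained.

```python
import numpy as np


def fine_tune_adaptor(x_query, z_query, W_frozen, n_steps=200, lr=0.1):
    """Fit only the adaptor vector for one new query batch.

    W_frozen holds weights learned on the reference and is never updated.
    Toy assumption: the query batch differs from the reference by an
    additive per-gene offset, which the adaptor has to absorb.
    """
    adaptor = np.zeros(x_query.shape[1])  # new, trainable parameters
    for _ in range(n_steps):
        # Reconstruction residual with frozen weights plus the adaptor.
        residual = z_query @ W_frozen + adaptor - x_query
        # Gradient of the per-cell mean squared error w.r.t. the adaptor.
        grad = 2.0 * residual.mean(axis=0)
        adaptor -= lr * grad
    return adaptor


# Simulated data (all values are synthetic, for illustration only).
rng = np.random.default_rng(0)
n_cells, n_latent, n_genes = 500, 4, 20

W_reference = rng.normal(size=(n_latent, n_genes))  # "pretrained" weights
W_before = W_reference.copy()                       # kept to verify freezing

z = rng.normal(size=(n_cells, n_latent))            # given latent states
true_shift = rng.normal(size=n_genes)               # query batch effect
noise = 0.01 * rng.normal(size=(n_cells, n_genes))
x_query = z @ W_reference + true_shift + noise

adaptor = fine_tune_adaptor(x_query, z, W_reference)

print("reference weights unchanged:", np.array_equal(W_reference, W_before))
print("max adaptor error:", np.abs(adaptor - true_shift).max())
```

In this sketch only `n_genes` numbers are trained, against `n_latent * n_genes` frozen ones, and only the adaptor would need to be shared to extend the reference model. That mirrors, in miniature, why the summary describes the approach as cheap and compatible with data-sharing restrictions.
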
References:
- Lotfollahi M, Naghipourfar M, Luecken MD, et al. Mapping single-cell data to reference atlases by transfer learning. Nature Biotechnology, 2022, 40(1): 121-130.