The paper introduces
CARE (Cross-modal Adaptive Region Encoder), a novel foundation model designed to improve
computational pathology by moving beyond rigid, grid-based image analysis. Unlike traditional models that treat
whole-slide images (WSIs) as collections of isolated square patches, CARE utilizes an
adaptive region generator to partition tissue into irregular, morphologically meaningful regions that respect biological boundaries. The model undergoes a
two-stage pretraining process, first learning morphological structures through self-supervised methods and then refining those representations by aligning them with
molecular data, such as RNA and protein profiles. This biologically guided approach allows CARE to identify significant
regions of interest (ROIs) and aggregate them into comprehensive slide-level embeddings. Despite using significantly less pretraining data than its competitors, CARE demonstrates superior performance across
33 clinical benchmarks, including cancer classification and survival analysis. Ultimately, the research offers a more
interpretable and data-efficient framework for diagnostic AI by better mimicking the workflow of human pathologists.
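The paper does not spell out the exact aggregation mechanism in this summary, but one common way such models pool a variable number of region embeddings into a single slide-level embedding is attention-weighted pooling. The sketch below is a minimal, hypothetical illustration of that idea; the function name, the scoring vector `w`, and the use of a plain dot-product score are all assumptions, standing in for whatever learned scoring network CARE actually uses.

```python
import numpy as np

def aggregate_regions(region_embs: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Pool a variable number of region embeddings (n_regions, d) into one
    slide-level embedding (d,) using softmax attention weights.

    `w` (shape (d,)) is a hypothetical learned scoring vector, a stand-in
    for the model's actual region-scoring component."""
    scores = region_embs @ w                      # one scalar score per region
    scores = scores - scores.max()                # shift for numerical stability
    attn = np.exp(scores) / np.exp(scores).sum()  # softmax over regions
    return attn @ region_embs                     # attention-weighted sum -> (d,)

# Toy usage: five irregular regions with 8-dimensional embeddings
rng = np.random.default_rng(0)
regions = rng.normal(size=(5, 8))
slide_emb = aggregate_regions(regions, w=rng.normal(size=8))
print(slide_emb.shape)  # (8,)
```

Because the weights sum to one, regions the scorer deems uninformative contribute little to the slide embedding, which loosely mirrors how a pathologist concentrates on a few diagnostically significant ROIs.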
References:
- Zhang D, Gong Z, Pang X, et al. CARE: A Molecular-Guided Foundation Model with Adaptive Region Modeling for Whole Slide Image Analysis. arXiv preprint arXiv:2602.21637, 2026.