This September 20, 2025 paper introduces a novel, efficient architecture for training retrieval models used in retrieval-augmented generation (RAG) systems. The architecture addresses the inefficiency of fine-tuning large models by combining adapters that produce soft embeddings with a Classifier-as-Retriever (CaR) approach. The soft embeddings, produced by lightweight adapter layers attached to a frozen small language model (SLM), efficiently adapt the model to new corpora, while CaR replaces static maximum inner product search (MIPS) with a trainable classifier, yielding significantly higher retrieval accuracy (up to 99%). Furthermore, the method integrates naturally with federated learning (FL), achieving distributed training speedups of up to 2.6x, and applies differential privacy (DP) techniques to safeguard client data during training on edge devices. The combined approach yields a lighter, faster, privacy-preserving solution for domain-specific RAG.

Sources:
https://www.webai.com/blog/federated-learning-with-soft-embeddings-a-new-efficient-way-to-train-retrieval-models
https://arxiv.org/pdf/2509.16508
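To make the Classifier-as-Retriever idea concrete, here is a minimal, self-contained sketch of the core contrast: instead of retrieving by static inner-product search over fixed document embeddings, a small trainable classifier head is fit on top of a frozen encoder to map each query directly to a document id. All names (`frozen_embed`, `W_clf`, the toy one-hot queries) are illustrative assumptions, not the paper's actual model or data; the frozen encoder is simulated here by a fixed random projection standing in for a frozen SLM.

```python
import math
import random

random.seed(0)
DIM, N_DOCS = 8, 4

# Stand-in for a frozen SLM encoder: a fixed (untrained) random projection.
W_frozen = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(DIM)]

def frozen_embed(x):
    """Embed a query with the frozen encoder (no gradients flow here)."""
    return [sum(w * v for w, v in zip(row, x)) for row in W_frozen]

# Toy corpus: 16 one-hot queries, each labeled with its target document id.
queries = [[1.0 if i % DIM == j else 0.0 for j in range(DIM)] for i in range(16)]
labels = [i % N_DOCS for i in range(16)]

# Classifier-as-Retriever: the only trainable part is this linear head,
# which plays the role of a lightweight adapter over frozen embeddings.
W_clf = [[0.0] * DIM for _ in range(N_DOCS)]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def logits(e):
    return [sum(w * v for w, v in zip(row, e)) for row in W_clf]

# Train the head with plain SGD on cross-entropy loss.
lr = 0.5
for _ in range(200):
    for x, y in zip(queries, labels):
        e = frozen_embed(x)
        p = softmax(logits(e))
        for c in range(N_DOCS):
            grad = p[c] - (1.0 if c == y else 0.0)
            for j in range(DIM):
                W_clf[c][j] -= lr * grad * e[j]

def retrieve(x):
    """CaR retrieval: classify the query into a document id."""
    return max(range(N_DOCS), key=lambda c: logits(frozen_embed(x))[c])

acc = sum(retrieve(x) == y for x, y in zip(queries, labels)) / len(queries)
print(f"CaR retrieval accuracy on the toy corpus: {acc:.2f}")
```

Because only the small classifier head is updated while the encoder stays frozen, the per-client training cost is tiny, which is what makes the approach a natural fit for federated training on edge devices; in an FL setting, each client would train such a head locally and share (optionally DP-noised) updates rather than raw data.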