AI Post Transformers

Federated Learning with Soft Embeddings for Retrieval


This September 20, 2025 paper introduces a novel, efficient architecture for training the retrieval models used in retrieval-augmented generation (RAG) systems. It addresses the inefficiency of fine-tuning large models by combining adapters that produce soft embeddings with a Classifier-as-Retriever (CaR) approach. The soft embeddings, created by lightweight layers attached to a frozen small language model (SLM), efficiently adapt the model to new corpora, while CaR replaces static maximum inner product search (MIPS) with a trainable classifier, reaching significantly higher accuracy (up to 99%); a rough code sketch of these two ideas appears below. The method also integrates naturally with federated learning (FL), yielding distributed training speedups (up to 2.6x faster), and uses differential privacy (DP) techniques to safeguard client data during training on edge devices. The combined approach results in a lighter, faster, privacy-preserving solution for domain-specific RAG.

Sources:
https://www.webai.com/blog/federated-learning-with-soft-embeddings-a-new-efficient-way-to-train-retrieval-models
https://arxiv.org/pdf/2509.16508
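
To make the idea concrete, here is a minimal sketch in PyTorch, with hypothetical class and parameter names and assuming a Hugging Face-style encoder output; it illustrates a frozen SLM with a small residual adapter producing soft embeddings and a Classifier-as-Retriever head, and is not the paper's reference implementation.

# Minimal sketch (PyTorch, hypothetical names): a frozen small language model
# supplies hidden states; a lightweight adapter turns the pooled output into a
# corpus-specific "soft embedding", and a trainable classifier head maps a query
# directly to a corpus chunk ID (Classifier-as-Retriever) instead of running
# MIPS over a static vector index.
import torch
import torch.nn as nn

class SoftEmbeddingRetriever(nn.Module):
    def __init__(self, frozen_encoder, hidden_dim, num_chunks, adapter_dim=64):
        super().__init__()
        self.encoder = frozen_encoder            # pretrained SLM, weights frozen
        for p in self.encoder.parameters():
            p.requires_grad = False
        # lightweight adapter: the only encoder-side trainable parameters
        self.adapter = nn.Sequential(
            nn.Linear(hidden_dim, adapter_dim),
            nn.ReLU(),
            nn.Linear(adapter_dim, hidden_dim),
        )
        # Classifier-as-Retriever head: one logit per corpus chunk
        self.classifier = nn.Linear(hidden_dim, num_chunks)

    def forward(self, input_ids, attention_mask):
        with torch.no_grad():
            hidden = self.encoder(input_ids,
                                  attention_mask=attention_mask).last_hidden_state
        pooled = hidden.mean(dim=1)                 # mean-pool over tokens
        soft_emb = pooled + self.adapter(pooled)    # residual soft embedding
        return self.classifier(soft_emb)            # logits over chunk IDs

Because only the adapter and classifier are trained, each client update is small, which is what makes the federated averaging (and DP-noised aggregation) described in the paper cheap enough for edge devices.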

AI Post Transformers, by mcgrof