Best AI papers explained

Learning to summarize user information for personalized RLHF



This paper introduces Preference Learning Using Summarization (PLUS), a novel framework for personalizing large language models (LLMs) by aligning them with diverse user preferences. Unlike standard methods that assume a single set of values for all users, PLUS uses reinforcement learning to generate concise, text-based summaries of a user's characteristics and past interactions. These summaries then condition a reward model, enabling it to make more accurate, personalized predictions about what a specific user values in a response. A central innovation of PLUS is its online co-adaptation loop, in which the summarizer and the reward model are trained simultaneously so that the text summaries capture the information most relevant to predicting preferences. Experiments demonstrate that PLUS significantly improves reward model accuracy and generalizes to new topics and users better than existing techniques. Furthermore, the framework can personalize proprietary models like GPT-4 without further fine-tuning, while its human-readable summaries enhance transparency and user control.
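To make the idea concrete, here is a minimal toy sketch of a summary-conditioned reward model. All names and logic are illustrative assumptions, not the paper's implementation: the real PLUS summarizer is an LLM trained with reinforcement learning, and the reward model is learned, whereas this sketch uses a trivial word-overlap score purely to show how a text summary can condition reward predictions per user.

```python
# Toy sketch of summary-conditioned reward scoring (illustrative only;
# function names and logic are assumptions, not from the PLUS paper).

def summarize_user(interactions, max_items=3):
    """Toy 'summarizer': keep the most recent positively rated traits
    as a short text summary of the user."""
    liked = [text for text, liked_flag in interactions if liked_flag]
    return "; ".join(liked[-max_items:])

def reward(summary, response):
    """Toy 'reward model': score a response by word overlap with the
    user summary. A real reward model would be a learned network that
    takes the summary as conditioning context."""
    summary_words = set(summary.lower().replace(";", " ").split())
    response_words = set(response.lower().split())
    return len(summary_words & response_words)

# Hypothetical interaction history: (feedback text, was it positive?)
history = [
    ("prefers concise answers", True),
    ("dislikes jargon", False),
    ("likes code examples", True),
]

summary = summarize_user(history)
r_a = reward(summary, "Here is a concise answer with code examples")
r_b = reward(summary, "A long theoretical discussion")
print(summary)   # the human-readable summary conditioning the reward
print(r_a, r_b)  # the personalized response scores higher
```

Because the conditioning signal is plain text, the same summary could be prepended to a prompt for a frozen proprietary model, which is how the paper personalizes models like GPT-4 without fine-tuning.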


Best AI papers explained, by Enoch H. Kang