Best AI papers explained

Enhancing Personalized Multi-Turn Dialogue with Curiosity Reward



This paper introduces CURIO (Curiosity-driven User-modeling Reward as an Intrinsic Objective), a novel framework for enhancing personalized multi-turn dialogue in large language models (LLMs). This research addresses the limitations of conventional methods like Reinforcement Learning from Human Feedback (RLHF), which often fail to personalize interactions dynamically for individual users. CURIO integrates a curiosity-based intrinsic reward derived from a user model, encouraging the LLM agent to actively infer user traits and preferences throughout the conversation to improve its user model's accuracy. By formulating personalized dialogue as a Partially Observable Markov Decision Process (POMDP) and connecting the intrinsic reward to Potential-based Reward Shaping (PBRS) theory, the authors demonstrate that CURIO significantly improves personalization performance and generalization in tasks such as conversational recommendations and educational dialogues. The overall goal is to create more adaptive and engaging conversational agents by training them to learn about the user during the interaction.
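The summary describes the curiosity reward as a potential-based shaping term tied to the user model's accuracy. A minimal sketch of that idea, assuming the potential is the user model's log-likelihood of the true user traits (all names and the trait dictionary below are illustrative, not from the paper):

```python
import math

def user_model_log_likelihood(predicted_probs, true_traits):
    """Phi(s): log-likelihood the user model assigns to the user's true traits."""
    return sum(math.log(predicted_probs[t]) for t in true_traits)

def curiosity_reward(probs_before, probs_after, true_traits, gamma=0.99):
    """PBRS-style shaping term: gamma * Phi(s') - Phi(s).
    Positive when the dialogue turn improved the user model's estimate."""
    phi_before = user_model_log_likelihood(probs_before, true_traits)
    phi_after = user_model_log_likelihood(probs_after, true_traits)
    return gamma * phi_after - phi_before

# Hypothetical example: after a turn, the user model grows more
# confident in the user's true preference, so the reward is positive.
before = {"likes_jazz": 0.5, "likes_rock": 0.5}
after = {"likes_jazz": 0.8, "likes_rock": 0.2}
r = curiosity_reward(before, after, true_traits=["likes_jazz"])
```

Under PBRS theory, shaping rewards of this difference-of-potentials form leave the optimal policy of the underlying task unchanged, which is presumably why the authors connect their intrinsic reward to it.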


By Enoch H. Kang