Best AI papers explained

Large Language Models Are (Bayesian) Latent Variable Models: Explaining and Finding Good Demonstrations for In-Context Learning


This academic paper proposes a novel approach to understanding how large language models (LLMs) learn from demonstrations provided in the input, a process called in-context learning. The authors adopt a Bayesian perspective, viewing LLMs as implicitly inferring a latent variable that encapsulates task information. Building on this theory, they develop an algorithm that selects effective demonstrations by training a smaller LLM to identify the examples most likely to reveal this latent concept. Remarkably, demonstrations selected with the small model generalize to larger LLMs, significantly boosting performance on various text classification and math problems compared to baseline methods and lending empirical support to the hypothesis.
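To make the selection idea concrete, here is a minimal Python sketch of latent-concept-based demonstration scoring. It is an illustration under loose assumptions, not the paper's implementation: the learned concept tokens are approximated by a plain natural-language task description (the CONCEPT string below), GPT-2 stands in for the small scoring model, and the paper's concept-token fine-tuning step is omitted entirely.

```python
# Minimal sketch: score candidate demonstrations by how likely they make
# a stand-in "latent concept" string under a small language model, then
# keep the top scorers. Assumptions: GPT-2 as the small scoring model and
# a plain-text task description in place of learned concept tokens.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

CONCEPT = " Task: classify the sentiment of the review."  # hypothetical concept stand-in

def concept_log_likelihood(demonstration: str) -> float:
    """Approximate log P(concept | demonstration) under the small LM."""
    prefix_ids = tokenizer(demonstration, return_tensors="pt").input_ids
    concept_ids = tokenizer(CONCEPT, return_tensors="pt").input_ids
    input_ids = torch.cat([prefix_ids, concept_ids], dim=1)
    with torch.no_grad():
        logits = model(input_ids).logits
    # Each token at position t is predicted by the logits at position t-1,
    # so score only the concept tokens that follow the demonstration prefix.
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    positions = range(prefix_ids.shape[1] - 1, input_ids.shape[1] - 1)
    targets = input_ids[0, prefix_ids.shape[1]:]
    return sum(log_probs[pos, tok].item() for pos, tok in zip(positions, targets))

candidates = [
    "Review: A joyless, plodding mess. Sentiment: negative",
    "Review: Sharp writing and a terrific cast. Sentiment: positive",
    "Review: I left halfway through. Sentiment: negative",
]

# Keep the demonstrations most predictive of the latent concept.
top_demos = sorted(candidates, key=concept_log_likelihood, reverse=True)[:2]
print(top_demos)
```

In the full method, the top-scoring demonstrations would then be prepended to the test input and passed to a much larger LLM, which is where the reported performance gains transfer.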


By Enoch H. Kang