Rapid Synthesis: Delivered under 30 mins..ish, or it's on me!

Analyzing LLM Memorization and Generalization Quantitatively



Source: https://arxiv.org/abs/2505.24832

This research paper, "How much do language models memorize?" by Morris et al. (2025), introduces a method for estimating how much information a model retains about specific data points. The authors formally separate memorization into two components: "unintended memorization," the information a model holds about a particular dataset, and "generalization," the information it holds about the true data-generation process. By measuring unintended memorization in isolation, they estimate model capacity, finding that models in the GPT family store approximately 3.6 bits per parameter.
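To make that distinction concrete, here is a minimal Python sketch of one natural way to operationalize unintended memorization as a code-length difference, in the spirit of the paper's information-theoretic framing rather than as the authors' exact estimator. It assumes a HuggingFace-style causal LM API; the `trained` and `reference` model handles and the checkpoint names in the usage comments are hypothetical placeholders.

```python
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def bits_to_encode(model, tokenizer, text: str) -> float:
    """Idealized compressed size of `text` in bits under `model`,
    i.e. -log2 p(text); arithmetic coding can achieve this length."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean NLL in nats per predicted token
    n_predicted = ids.shape[1] - 1          # causal LMs predict all but the first token
    return loss.item() * n_predicted / math.log(2)

def unintended_memorization_bits(trained, reference, tokenizer, sample: str) -> float:
    """Extra bits the trained model saves on `sample` relative to a
    reference model of the underlying distribution. A large positive
    value suggests sample-specific (unintended) memorization rather
    than generalization."""
    return (bits_to_encode(reference, tokenizer, sample)
            - bits_to_encode(trained, tokenizer, sample))

# Hypothetical usage: sum this quantity over the training set and divide
# by the parameter count to get a bits-per-parameter estimate comparable
# in spirit to the ~3.6 figure reported for GPT-family models.
# trained = AutoModelForCausalLM.from_pretrained("my-trained-checkpoint")
# reference = AutoModelForCausalLM.from_pretrained("gpt2-xl")
# tok = AutoTokenizer.from_pretrained("gpt2-xl")
# print(unintended_memorization_bits(trained, reference, tok, "a training sample"))
```

The reference model is what subtracts out generalization here: bits that any well-trained model of the distribution would save on the sample do not count as memorization of the specific dataset.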


By Benjamin Alloul · NotebookLM