Best AI papers explained

Quantitative Judges for Large Language Models



This paper introduces quantitative LLM judges, a new approach to evaluating the outputs of large language models (LLMs) that aims to improve on the "LLM-as-a-judge" framework. The core idea is to decouple the qualitative reasoning an LLM judge provides (its textual evaluation) from the quantitative scoring. The framework uses a two-stage process: a frozen LLM produces a textual evaluation and an initial score, and a separate, lightweight model (such as a generalized linear model) then uses that output to predict a more accurate, human-aligned score. The paper proposes four quantitative judges for different evaluation tasks (absolute rating and relative preference) and shows that the method is both computationally and statistically efficient, often outperforming traditional fine-tuning of LLMs across evaluation metrics, datasets, and base LLMs.
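To make the two-stage idea concrete, here is a minimal sketch in Python. It assumes the frozen judge's output can be represented as an embedding of its textual evaluation plus its initial score, and it uses a ridge-regularized linear model from scikit-learn as the lightweight second-stage model; the feature layout, embedding, and data are placeholders, not the paper's actual implementation.

```python
# Minimal sketch of the two-stage quantitative judge idea.
# Stage 1 (frozen LLM judge) is simulated with placeholder data;
# Stage 2 fits a lightweight generalized linear model on top of it.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

# Stage 1 (stand-in): pretend these came from a frozen LLM judge.
# Each example has an embedding of the judge's textual evaluation
# (random placeholders here) and the judge's initial 1-5 score.
n, dim = 200, 16
eval_embeddings = rng.normal(size=(n, dim))       # hypothetical text embeddings
initial_scores = rng.integers(1, 6, size=(n, 1))  # judge's raw scores
X = np.hstack([eval_embeddings, initial_scores])

# Human-annotated target scores the quantitative judge should align with
# (synthetic here; in practice they come from labeled evaluation data).
y = 0.7 * initial_scores.ravel() + 0.1 * (eval_embeddings @ rng.normal(size=dim))

# Stage 2: fit the lightweight model on the frozen judge's outputs.
quant_judge = Ridge(alpha=1.0).fit(X, y)

# Scoring a new output: embed the judge's textual evaluation, append the
# initial score, and predict the refined, human-aligned score.
x_new = np.hstack([rng.normal(size=dim), [3.0]]).reshape(1, -1)
print("refined score:", quant_judge.predict(x_new)[0])
```

Because only the small second-stage model is trained while the LLM judge stays frozen, this setup avoids fine-tuning the LLM itself, which is where the claimed computational and statistical efficiency comes from.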


By Enoch H. Kang