AI: post transformers

MetaScale: Test-Time Scaling with Evolving Meta-Thoughts



The source introduces MetaScale, a framework designed to enhance the complex reasoning capabilities of large language models (LLMs) during inference. Unlike traditional approaches that rely on fixed reasoning patterns, MetaScale enables LLMs to adaptively select and refine cognitive strategies, termed "meta-thoughts," for each task. It initializes a diverse pool of these strategies, then iteratively selects and evaluates them using a multi-armed bandit algorithm guided by a reward model. To foster continuous improvement, a genetic algorithm evolves high-performing meta-thoughts, refining the strategy pool over time. Experiments show that MetaScale consistently outperforms existing methods in accuracy and generalization (notably an 11% performance gain for GPT-4o on Arena-Hard) and produces increasingly structured, expert-level responses as the sampling budget grows.
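For readers who want the mechanics, below is a minimal, self-contained Python sketch of the loop the summary describes: a pool of meta-thoughts explored as bandit arms, scored by a reward model, with periodic genetic recombination of the best performers. Everything concrete here is an illustrative stand-in, not the paper's implementation: META_THOUGHTS, score_response, and crossover are toy substitutes for the paper's LLM-initialized pool, learned reward model, and LLM-driven genetic operators, and UCB is used as one representative bandit rule.

import math
import random

# Illustrative seed pool of "meta-thoughts" (cognitive strategies). The paper
# initializes its pool adaptively; these strings are stand-ins.
META_THOUGHTS = [
    "Think step by step as a careful mathematician.",
    "Decompose the task into sub-goals and verify each sub-result.",
    "Argue both sides of the question, then reconcile into one answer.",
    "Draft an answer, critique it, and revise before finalizing.",
]

def score_response(meta_thought: str, task: str) -> float:
    # Stand-in for the reward model: in MetaScale this would score the LLM's
    # response generated under the given meta-thought. Here we simulate a
    # noisy, strategy-dependent quality score in [0, 1] to keep this runnable.
    base = 0.3 + (hash(meta_thought + task) % 40) / 100.0
    return min(1.0, max(0.0, random.gauss(base, 0.1)))

def ucb_select(counts, values, t, c=1.4):
    # Upper-confidence-bound rule over the pool: a representative choice for
    # the multi-armed bandit step, balancing exploration and exploitation.
    for i, n in enumerate(counts):
        if n == 0:
            return i  # try every strategy at least once
    return max(
        range(len(counts)),
        key=lambda i: values[i] / counts[i] + c * math.sqrt(math.log(t) / counts[i]),
    )

def crossover(parent_a: str, parent_b: str) -> str:
    # Toy genetic operator: the paper has an LLM recombine and refine
    # high-reward meta-thoughts; here we simply splice the two prompts.
    a, b = parent_a.split(), parent_b.split()
    return " ".join(a[: len(a) // 2] + b[len(b) // 2 :])

def metascale_round(task: str, budget: int = 40, evolve_every: int = 10) -> str:
    pool = list(META_THOUGHTS)
    counts = [0] * len(pool)
    values = [0.0] * len(pool)
    for t in range(1, budget + 1):
        i = ucb_select(counts, values, t)
        counts[i] += 1
        values[i] += score_response(pool[i], task)
        if t % evolve_every == 0:
            # Evolve the pool: cross the two best-performing strategies and
            # add the child as a fresh bandit arm.
            means = [v / n if n else 0.0 for v, n in zip(values, counts)]
            a, b = sorted(range(len(pool)), key=means.__getitem__)[-2:]
            pool.append(crossover(pool[a], pool[b]))
            counts.append(0)
            values.append(0.0)
    means = [v / n if n else 0.0 for v, n in zip(values, counts)]
    return pool[max(range(len(pool)), key=means.__getitem__)]

if __name__ == "__main__":
    print(metascale_round("Prove that the sum of two even numbers is even."))

As the sampling budget grows, the bandit concentrates trials on high-reward strategies while evolution injects refined candidates into the pool, which is the mechanism behind the scaling behavior the episode highlights.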


Source: https://arxiv.org/html/2503.13447v1




By mcgrof