Epikurious

From Training to Thinking: Optimizing AI for Real-World Challenges


Summary: This research paper explores how to optimally scale the computational resources that large language models (LLMs) use at inference time, rather than focusing solely on increasing model size during training. The authors investigate two main strategies: iteratively refining the model's output (revisions) and searching over candidate solutions with a process reward model (PRM) acting as a verifier. They find that a "compute-optimal" approach, which adapts the strategy to the difficulty of each prompt, significantly improves efficiency and can even outperform much larger models in certain scenarios. Their experiments with the MATH benchmark and PaLM 2 models show that scaling test-time compute can be a more effective alternative to increasing model parameters, especially for easier problems or settings with lower inference token budgets. For the most difficult problems, however, additional pre-training compute remains superior.
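As a rough illustration of the "compute-optimal" idea discussed in the episode, the sketch below allocates a fixed test-time budget between sequential revisions and parallel verifier-scored sampling, depending on an estimated prompt difficulty. All names here (estimate_difficulty, generate, revise, prm_score, the 0.5 threshold) are hypothetical stand-ins for illustration, not the authors' actual implementation.

```python
# Hypothetical sketch: spend a fixed test-time budget either on sequential
# revisions (easier prompts) or on parallel sampling scored by a verifier
# (harder prompts). The callables generate(), revise(), and prm_score()
# stand in for an LLM sampler, a revision model, and a process reward
# model; none of these are real APIs.

def compute_optimal_answer(prompt, budget, estimate_difficulty,
                           generate, revise, prm_score):
    difficulty = estimate_difficulty(prompt)  # e.g. a predicted pass rate

    if difficulty < 0.5:
        # Easier prompts: the first answer is usually close, so spend the
        # remaining budget refining one chain of candidates sequentially.
        answer = generate(prompt)
        for _ in range(budget - 1):
            answer = revise(prompt, answer)
        return answer
    else:
        # Harder prompts: sample many independent candidates in parallel
        # and let the verifier (PRM) pick the highest-scoring one.
        candidates = [generate(prompt) for _ in range(budget)]
        return max(candidates, key=lambda c: prm_score(prompt, c))
```

In the paper's terms, this mirrors the trade-off between sequential revisions and parallel best-of-N search against a verifier; the difficulty threshold itself would be something to tune per task.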


Epikurious, by Alejandro Santamaria Arza