
s1: Simple test-time scaling

- Test-time scaling improves language model performance using extra compute
- A dataset of 1,000 questions paired with reasoning traces was curated for fine-tuning
- Budget forcing controls compute by managing the model's reasoning process
- The model outperformed o1-preview by up to 27% on competition math questions
- The model and data are open-source
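
To make the budget-forcing point concrete, here is a minimal sketch of the idea: decoding is cut off once a maximum number of reasoning tokens is reached, and if the model tries to stop before a minimum is reached, a continuation cue is appended so it keeps reasoning. The names `step`, `END_OF_THINKING`, `CONTINUE_CUE`, and `budget_forced_reasoning` are hypothetical and chosen for illustration; this is not the released s1 code, and a real setup would call an actual language model instead of the toy function used here.

```python
from typing import Callable, List

# Hypothetical markers, chosen for illustration only.
END_OF_THINKING = "</think>"  # token the model emits when it wants to stop reasoning
CONTINUE_CUE = "Wait"         # cue appended to push the model to keep reasoning


def budget_forced_reasoning(
    step: Callable[[List[str]], str],
    prompt: List[str],
    min_tokens: int,
    max_tokens: int,
) -> List[str]:
    """Generate reasoning tokens under a compute budget.

    `step` is an assumed interface: it takes the current context (a list of
    tokens) and returns the next token.
    """
    tokens: List[str] = []
    while len(tokens) < max_tokens:
        tok = step(prompt + tokens)
        if tok == END_OF_THINKING:
            if len(tokens) >= min_tokens:
                break  # the model stopped and the minimum budget is met
            # The model tried to stop too early: suppress the stop token
            # and append a cue so it continues reasoning.
            tokens.append(CONTINUE_CUE)
            continue
        tokens.append(tok)
    # Close the reasoning, either because the model stopped on its own
    # or because the maximum budget was exhausted.
    tokens.append(END_OF_THINKING)
    return tokens


def toy_step(context: List[str]) -> str:
    """Toy stand-in for a model: tries to stop whenever the context length is a positive multiple of 3."""
    if context and len(context) % 3 == 0:
        return END_OF_THINKING
    return "t"


if __name__ == "__main__":
    print(budget_forced_reasoning(toy_step, [], min_tokens=5, max_tokens=20))
```

Raising `min_tokens` spends more test-time compute on reasoning, while lowering `max_tokens` caps it; the sketch counts only reasoning tokens against the budget, not the closing marker.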