
The document details the creation and evaluation of TÜLU 3, a family of open-source, post-trained language models. TÜLU 3 surpasses several closed and open models across various benchmarks through a multi-stage training process combining supervised fine-tuning, Direct Preference Optimization (DPO), and a novel method called Reinforcement Learning with Verifiable Rewards (RLVR). The research includes a rigorous evaluation framework with development and unseen datasets to assess generalization and identify areas for improvement. A key focus is transparency: all data, code, and training recipes are released. Finally, the authors explore various training choices and their effects on model performance.
https://allenai.org/papers/tulu-3-report.pdf
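To give a flavor of the RLVR idea mentioned above, here is a minimal Python sketch of a verifiable reward: the model earns reward only when an automatic checker confirms its output is correct, rather than relying on a learned reward model's score. The answer-extraction convention (taking the last number in the response) is an assumption made for this illustration, not the authors' actual implementation, which is available in their released code.

```python
import re

def verifiable_reward(response: str, ground_truth: str) -> float:
    """Binary reward: 1.0 if the model's final answer matches the
    known-correct answer, else 0.0. No learned reward model involved."""
    # Extract the last number in the response as the final answer
    # (a simplistic convention assumed for this sketch).
    numbers = re.findall(r"-?\d+(?:\.\d+)?", response)
    if not numbers:
        return 0.0
    return 1.0 if numbers[-1] == ground_truth else 0.0

# Example: a math word problem whose correct answer is "42".
print(verifiable_reward("... so the total is 42.", "42"))  # 1.0
print(verifiable_reward("... the answer is 41.", "42"))    # 0.0
```

During RLVR training, such binary rewards would drive a standard RL policy update in place of a reward model's scalar output, which is what distinguishes this stage from conventional RLHF.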