March 20, 2025

Reward Models | Data Brew | Episode 40

39 minutes

In this episode, Brandon Cui, Research Scientist at MosaicML and Databricks, dives into cutting-edge advancements in AI model optimization, focusing on Reward Models and Reinforcement Learning from Human Feedback (RLHF).

Highlights include:
- How synthetic data and RLHF enable fine-tuning models to generate preferred outcomes.
- Techniques like Proximal Policy Optimization (PPO) and Direct Preference Optimization (DPO) for enhancing response quality.
- The role of reward models in improving coding, math, reasoning, and other NLP tasks.

Connect with Brandon Cui:
https://www.linkedin.com/in/bcui19/

Data Brew by Databricks

Rating: 4.8 (2020 ratings)
