Data Brew by Databricks

Reward Models | Data Brew | Episode 40



In this episode, Brandon Cui, Research Scientist at MosaicML and Databricks, dives into cutting-edge advancements in AI model optimization, focusing on Reward Models and Reinforcement Learning from Human Feedback (RLHF).

Highlights include:
- How synthetic data and RLHF are used to fine-tune models toward preferred outputs.
- Techniques like Proximal Policy Optimization (PPO) and Direct Preference Optimization (DPO) for enhancing response quality.
- The role of reward models in improving coding, math, reasoning, and other NLP tasks.

Connect with Brandon Cui:
https://www.linkedin.com/in/bcui19/


Rating: 4.8 (20 ratings)

