Data Brew by Databricks

Reward Models | Data Brew | Episode 40



In this episode, Brandon Cui, Research Scientist at MosaicML and Databricks, dives into cutting-edge advancements in AI model optimization, focusing on Reward Models and Reinforcement Learning from Human Feedback (RLHF).

Highlights include:
- How synthetic data and RLHF enable fine-tuning models to generate preferred outcomes.
- Techniques like Proximal Policy Optimization (PPO) and Direct Preference Optimization (DPO) for enhancing response quality.
- The role of reward models in improving coding, math, reasoning, and other NLP tasks.
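
For listeners who want a concrete picture of the ideas above, here is a minimal sketch of two preference losses commonly associated with them: a pairwise (Bradley-Terry style) loss for training a reward model, and the DPO loss for a single preference pair. It is an illustration under stated assumptions, not the approach described in the episode. The function names, the scalar inputs, and the default `beta=0.1` are all hypothetical choices made for this example; real training code works on batches of token log-probabilities.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def reward_model_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Pairwise loss: small when the chosen response scores above the rejected one.

    Assumes the reward model outputs one scalar score per response.
    """
    return -log_sigmoid(reward_chosen - reward_rejected)


def dpo_loss(
    policy_logp_chosen: float,
    policy_logp_rejected: float,
    ref_logp_chosen: float,
    ref_logp_rejected: float,
    beta: float = 0.1,  # illustrative value, not taken from the episode
) -> float:
    """DPO loss for one preference pair.

    Each argument is the summed log-probability of a full response under
    either the policy being trained or a frozen reference model.
    """
    # How much more (or less) likely each response is under the policy
    # than under the reference model.
    chosen_shift = policy_logp_chosen - ref_logp_chosen
    rejected_shift = policy_logp_rejected - ref_logp_rejected
    return -log_sigmoid(beta * (chosen_shift - rejected_shift))
```

In this sketch, both losses equal log(2) when the model expresses no preference between the two responses, and they shrink as the chosen response is favored more strongly. PPO is left out because it needs a full sampling and optimization loop that does not fit in a short example.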

Connect with Brandon Cui:
https://www.linkedin.com/in/bcui19/



