Data Brew by Databricks

Reward Models | Data Brew | Episode 40


In this episode, Brandon Cui, Research Scientist at MosaicML and Databricks, dives into cutting-edge advancements in AI model optimization, focusing on Reward Models and Reinforcement Learning from Human Feedback (RLHF).

Highlights include:
- How synthetic data and RLHF enable fine-tuning models to generate preferred outcomes.
- Techniques like Proximal Policy Optimization (PPO) and Direct Preference Optimization (DPO) for enhancing response quality.
- The role of reward models in improving coding, math, reasoning, and other NLP tasks.

Connect with Brandon Cui:
https://www.linkedin.com/in/bcui19/


