What's AI Podcast by Louis-François Bouchard

OpenAI's NEW Fine-Tuning Method Changes EVERYTHING (Reinforcement Fine-Tuning Explained)



Have you ever wanted to take a language model and make it answer the way you want without needing a mountain of data?

Well, OpenAI’s got something for us: Reinforcement Fine-Tuning, or RFT, and it changes how we customize AI models. Instead of retraining a model by feeding it examples of what we want and hoping it learns the classical way, we teach it by rewarding correct answers and penalizing wrong ones, just like training a dog, but, you know, with fewer treats and more math.
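To make the "reward and penalize" idea concrete, here is a toy sketch in Python. It is not OpenAI's implementation: the `grade` function, the sample answers, and the mean-centering step are all assumptions made for illustration. It only shows how a grader can turn several sampled answers into positive and negative training signals.

```python
def grade(model_answer: str, reference: str) -> float:
    """Hypothetical grader: 1.0 if the answer matches the reference, else 0.0."""
    return 1.0 if model_answer.strip().lower() == reference.strip().lower() else 0.0


def centered_rewards(answers: list[str], reference: str) -> list[float]:
    """Score each sampled answer, then subtract the mean score.

    Answers that beat the average get a positive signal (a "treat"),
    and answers below the average get a negative one (a penalty).
    """
    scores = [grade(a, reference) for a in answers]
    mean = sum(scores) / len(scores)
    return [s - mean for s in scores]


# Made-up samples a model might produce for "What is the capital of France?"
samples = ["Paris", "paris ", "Lyon", "Marseille"]
signals = centered_rewards(samples, "Paris")
print(signals)
```

In a real setup, a training algorithm would use signals like these to make the rewarded answers more likely, which is where the "more math" comes in.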

Let’s break down reinforcement fine-tuning compared to supervised fine-tuning!

Both have their uses, and each can be summed up in one line:

  1. Supervised fine-tuning teaches the model new things it does not know yet, like a new language, which makes it powerful for smaller, less “intelligent” models.

  2. Reinforcement fine-tuning steers an existing model toward what we really want it to say. It basically “aligns” the model to our needs, but it requires an already powerful model, which is why reasoning models are a perfect fit.

I’ve already covered fine-tuning on the channel if you are interested in that. Today, let’s get into how RFT actually works!
