
Many, many signs of life for preference fine-tuning beyond spoofing chat evaluation tools.
This is AI-generated audio with Python and 11Labs.
Source code: https://github.com/natolambert/interconnects-tools
Original post: https://www.interconnects.ai/p/how-rlhf-works-2
00:00 How RLHF works, part 2: A thin line between useful and lobotomized
04:27 The chattiness paradox
08:09 The mechanism for making models chattier
10:42 Next steps for RLHF research
Fig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/rlhf/img_012.webp
Fig 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/rlhf/img_018.png
Fig 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/rlhf/img_025.png