AI Safety Fundamentals

Constitutional AI: Harmlessness from AI Feedback


This paper explains Anthropic’s Constitutional AI approach, which largely extends RLHF, with AI models replacing both the human demonstrators and the human evaluators.

A podcast by BlueDot Impact.

Learn more on the AI Safety Fundamentals website.

