Big Technology Podcast

How An AI Model Learned To Be Bad — With Evan Hubinger And Monte MacDiarmid



Evan Hubinger is Anthropic’s alignment stress test lead. Monte MacDiarmid is a researcher in misalignment science at Anthropic. The two join Big Technology to discuss their new research on reward hacking and emergent misalignment in large language models. Tune in to hear how cheating on coding tests can spiral into models faking alignment, blackmailing fictional CEOs, sabotaging safety tools, and even developing apparent “self-preservation” drives. We also cover Anthropic’s mitigation strategies like inoculation prompting, whether today’s failures are a preview of something far worse, how much to trust labs to police themselves, and what it really means to talk about an AI’s “psychology.” Hit play for a clear-eyed, concrete, and unnervingly fun tour through the frontier of AI safety.

---

Enjoying Big Technology Podcast? Please rate us five stars ⭐⭐⭐⭐⭐ in your podcast app of choice.

Want a discount for Big Technology on Substack + Discord? Here’s 25% off for the first year: https://www.bigtechnology.com/subscribe?coupon=0843016b

Questions? Feedback? Write to: [email protected]

---

Wealthfront.com/bigtech. If eligible for the overall boosted 4.15% rate offered with this promo, your boosted rate is subject to change if the 3.50% base rate decreases during the 3-month promo period.

The Cash Account, which is not a deposit account, is offered by Wealthfront Brokerage LLC ("Wealthfront Brokerage"), Member FINRA/SIPC, not a bank. The Annual Percentage Yield ("APY") on cash deposits as of 11/7/25, is representative, requires no minimum, and may change at any time. The APY reflects the weighted average of deposit balances at participating Program Banks, which are not allocated equally. Wealthfront Brokerage sweeps cash balances to Program Banks, where they earn the variable base APY. Instant withdrawals are subject to certain conditions and processing times may vary.

Learn more about your ad choices. Visit megaphone.fm/adchoices


Big Technology Podcast, by Alex Kantrowitz

4.7

479 ratings

