Unsupervised Learning

Using the Smartest AI to Rate Other AI


Listen Later

In this episode, I walk through a Fabric Pattern that assesses how well a given model does on a task relative to humans. This system uses your smartest AI model to evaluate the performance of other AIs—by scoring them across a range of tasks and comparing them to human intelligence levels.

I talk about:

1. Using One AI to Evaluate Another
The core idea is simple: use your most capable model (like Claude 3 Opus or GPT-4) to judge the outputs of another model (like GPT-3.5 or Haiku) against a task and input. This gives you a way to benchmark quality without manual review.

2. A Human-Centric Grading System
Models are scored on a human scale—from “uneducated” and “high school” up to “PhD” and “world-class human.” Stronger models consistently rate higher, while weaker ones rank lower—just as expected.

3. Custom Prompts That Push for Deeper Evaluation
The rating prompt includes instructions to emulate a 16,000+ dimensional scoring system, using expert-level heuristics and attention to nuance. The system also asks the evaluator to describe what would have been required to score higher, making this a meta-feedback loop for improving future performance.

Note: This episode was recorded a few months ago, so the AI models mentioned may not be the latest—but the framework and methodology still work perfectly with current models.

Subscribe to the newsletter at:
https://danielmiessler.com/subscribe

Join the UL community at:
https://danielmiessler.com/upgrade

Follow on X:
https://x.com/danielmiessler

Follow on LinkedIn:
https://www.linkedin.com/in/danielmiessler

See you in the next one!

Become a Member: https://danielmiessler.com/upgrade

See omnystudio.com/listener for privacy information.

...more
View all episodesView all episodes
Download on the App Store

Unsupervised LearningBy Daniel Miessler

  • 4.6
  • 4.6
  • 4.6
  • 4.6
  • 4.6

4.6

134 ratings


More shows like Unsupervised Learning

View all
Security Now (Audio) by TWiT

Security Now (Audio)

1,961 Listeners

Risky Business by Patrick Gray

Risky Business

363 Listeners

SANS Internet Stormcenter Daily Cyber Security Podcast (Stormcast) by Johannes B. Ullrich

SANS Internet Stormcenter Daily Cyber Security Podcast (Stormcast)

634 Listeners

Defensive Security Podcast - Malware, Hacking, Cyber Security & Infosec by Jerry Bell and Andrew Kalat

Defensive Security Podcast - Malware, Hacking, Cyber Security & Infosec

369 Listeners

CyberWire Daily by N2K Networks

CyberWire Daily

1,006 Listeners

Smashing Security by Graham Cluley & Carole Theriault

Smashing Security

313 Listeners

Click Here by Recorded Future News

Click Here

387 Listeners

Darknet Diaries by Jack Rhysider

Darknet Diaries

7,836 Listeners

Cybersecurity Today by Jim Love

Cybersecurity Today

142 Listeners

CISO Series Podcast by David Spark, Mike Johnson, and Andy Ellis

CISO Series Podcast

182 Listeners

Hacking Humans by N2K Networks

Hacking Humans

310 Listeners

Defense in Depth by David Spark, Steve Zalewski, Geoff Belknap

Defense in Depth

72 Listeners

Cyber Security Headlines by CISO Series

Cyber Security Headlines

120 Listeners

Risky Bulletin by risky.biz

Risky Bulletin

33 Listeners

Hacker And The Fed by Chris Tarbell & Hector Monsegur

Hacker And The Fed

159 Listeners