March 05, 2026

Humanity's Last Exam: The Test AI Keeps Failing

40 minutes

2,500 questions no AI can Google. GPT-4o scored 2.7%, humans hit 90%. Inside the hardest AI benchmark and its 30% error rate.

By Sigmatic
