LessWrong (30+ Karma)

“Sanity-checking “Incompressible Knowledge Probes”” by Sturb, LawrenceC


Listen Later

Or, did a chief scientist of an AI assistant startup conclusively show that GPT-5.5 has 9.7 trillion parameters?

Introduction

Recently, a paper was circulated on Twitter claiming to have reverse engineered the parameter count of many frontier closed-source models including the newer GPT-5.5 (9.7 trillion parameters) and Claude Opus 4.7 (4.0 trillion parameters) as well as older models such as o1 (3.5T) and gpt-4o (720B). The paper, titled “Incompressible Knowledge Probes: Estimating Black-Box LLM Parameter Counts via Factual Capacity”, introduces a dataset of factual knowledge of different difficulties, regresses performance on this dataset against parameter count, and then uses this regression to extrapolate from the performance of closed-sourced frontier models to their parameter count. A notable fact about this paper is that, unlike most empirical machine learning papers, it's single-authored: Bojie Li, the chief scientist of Pine AI, is the sole author of this piece.

These results were suspicious for many reasons, the primary being that it seems like low-effort, hastily-written AI slop. For example, the codebase (https://github.com/19PINE-AI/ikp) was constructed in large part with Claude Code and has many of the flags for code that is almost entirely vibe-coded with little sanity checking (e.g. redundant and inconsistent variable definitions[1] [...]

---

Outline:

(00:19) Introduction

(04:19) Summary of Lis Incompressible Knowledge Probes

(08:04) The IKP dataset.

(11:59) IKP scoring and Regression Methodology

(16:54) Methodological Issues with the IKP paper

(17:24) Per-tier floors to the scoring

(19:27) Ambiguous/incorrect answers to hard questions

(23:21) Corrected model parameter estimates

(24:17) Possible methodological issues that mattered less than we thought

(24:22) Thinking vs non-thinking

(25:59) Different accuracy metrics used in some repository json files

(26:31) Conclusion

(29:46) Discussion

The original text contained 17 footnotes which were omitted from this narration.

---

First published:

May 1st, 2026

Source:

https://www.lesswrong.com/posts/veFMEzDDyWaer2Sms/sanity-checking-incompressible-knowledge-probes

---

Narrated by TYPE III AUDIO.

---

Images from the article:

Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.

...more
View all episodesView all episodes
Download on the App Store

LessWrong (30+ Karma)By LessWrong


More shows like LessWrong (30+ Karma)

View all
The Daily by The New York Times

The Daily

112,330 Listeners

Astral Codex Ten Podcast by Jeremiah

Astral Codex Ten Podcast

130 Listeners

Interesting Times with Ross Douthat by New York Times Opinion

Interesting Times with Ross Douthat

7,247 Listeners

Dwarkesh Podcast by Dwarkesh Patel

Dwarkesh Podcast

563 Listeners

The Ezra Klein Show by New York Times Opinion

The Ezra Klein Show

16,328 Listeners

AI Article Readings by Readings of great articles in AI voices

AI Article Readings

4 Listeners

Doom Debates! by Liron Shapira

Doom Debates!

14 Listeners

LessWrong posts by zvi by zvi

LessWrong posts by zvi

2 Listeners