LessWrong (30+ Karma)

“Text Compression Can Help Secure Model Weights” by Roy Rinberg



Associated Paper: link[1]

Paper Co-authors: Roy Rinberg, Annabelle Michael Carrell, Simon Henniger, Nicholas Carlini, Keri Warr

Two years ago, Ryan Greenblatt wrote a post titled "Preventing model exfiltration with upload limits". The idea of the original post is simple: if there's a single wire connecting to a GPU that allows egress of 100 GB per day, then even if an adversary has full control of the GPU, there's nothing they can do to exfiltrate more than 100 GB in a single day.

The main challenge is that you can't set your egress limit lower than the amount of data you need to send to your customers. To contextualize this with some rough approximations: my estimate is that OpenAI produces about 1 TB of text per day, and a frontier model is approximately 1 TB[2]. So egress limiting alone buys us about 1 day before an adversary could steal the weights.
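The back-of-the-envelope arithmetic above can be sketched in a few lines. This is only an illustration using the rough figures from the text (about 1 TB of text per day, about 1 TB of weights); the function name and the `compression_ratio` parameter are hypothetical and not taken from the paper. The ratio stands for how much smaller legitimate traffic could be made, which is what would let the egress cap be lowered by the same factor.

```python
def days_to_exfiltrate(model_size_tb: float,
                       egress_tb_per_day: float,
                       compression_ratio: float = 1.0) -> float:
    """Lower bound on the days needed to push the weights through a capped link.

    compression_ratio is a hypothetical knob: if legitimate output can be
    compressed by this factor, the egress cap can be lowered by the same factor.
    """
    if model_size_tb <= 0 or egress_tb_per_day <= 0:
        raise ValueError("sizes and rates must be positive")
    if compression_ratio < 1:
        raise ValueError("compression_ratio must be at least 1")
    # The cap only needs to cover the compressed customer traffic.
    effective_cap_tb_per_day = egress_tb_per_day / compression_ratio
    return model_size_tb / effective_cap_tb_per_day


# Rough figures from the text: 1 TB of weights, 1 TB/day of customer traffic.
baseline_days = days_to_exfiltrate(1.0, 1.0)

# Purely illustrative: a hypothetical 10x compression of the same traffic.
compressed_days = days_to_exfiltrate(1.0, 1.0, compression_ratio=10.0)
```

Under these assumptions the baseline is one day, and every factor of compression on the outgoing text multiplies the attacker's minimum time by the same factor.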

With a good monitoring and incident-response system in place, an attacker is incentivized to exfiltrate as much as possible while staying under the detection threshold. Any exfiltration attempt is forced to trade off stealthiness against exfiltration volume: the more data you try to steal, the more likely you are to get [...]

---

Outline:

(05:07) Lossless Compression with LLMs

(07:57) Lossy Compression

(09:41) Question-Asking Compression (QA)

(14:17) Future Directions

(14:29) Potential Technical Improvements to be Made to QA compression

(16:13) A possible future: Recursively Asking the Wisest Monk

(18:34) Philosophizing

(20:57) Ending Note

(21:18) Acknowledgements

(21:36) Appendix: Analysis and Examples of QA Compression Transcripts

The original text contained 6 footnotes which were omitted from this narration.

---

First published:

March 4th, 2026

Source:

https://www.lesswrong.com/posts/GcbkprYPCjXdysLq4/text-compression-can-help-secure-model-weights

---

Narrated by TYPE III AUDIO.


