
Sign up to save your podcasts
Or


Associated Paper: link[1]
Paper Co-authors: Roy Rinberg, Annabelle Michael Carrell, Simon Henniger, Nicholas Carlini, Keri Warr
Two years ago, Ryan Greenblatt wrote a post titled "Preventing model exfiltration with upload limits". The idea of the original post is simple: if there's a single wire connecting to a GPU that allows egress of 100 GB per day, then even if an adversary has full control of the GPU, there's nothing they can do to exfiltrate more than 100 GB in a single day.
The main challenge is that you can't set your egress limit lower than the amount of data you need to send to your customers. To contextualize this with some rough approximations: my estimate is that OpenAI produces about 1 TB of text per day, and a frontier model is approximately 1 TB[2]. So egress limiting alone buys us about 1 day before an adversary could steal the weights.
With a good monitoring and incident-response system in place, an attacker is incentivized to exfiltrate as much as possible while staying under the detection threshold. Any exfiltration attempt is forced to tradeoff stealthiness against exfiltration-volume: the more data you try to steal, the more likely you are to get [...]
---
Outline:
(05:07) Lossless Compression with LLMs
(07:57) Lossy Compression
(09:41) Question-Asking Compression (QA)
(14:17) Future Directions
(14:29) Potential Technical Improvements to be Made to QA compression
(16:13) A possible future: Recursively Asking the Wisest Monk
(18:34) Philosophizing
(20:57) Ending Note
(21:18) Acknowledgements
(21:36) Appendix: Analysis and Examples of QA Compression Transcripts
The original text contained 6 footnotes which were omitted from this narration.
---
First published:
Source:
---
Narrated by TYPE III AUDIO.
---
Images from the article:
Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
By LessWrongAssociated Paper: link[1]
Paper Co-authors: Roy Rinberg, Annabelle Michael Carrell, Simon Henniger, Nicholas Carlini, Keri Warr
Two years ago, Ryan Greenblatt wrote a post titled "Preventing model exfiltration with upload limits". The idea of the original post is simple: if there's a single wire connecting to a GPU that allows egress of 100 GB per day, then even if an adversary has full control of the GPU, there's nothing they can do to exfiltrate more than 100 GB in a single day.
The main challenge is that you can't set your egress limit lower than the amount of data you need to send to your customers. To contextualize this with some rough approximations: my estimate is that OpenAI produces about 1 TB of text per day, and a frontier model is approximately 1 TB[2]. So egress limiting alone buys us about 1 day before an adversary could steal the weights.
With a good monitoring and incident-response system in place, an attacker is incentivized to exfiltrate as much as possible while staying under the detection threshold. Any exfiltration attempt is forced to tradeoff stealthiness against exfiltration-volume: the more data you try to steal, the more likely you are to get [...]
---
Outline:
(05:07) Lossless Compression with LLMs
(07:57) Lossy Compression
(09:41) Question-Asking Compression (QA)
(14:17) Future Directions
(14:29) Potential Technical Improvements to be Made to QA compression
(16:13) A possible future: Recursively Asking the Wisest Monk
(18:34) Philosophizing
(20:57) Ending Note
(21:18) Acknowledgements
(21:36) Appendix: Analysis and Examples of QA Compression Transcripts
The original text contained 6 footnotes which were omitted from this narration.
---
First published:
Source:
---
Narrated by TYPE III AUDIO.
---
Images from the article:
Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.

112,326 Listeners

130 Listeners

7,242 Listeners

559 Listeners

16,321 Listeners

4 Listeners

14 Listeners

2 Listeners