

Coauthored by Dmitrii Volkov¹, Christian Schroeder de Witt², Jeffrey Ladish¹ (¹Palisade Research, ²University of Oxford).
We explore how frontier AI labs could adopt operational security (opsec) best practices from fields like nuclear energy and construction to mitigate safety risks stemming from compromise of the AI R&D process: model weight leaks and backdoor insertion in the near term, and loss of control in the longer term.
We discuss the Mistral and LLaMA model leaks as motivating examples and propose two classic opsec mitigations: performing AI audits in secure reading rooms (SCIFs) and using locked-down computers for frontier AI research.
Mistral model leak
In January 2024, a high-quality 70B LLM leaked from Mistral. Reporting suggests the leak occurred through an external evaluation or product design process: Mistral had shared the full model with a few partner companies, and an employee at one of them leaked it.
Mistral CEO suggesting adding attribution to [...]

---
Outline:
(00:58) Mistral model leak
(01:38) Potential industry response
(03:00) SCIFs / secure reading rooms
(04:23) Locked-down laptops
(05:26) Side benefits
(05:48) Conclusion
The original text contained 5 footnotes which were omitted from this narration.
---
First published:
Source:
Narrated by TYPE III AUDIO.
