
I spent the last few months trying to tackle the problem of adversarial attacks in computer vision from the ground up. The results of this effort are written up in our new paper, Ensemble everything everywhere: Multi-scale aggregation for adversarial robustness (explainer on X/Twitter). Taking inspiration from biology, we reached or exceeded state-of-the-art robustness at 100x–1000x less compute, got human-understandable interpretability for free, turned classifiers into generators, and designed transferable adversarial attacks on closed-source (v)LLMs such as GPT-4 or Claude 3. I strongly believe there is a compelling case for devoting serious attention to solving the problem of adversarial robustness in computer vision, and here I try to draw an analogy to the alignment of general AI systems.
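To give a concrete picture of the multi-scale idea the title refers to, here is a minimal sketch in PyTorch. It only illustrates the intuition of aggregating class predictions computed at several input resolutions; the function name, the scale list, and the simple logit averaging are illustrative assumptions, not the paper's actual implementation (which uses a richer aggregation over channels and intermediate layers).

```python
import torch
import torch.nn.functional as F

def multiscale_predict(model, image, scales=(32, 64, 128, 224)):
    """Illustrative sketch (not the paper's method): ensemble class
    logits over several input resolutions. Assumes `model` uses
    adaptive pooling and so accepts variable input sizes."""
    logits = []
    for s in scales:
        # Resize the batch of images to resolution s x s and classify it.
        x = F.interpolate(image, size=(s, s), mode="bilinear",
                          align_corners=False)
        logits.append(model(x))
    # Average the per-scale logits into a single aggregated prediction.
    return torch.stack(logits, dim=0).mean(dim=0)
```

The intuition is that a perturbation crafted against one resolution tends not to survive resampling to the others, so the aggregated prediction is harder to attack than any single-scale prediction.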
1. Introduction
In this post, I argue that the problem of adversarial attacks in computer vision is in many ways analogous to the larger task [...]
---
Outline:
(00:58) 1. Introduction
(02:20) 2. Communicating implicit human functions to machines
(05:12) 3. Extremely rare yet omnipresent failure modes
(08:14) 4. Brute force enumerative safety is not sufficient
(10:50) 5. Conclusion
---
First published:
Source:
Narrated by TYPE III AUDIO.