Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Planning to build a cryptographic box with perfect secrecy, published by Lysandre Terrisse on January 1, 2024 on LessWrong.
Summary
In September 2023, I started learning a lot of math and programming skills in order to develop the safest cryptographic box in the world (and yes, I am aiming high). Over these four months, I learned some important things you may want to know:
Fully Homomorphic Encryption (FHE) schemes with perfect secrecy do exist.
These FHE schemes do not need any computational assumption.
These FHE schemes are tractable (in the worst case, encrypting a program before running it makes it three times slower).
We can therefore run infinitely dangerous programs without obtaining any information about them or their outputs. This may be useful in order to run a superintelligence without destroying the world.
However, these schemes work only on quantum computers.
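To build intuition for how perfect secrecy and homomorphic evaluation can coexist, here is a classical toy sketch (my own illustration, not the quantum scheme the post is about): a one-time pad over integers mod n is perfectly secret, and it happens to be homomorphic for one operation, addition. It is additively homomorphic only, so it is nowhere near fully homomorphic, but it shows that "information-theoretic secrecy" and "computing on ciphertexts" are not contradictory.

```python
import secrets

def keygen(n):
    # Uniformly random one-time-pad key mod n (use each key only once)
    return secrets.randbelow(n)

def encrypt(m, k, n):
    # Shift cipher with a uniform key: perfectly secret for a single message
    return (m + k) % n

def decrypt(c, k, n):
    return (c - k) % n

def add_ciphertexts(c1, c2, n):
    # Homomorphic property: Enc(m1, k1) + Enc(m2, k2) = Enc(m1 + m2, k1 + k2)
    return (c1 + c2) % n

n = 1000
m1, m2 = 42, 58
k1, k2 = keygen(n), keygen(n)
c1, c2 = encrypt(m1, k1, n), encrypt(m2, k2, n)

c_sum = add_ciphertexts(c1, c2, n)
print(decrypt(c_sum, (k1 + k2) % n, n))  # 100, i.e. m1 + m2
```

The hard part, which this toy does not address, is supporting a universal set of operations (e.g. both addition and multiplication) without giving up secrecy; that is what requires either computational assumptions on classical computers or, per the scheme discussed here, quantum computation.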
In this post, I will first talk about how I learned about this FHE scheme, then explain my plan for making this cryptographic box, and finally mention some ethical concerns about it.
Before reading this post, I recommend you read this post by Paul Christiano, along with the comments under it. They are very informative, and they sharpened my views for this project. Paul Christiano presents a way to extract a friendly AI from an unfriendly one. Since this is only one example of what can be done with a cryptographic box, I will mostly consider cryptographic boxes as a solution to a problem that I call the malign computation problem.
Introduction
In August 2022, I started reading AGI Safety Literature Review. At one point, the authors write:
One way to box an AGI is to homomorphically encrypt it. Trask (2017) shows how to train homomorphically encrypted neural networks. By homomorphically encrypting an AGI, its predictions and actions also come out encrypted. A human operator with the secret key can choose to decrypt them only when he wants to.
When I read this for the first time, I told myself that I should check this work, because it seemed important.
And then I completely forgot about it.
Then, in April 2023, during a PHP lesson, I realized that the problem of processing a request made by a malevolent user is similar to the problem of boxing a superintelligence. After the lesson, I asked the teacher how to prevent code injections, and he gave me two answers:
Do not show your code to the public. This answer didn't convince me, because attackers routinely find ways around this precaution.
Encrypt the request before processing it. This is the moment I remembered the quote from AGI Safety Literature Review.
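The teacher's second answer points at a more general principle: untrusted input should be treated as data, never as code. A standard illustration of this (my example, not from the lesson) is a parameterized SQL query, here sketched with Python's built-in sqlite3 module:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin')")

# A classic injection payload supplied by a malevolent user
malicious = "alice' OR '1'='1"

# Vulnerable pattern (do NOT do this): string interpolation lets the
# input rewrite the query itself, matching every row:
#   conn.execute(f"SELECT role FROM users WHERE name = '{malicious}'")

# Parameterized query: the driver binds the input as a plain value,
# so the quotes and OR clause are just characters in a name
rows = conn.execute(
    "SELECT role FROM users WHERE name = ?", (malicious,)
).fetchall()
print(rows)  # [] — the payload matches no user
```

Boxing a superintelligence is a far harder version of the same problem: instead of constraining what a string can do to a query, you must constrain what an arbitrary computation can do to the world.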
After looking back at every note I had made about AI safety, I managed to track down Trask's work.
Trask's work
Trask's post shows how to build an encrypted AI using the Efficient Integer Vector Homomorphic Encryption. However, since this scheme (along with every other FHE scheme I know about on classical computers) relies on computational assumptions, we have some problems:
The scheme may not be safe. A computational assumption is a statement of the form "there is no efficient way to solve this problem". However, we do not know how to prove any such statement, as doing so would resolve the P vs NP problem. Most FHE schemes (including this one) rely on the Learning With Errors (LWE) problem. Although LWE seems secure for now, I won't bet the existence of all life on Earth on it. Similarly, I won't bet the safety of a superintelligence on it.
This scheme is too slow. In practice, the first superintelligence will probably have more than a hundred billion weights and biases, making this scheme very expensive or even unusable.
This scheme isn't fully homomorphic. Basically, a cryptographic scheme is said to be homomorphic when we can run s...