Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Large Language Models will be Great for Censorship, published by Ethan Edwards on August 22, 2023 on LessWrong.
Produced as part of the SERI ML Alignment Theory Scholars Program - Summer 2023 Cohort
Thanks to ev_ and Kei for suggestions on this post.
LLMs can do many incredible things. They can generate unique creative content, carry on long conversations in any number of subjects, complete complex cognitive tasks, and write nearly any argument. More mundanely, they are now the state of the art for boring classification tasks and therefore have the capability to radically upgrade the censorship capacities of authoritarian regimes throughout the world.
How Censorship Worked
In totalitarian government states with wide censorship - Tsarist Russia, Eastern Bloc Communist states, the People's Republic of China, Apartheid South Africa, etc - all public materials are ideally read and reviewed by government workers to ensure they contain nothing that might be offensive to the regime. This task is famously extremely boring and the censors would frequently miss obviously subversive material because they did not bother to go through everything. Marx's Capital was thought to be uninteresting economics so made it into Russia legally in the 1890s.
The old style of censorship could not possibly scale, and the real way that censors exert control is through deterrence and fear rather than actual control of communication. Nobody knows the strict boundary line over which they cannot cross, and therefore they stay well away from it. It might be acceptable to lightly criticize one small part of the government that is currently in disfavor, but why risk your entire future on a complaint that likely goes nowhere? In some regimes such as the PRC under Mao, chaotic internal processes led to constant reversals of acceptable expression and by the end of the Cultural Revolution most had learned that simply being quiet was the safest path. Censorship prevents organized resistance in the public and ideally for the regime this would lead to tacit acceptance of the powers that be, but a silently resentful population is not safe or secure.
When revolution finally comes, the whole population might turn on their rulers with all of their suppressed rage released at once. Everyone knows that everyone knows that everyone hates the government, even if they can only acknowledge this in private trusted channels.
Because proper universal and total surveillance has always been impractical, regimes have instead focused on more targeted interventions to prevent potential subversion. Secret polices rely on targeted informant networks, not on workers who can listen to every minute of every recorded conversation. This had a horrible and chilling effect and destroyed many lives, but also was not as effective as it could have been. Major resistance leaders were still able to emerge in totalitarian states, and once the government showed signs of true weakness there were semi-organized dissidents ready to seize the moment.
Digital Communication and the Elusiveness of Total Censorship
Traditional censorship mostly dealt with a relatively small number of published works: newspapers, books, films, radio, television. This was somewhat manageable just using human labor. However in the past two decades, the amount of communication and material that is potentially public has been transformed with the internet.
It is much harder to know how governments are handling new data because the information we have mostly comes from the victims of surveillance who are kept in the same deterrent fear as the past. If victims imagine the state is more capable than it is, that means the state is succeeding, and it is harder to assess the true capabilities. We don't have reliable accounts from insiders or archival access since no major regi...