Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: 4. Risks from causing illegitimate value change (performative predictors), published by Nora Ammann on October 26, 2023 on The AI Alignment Forum.
Unaligned AI systems may cause illegitimate value change. At the heart of this risk lies the observation that the malleability inherent to human values can be exploited in ways that make the resulting value change illegitimate. Recall that I take illegitimacy to follow from a lack of or (significant) impediment to a person's ability to self-determine and course-correct a value-change process.
Mechanisms causing illegitimate value change
Instantiations of this risk can already be observed today, such as in the case of recommender systems. It is worth spending a bit of time understanding this example before considering what lessons it can teach us about risks from advanced AI systems more generally. To this effect, I will draw on work by Hardt et al. (2022), which introduces the notion of 'performative power'. Performative power is a quantitative measure of 'the ability of a firm operating an algorithmic system, such as a digital content recommendation platform, to cause change in a population of participants' (p. 1). The higher the performative power of a firm, the higher its ability to 'benefit from steering the population towards more profitable [for the firm] behaviour' (p. 1). In other words, performative power allows us to measure the ability of the firm running the recommender systems to cause exogenously induced value change [1] in the customer population. The measure was specifically developed to advance the study of competition in digital economies, and in particular, to identify anti-competitive dynamics.
What is happening here? To better understand this, we can help ourselves to the distinction between 'ex-ante optimisation' and 'ex-post optimisation', introduced by Perdomo et al. (2020). The former, ex-ante optimisation, is the type of predictive optimisation that occurs under conditions of low performative power, where a predictor (a firm in this case) cannot do better than what standard statistical learning allows it to extract from past data about future data. Ex-post optimisation, on the other hand, involves steering the predicted behaviour so as to improve the predictor's predictive performance. In other words, in the first case, the to-be-predicted data is fixed and independent of the activity of the predictor, while in the second case, the to-be-predicted data is influenced by the prediction process. As Hardt et al. (2022) remark: '[Ex-post optimisation] corresponds to implicitly or explicitly optimising over the counterfactuals' (p. 7). In other words, an actor with high performative power does not only predict the most likely outcome; functionally speaking, it can perform as if it could choose which future scenarios to bring about, and then predict those (thereby being able to achieve higher levels of predictive accuracy).
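The contrast between the two modes of optimisation can be made concrete with a toy model. The sketch below is an illustrative assumption and is not taken from Hardt et al. or from the paper introducing the distinction: it supposes that a single outcome has mean `mu` when no prediction is deployed, that deploying a prediction `theta` shifts this mean to `mu + eps * theta`, and that the predictor is scored by squared error. The parameter `eps` is a hypothetical stand-in for the strength of the predictor's influence, and all function names are invented for this example.

```python
def performative_risk(theta: float, mu: float, eps: float, sigma: float) -> float:
    """Expected squared error of the prediction `theta` once it is deployed.

    Toy assumption: deploying `theta` shifts the outcome mean from `mu`
    to `mu + eps * theta`; `sigma` is the outcome's standard deviation.
    """
    shifted_mean = mu + eps * theta
    return (shifted_mean - theta) ** 2 + sigma ** 2


def ex_ante_prediction(mu: float) -> float:
    """Best prediction if the data are treated as fixed: the past mean."""
    return mu


def ex_post_prediction(mu: float, eps: float) -> float:
    """Prediction that accounts for its own influence on the outcome.

    Minimises `performative_risk` over theta, which requires
    theta == mu + eps * theta, i.e. theta == mu / (1 - eps). Needs eps < 1.
    """
    if eps >= 1.0:
        raise ValueError("this toy model requires eps < 1")
    return mu / (1.0 - eps)


if __name__ == "__main__":
    mu, eps, sigma = 2.0, 0.5, 1.0
    for name, theta in [
        ("ex-ante", ex_ante_prediction(mu)),
        ("ex-post", ex_post_prediction(mu, eps)),
    ]:
        print(name, theta, performative_risk(theta, mu, eps, sigma))
```

With `eps = 0` the two predictions coincide, which corresponds to the low-performative-power case in which the data are independent of the predictor. With `eps > 0` the ex-post predictor obtains a lower error only because its prediction moves the outcome towards itself, which is the sense in which it is 'optimising over the counterfactuals'.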
According to our earlier discussion of the nature of (il)legitimate value change, cases where performative power drives value change in a population constitute an example of illegitimate change. The people undergoing the change were in no meaningful way actively involved in the change that the performative predictor effected upon the population. Their ability to 'course-correct' was also actively reduced, among other means, through choice design (i.e., affecting the order of recommendations a consumer is exposed to) or through the exploitation of psychological features that make some types of content locally more compelling than others, irrespective of that content's relationship to the individuals' values or proleptic reasons.
What is more, the change that the population undergoes is shaped in such a way that it tends towards making the val...