Russell on AI in technocracy and surveillance
In chapter 4 of his book “Human Compatible”, Stuart Russell discusses various harmful applications of Artificial Intelligence (AI) technology, including the ways AI makes surveillance ever more effective, to the point where the “Stasi will look like amateurs by comparison.” It is worth pointing out that surveillance is not just “observing,” but a method for controlling people’s behavior. Okay, so what’s new? Well, AI technology introduces new ways of controlling people. Because we live such a large part of our lives online nowadays (including our consumption of news and facts, our communication with peers, and so on), “AI systems can track an individual’s online reading habits, preferences, and likely state of knowledge, they can tailor specific messages to maximize impact on that individual while minimizing the risk that the information will be disbelieved.” Forms of coercion can be made more effective by using personalized strategies (Russell mentions, for example, blackmail bots that use your online profile). A more subtle way to alter people’s behavior is to “modify their information environment so that they believe different things and make different decisions” (my emphasis). On top of ever more accurate personalized profiles, AI systems can constantly adjust themselves to become more effective, based on feedback mechanisms such as mouse clicks or time spent reading.
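(Not from the book, but to make that feedback mechanism concrete: here is a minimal sketch, in Python, of the kind of adaptive loop described above. A simple epsilon-greedy selector keeps re-showing whichever message variant earns the most engagement from one profiled individual; all variant names and numbers are invented for illustration.)

```python
import random

# Hypothetical message variants tailored to one profiled individual.
VARIANTS = ["fear-framed", "flattery-framed", "peer-pressure-framed"]

shows = {v: 0 for v in VARIANTS}    # how often each variant was shown
clicks = {v: 0 for v in VARIANTS}   # observed engagement per variant

def pick_variant(epsilon=0.1):
    """Mostly exploit the best-performing variant so far, occasionally explore."""
    if random.random() < epsilon:
        return random.choice(VARIANTS)
    return max(VARIANTS, key=lambda v: clicks[v] / shows[v] if shows[v] else 0.0)

def record_feedback(variant, clicked):
    """Feedback such as a mouse click (or time spent reading) updates the profile."""
    shows[variant] += 1
    if clicked:
        clicks[variant] += 1

# The loop: show a message, observe the reaction, adapt, repeat.
for _ in range(100):
    v = pick_variant()
    record_feedback(v, clicked=random.random() < 0.3)  # stand-in for a real reaction
```

Nothing about this is sophisticated, and that is rather the point: a few lines of bookkeeping are enough to make the messaging drift towards whatever provably works on you.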
After discussing new phenomena such as deep fakes, Russell makes a bit of a leap and contemplates the various ways in which governments (on the less friendly side of the spectrum) may use AI-supported surveillance to directly control their citizens by implementing “rewards and punishments based on behavior”. In a European context this section may sound slightly unrealistic, but it becomes less so if you consider a certain… Asian country that experiments with evaluating citizens through surveillance-based point systems. If you are not worried about governments in particular, you may think of similar, though less severe and less obvious, strategies used by big internet corporations, or of the upcoming era of health-related apps. One example I’m thinking of is insurance companies giving you a lower health insurance premium when you use a smart watch to show you are moving enough each day (certainly important, but a very specific, limited, unilateral concept of health).
In any case, what these examples share is that “such a system treats people as reinforcement learning algorithms, training them to optimize the objective set by the state” (my emphasis, and again: you could perhaps substitute your favorite big evil corporation for “state”). So not only does the technology build up profiles based on behavioral feedback; the humans themselves, Russell suggests, will be evaluated with a scoring function, as if they worked like these algorithms.
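(Again my own illustration rather than Russell’s: in ordinary reinforcement learning we write a reward function for the machine; in the scheme Russell criticizes, an authority writes the reward function for people, and every observed behavior adjusts a citizen’s score. The behaviors and weights below are made up.)

```python
# Hypothetical, state-defined objective: a weight per observable behavior.
OBJECTIVE = {
    "praised_the_government_online": +5,
    "jaywalked": -2,
    "visited_ailing_friend": +3,   # even kindness becomes a line item in the sum
    "read_foreign_news": -4,
}

def update_score(score, observed_behaviors):
    """The surveillance system plays 'environment': it hands out rewards and
    punishments, and the citizen is treated as the learning agent."""
    for behavior in observed_behaviors:
        score += OBJECTIVE.get(behavior, 0)
    return score

citizen_score = update_score(1000, ["visited_ailing_friend", "read_foreign_news"])
print(citizen_score)  # 999
```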
What I appreciate about the upcoming argument is that Russell specifically targets those states/companies with a “top-down, engineering mind-set”, a mindset that seems quite reasonable at first sight and that I suspect is pervasive. It can also be the mindset of techno-optimists with good intentions. But then again, however extreme the surveillance, it is always for “the greater good”, so good intentions only go so far. Technocracy comes into the picture when we ask who exactly decides what is good, and how we measure it.
Russell reconstructs this engineering-style reasoning as follows, with respect to governments:
- “it would be better if everyone behaved well, had a patriotic attitude, and contributed to the progress of the country”
- “technology enables measurement of individual behavior, attitudes, and contributions”
- “therefore, everyone will be better off if we set up a technology-based system of monitoring and control based on rewards and punishments.”
Or you can come up with some alternative version, like: it is healthier not to drink alcohol and better to have less alcohol-related violence; we have studied and understood the effects of alcohol; therefore it is better if we monitor everyone, punish those who drink alcohol, and reward superfood-munching yoga hipsters (is that a thing?).
Then Russell provides three arguments against this top-down engineering technocracy mindset:
“First, it ignores the psychic cost of living under a system of intrusive monitoring and coercion; outward harmony masking inner misery is hardly an ideal state. Every act of kindness ceases to be an act of kindness and becomes instead an act of personal score maximization and is perceived as such by the recipient. Or worse, the very concept of a voluntary act of kindness gradually becomes just a fading memory of something people used to do. Visiting an ailing friend in hospital will, under such a system, have no more moral significance and emotional value than stopping at a red light.”
To stick with Russell’s example: I think the issue here is not that there would no longer be acts of kindness, but I wonder how one would recognize them as such in this (hypothetical?) state. Under this regime, the only way to “prove” to the recipient that something is in fact an act of kindness under the all-seeing eye of a scoring metric would be to show 1) that you understand how your actions are evaluated, and 2) that you then act against the maxim of score maximization. It would not be sufficient to be altruistic and selfless; a kind act would have to be self-destructive (or, one might say, extremely altruistic). Perhaps only then would the recipient see, rationally and without relying on empathy and good faith, that some intrinsic moral and human value is the best explanation for the behavior shown, rather than score optimization. That is, of course, a hypothetical reflection under the assumption of all-pervasive surveillance; otherwise another option would be to communicate covertly.
The paradox of this extreme example is that by trying to optimize desirable human behavior (e.g. visiting that ailing friend) and by quantifying the human values such behavior promotes (e.g. kindness), you lose the quality of what is sought after, similar to the capitalist perversion of friendship or love when they become part of an economy of (monetary) exchange. Friendship or love cannot be part of a contract, because that would imply you can demand some utility from the other and enforce this demand. (I am not denying that friendships and love relationships can have, and in fact do have, utility. I am rather saying that only the cynical and the sociopathic would argue they are about utility and, as a consequence, can be quantified. But hey, perhaps I’m too romantic.)
In other words, the deeper issue that Russell addresses with this extreme example is that the true objective would not be captured by any explicitly stated objective. This is a fundamental problem of what Russell calls the “standard model” of AI, where “intelligent” machines are optimizing a “purpose put in the machine”:
“Second, the scheme falls victim to the same failure mode as the standard model of AI, in that it assumes that the stated objective is in fact the true, underlying objective. Inevitably, Goodhart’s law will take over, whereby individuals optimize the official measure of outward behavior, just as universities have learned to optimize the “objective” measures of “quality” used by university ranking systems instead of improving their real (but unmeasured) quality.”
I was unfamiliar with Goodhart’s law, but Russell’s explanation is exceptionally clear (and the university example is painfully accurate). Marilyn Strathern’s paraphrase is elucidating: “When a measure becomes a target, it ceases to be a good measure.” Russell points out that this problem arises when humans design intelligent machines and algorithms to minimize some loss function (the standard model of AI), and he provides many examples where an AI optimizing towards some objective has adverse effects. The problem is just as bad when we give humans the algorithmic treatment, so to speak. In economics and game theory the issue is people gaming the measure, which may in fact be to the detriment of the very thing you are trying to promote.
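(A toy way to see Goodhart’s law at work, again my own sketch with invented numbers: suppose true quality depends on both measured and unmeasured effort, but the official ranking only sees the measured part. Shifting all effort towards what is counted drives the measure up while the real quality collapses.)

```python
# Toy Goodhart simulation: a fixed effort budget split between measured and
# unmeasured work; all functions and numbers are invented for illustration.

def true_quality(measured_effort, unmeasured_effort):
    # Real quality needs both kinds of effort.
    return measured_effort * unmeasured_effort

def official_measure(measured_effort):
    # The ranking only sees the measurable part.
    return measured_effort

TOTAL_EFFORT = 10.0
for measured in [5.0, 7.0, 9.0, 10.0]:   # gaming: shift effort towards what is counted
    unmeasured = TOTAL_EFFORT - measured
    print(official_measure(measured), true_quality(measured, unmeasured))
# The measure climbs from 5 to 10 while true quality falls from 25 to 0:
# once the measure becomes the target, it stops being a good measure.
```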
“Finally, the imposition of a uniform measure of behavioral virtue misses the point that a successful society may comprise a wide variety of individuals, each contributing in their own way.”
This last point is relatively self-evident and adds to the earlier reasoning: can you capture the quality you are trying to promote by subjecting people to a more or less uniform quantitative measure?
Russell’s book specifically deals with the problem of specifying objectives in AI technology designed to optimize towards a given goal. But you could perhaps generalize his argument to technocracy as such: technocracy firstly assumes there is an (engineering) class of people who know what is best, which is both optimistic and paternalistic; and secondly, even when they do know what is best, it remains a fundamental question whether you can translate that knowledge into procedures that truly achieve the intended result.
N.B. I’m reading this book in EPUB format, so I can’t give page references. All citations can be found in Chapter 4, “Surveillance, Persuasion, and Control”, section “Controlling your behavior”.