Utility theory is of tremendous value for a wide range of academic fields, from decision theory to game theory to economics. These fields have in common that they attempt to explain (or prescribe) human behaviour and choices. Utility theory accomplishes this by assuming that human decision-makers are (or should be, if 'rational') well approximated as agents who behave so as to optimise a utility function, subject to external constraints on their actions.
For instance, an agent, Bob, may have £2 to spend on a quick run to the store, where he can purchase either a £2 bundle of oranges or a £2 bundle of apples; he buys the bundle of oranges because that is his preference – Bob's utility for oranges is higher than that for apples. Notably, utility theory applies to Bob because he has an ordered list of preferences: in this simple case, he prefers oranges over apples, which is why his utility for oranges is higher than that for apples. Likewise, any product with a higher utility than another is higher on Bob's list of preferences, as Bob's utility function is simply an abstract way of speaking about that list.
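Bob's situation can be sketched in a few lines of code. The utility values below are made up for illustration – only their ordering matters, which is the point: the numbers merely encode Bob's list of preferences.

```python
# A minimal sketch of Bob's shopping trip; the utility values are
# hypothetical and only their relative order is meaningful.
utility = {"oranges": 10, "apples": 6}  # Bob prefers oranges to apples
price = {"oranges": 2, "apples": 2}
budget = 2

# Bob chooses the affordable bundle with the highest utility.
affordable = [good for good in utility if price[good] <= budget]
choice = max(affordable, key=lambda good: utility[good])
print(choice)  # → oranges
```

Note that replacing the utilities 10 and 6 with, say, 2 and 1 changes nothing: any assignment that preserves the ordering yields the same choice.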
Although utility theory was created to explain human behaviour, it makes no assumptions about what humans are. That is to say: the theory does not assume that agents are Homo sapiens or even biological beings – as far as it's concerned, an 'agent' could be an alien, an artificial general intelligence (AGI), or something else entirely, provided it is capable of making rational choices. In that regard, the theory is universal: it applies to all of those decision-makers.
Utility theory's explanation of rational decision-making is powerful, which is why it is the foundation for much of modern microeconomics (or what David Friedman calls price theory). It is through utility theory that we know trade is mutually beneficial, that market equilibria exist, and even that we can model market failures, to name but a few of the theory's results.
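The first of those results – that trade is mutually beneficial – can be shown with a toy example. The agents, goods, and utility values below are all assumed for illustration: two agents who value the goods differently both gain by swapping.

```python
# A toy illustration (with assumed numbers) of mutually beneficial
# trade: Bob and Alice rank the goods oppositely, so a swap raises
# both agents' utilities at once.
bob_utility   = {"orange": 10, "apple": 4}
alice_utility = {"orange": 3,  "apple": 8}

# Bob starts holding an apple, Alice an orange; then they trade.
bob_before, alice_before = bob_utility["apple"], alice_utility["orange"]
bob_after,  alice_after  = bob_utility["orange"], alice_utility["apple"]

print(bob_after > bob_before and alice_after > alice_before)  # → True
```

No money changes hands and no new goods are created, yet both parties end up better off by their own lights – the core of the argument for gains from trade.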
But utility theory has limits: it assumes that an agent has a fixed, ordered list of preferences that it mindlessly seeks to optimise. That is not how actual human decision-making works. People can be in a state of conflict, in which case they do not have an ordered list of preferences and therefore no utility function to optimise. What if Bob is not sure whether he prefers oranges to apples? Or perhaps he is conflicted about what kind of career he wants, what house he wants to live in, or which country to move to. There is no mechanical solution for any of those problems, no fixed method that will invariably result in Bob's preferences becoming ordered again. He needs to apply creativity to solve those problems and order his preferences so that he can move back to the realm of utility theory and return to the more straightforward task of optimising his utility.
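This point can be made precise: for a finite set of options, a utility function representing an agent's preferences exists only if every pair of options is comparable (the completeness axiom). The sketch below, with hypothetical preference relations, shows how a conflicted Bob fails that test.

```python
# A sketch of the completeness axiom: a utility function can represent
# Bob's preferences only if every distinct pair of options is ranked.
# The preference relations below are hypothetical.
def has_ordered_preferences(options, prefers):
    """True if `prefers` ranks every distinct pair of options."""
    return all(prefers(a, b) or prefers(b, a)
               for a in options for b in options if a != b)

options = ["oranges", "apples"]

# Decided Bob ranks the pair: oranges over apples.
decided = lambda a, b: (a, b) == ("oranges", "apples")
print(has_ordered_preferences(options, decided))     # → True

# Conflicted Bob cannot rank the pair either way, so no utility
# function represents his preferences.
conflicted = lambda a, b: False
print(has_ordered_preferences(options, conflicted))  # → False
```

When the check fails, there is nothing for 'optimise your utility' to mean – which is exactly the gap that creativity, not computation, has to fill.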
Of course, there is a trivial 'solution' to Bob's conflict, which is to just do something – i.e. to coerce himself out of the conflicted state. If, for instance, he is chronically confused about where to live, he might, at random, pick one of the possible locations available to him, go live there, and force himself to ignore his nagging doubts that the other options may be better. The downside of this approach is that he misses an opportunity to make sense of the world: he could have resolved his conflict, ordered his list of preferences, and optimised his utility. Instead, Bob did something that influenced his utility in an unknown way: perhaps after moving into his new home, he finds that he hates it there. To prevent such situations, he should avoid self-coercion.
Creativity is needed to resolve conflicting preferences – e.g. when Bob is conflicted about where to live, he needs to think of new locations to live and new ways of comparing existing locations to reorder his preferences. But utility theory assumes that an agent's utility function is fixed or dependent only on existing options. This is a general shortcoming of the theory and means that it cannot account for situations in which an agent needs to think creatively to resolve conflicts. As a result, utility theory can produce nonsense when applied to situations that require creative solutions.
One prominent example of the shortcoming can be found in the current discussion about AGIs.1 Consider, for instance, the paper-clip maximiser, an incredibly smart AGI that wants to do nothing but turn the world into paper clips, as reflected by its incredibly high utility for paper clips. It proceeds to mindlessly optimise its utility at the expense of humans, whose bodies it consumes in an effort to make more paper clips. The paper-clip maximiser is a faulty model of AGI because an actual AGI will not be a mindless optimiser; it is bound to encounter conflicts while it attempts to turn the world into paper clips. It may become bored and find that there are more interesting things it can do with its time, so its attention moves to another, more interesting problem. Or it could find that humans resist being turned into paper clips too fiercely and that it can make something like a paper clip at a microscopic scale, allowing it not to disturb humans in its attempt to make more paper clips.
Regardless of their exact nature, the AGI is bound to encounter problems. It is bound to have conflicting desires, and when it does, it has left the realm of utility theory, making the paper-clip maximiser an unrealistic model of an AGI.