Overview of Emergent and Novel Behavior in AI Systems

March 26, 2024

A system has “emergent” behavior if the system suddenly develops a significant new capability or character after a relatively small and gradual change in some of the system’s parts or features.

For instance, cold water, warm water, and hot water all mostly stay put inside their containers, but if you keep raising the water's temperature until it reaches its boiling point, the water abruptly undergoes a phase change into steam, which can and will escape its container. Another example is the deadly “flashover” phenomenon in firefighting: when enough heat builds up in a room, the entire space can burst into flames at once. Similarly, a few carpenter ants are harmless, but a full colony working together can destroy your house.

AI systems can also display emergent behavior. Researchers find that incrementally increasing the number of parameters, the amount of training data, and the computational resources available to an AI system may lead to the relatively rapid emergence of new capabilities, such as:

  • figuring out how to do a novel task based on information in the prompt,
  • solving multi-step reasoning problems when encouraged to think step by step,
  • evolving from solving only the problems it can see to “grokking” the task and finding a general method that works on all similar problems,
  • tracking and valuing abstract or complex concepts that are not explicitly represented in the AI’s training data; e.g., AlphaZero’s rapid shift to valuing king safety after 32,000 steps of training in chess.
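
One way to see how a gradual change can produce an abrupt-looking jump is a toy calculation. The sketch below is only an illustration under stated assumptions, not a description of how any real AI system works: it assumes a task that succeeds only if every one of its steps is done correctly, and that each step succeeds independently with the same probability. The function name, the step count, and the accuracy values are all hypothetical.

```python
def task_success_rate(step_accuracy: float, num_steps: int) -> float:
    """Chance of finishing a task that requires every step to be correct.

    Assumes (for illustration only) that steps succeed independently
    and with the same probability.
    """
    return step_accuracy ** num_steps


# Hypothetical per-step accuracies that improve gradually.
step_accuracies = [0.80, 0.85, 0.90, 0.95, 0.99]

# Hypothetical task length: 20 steps that must all be right.
NUM_STEPS = 20

for accuracy in step_accuracies:
    success = task_success_rate(accuracy, NUM_STEPS)
    print(f"per-step accuracy {accuracy:.2f} -> whole-task success {success:.3f}")
```

Under these assumptions, whole-task success stays near zero while per-step accuracy is around 0.80, then climbs steeply as per-step accuracy approaches 1. A small, steady improvement in the parts shows up as a sudden change in what the whole can do.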

More generally, AI capabilities have consistently expanded over time. This advancement—which is significantly fueled by simply feeding more computational resources to AI systems—has led to the development of diverse new AI capabilities. For instance, OpenAI was excited in 2018 that GPT-1 could answer extremely basic questions in English, whereas today, significantly larger language models like GPT-4 can translate Chinese, write poetry, play chess, pass Turing tests, diagnose diseases, debug code, and explain jokes. The speed of this progress surprised many experts.

In short, it is not safe to assume that forthcoming AI systems will be just like current AI systems, only slightly more powerful. Instead, policymakers should assume that increasingly advanced AI systems will possess new kinds of abilities, including dangerous capabilities that are mostly absent in today’s best systems.