The steppingstone’s potential can be seen by analogy with biological evolution. In nature, the tree of life has no overarching goal, and features used for one function might find themselves enlisted for something completely different. Feathers, for example, likely evolved for insulation and only later became handy for flight.
Biological evolution is also the only system to produce human intelligence, which is the ultimate dream of many AI researchers. Because of biology’s track record, Stanley and others have come to believe that if we want algorithms that can navigate the physical and social world as easily as we can—or better!—we need to imitate nature’s tactics. Instead of hard-coding the rules of reasoning, or having computers learn to score highly on specific performance metrics, they argue, we must let a population of solutions blossom. Make them prioritize novelty or interestingness instead of the ability to walk or talk. They may discover an indirect path, a set of steppingstones, and wind up walking and talking better than if they’d sought those skills directly.
New, Interesting, Diverse
After Picbreeder, Stanley set out to demonstrate that neuroevolution could overcome the most obvious argument against it: “If I run an algorithm that’s creative to such an extent that I’m not sure what it will produce,” he said, “it’s very interesting from a research perspective, but it’s a harder sell commercially.”
He hoped to show that by simply following ideas in interesting directions, algorithms could not only produce a diversity of results, but solve problems. More audaciously, he aimed to show that completely ignoring an objective can get you there faster than pursuing it. He did this through an approach called novelty search.
The system starts with a neural network, which is an arrangement of small computing elements called neurons connected in layers. The output of one layer of neurons gets passed to the next layer via connections that have various “weights.” In a simple example, input data such as an image might be fed into the neural network. As the information from the image travels from layer to layer, the network extracts increasingly abstract information about its contents. Eventually, a final layer calculates the highest-level information: a label for the image.
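In code, that layer-by-layer pass is only a few lines. The sketch below is illustrative rather than drawn from Stanley’s system: the layer sizes, the sigmoid squashing function, and the random input standing in for an image are all assumptions made for the example.

```python
import numpy as np

def sigmoid(x):
    """Squash each neuron's activation into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def forward(image, weights):
    """Pass an input through the network one layer at a time.

    `weights` is a list of matrices, one per layer; each matrix holds
    the connection weights between two adjacent layers.
    """
    activation = image
    for w in weights:
        activation = sigmoid(w @ activation)  # weighted sum, then squash
    return activation  # final layer: the highest-level information

# Illustrative sizes: a 64-pixel input, one hidden layer, 3 output labels.
rng = np.random.default_rng(0)
weights = [rng.normal(size=(16, 64)), rng.normal(size=(3, 16))]
scores = forward(rng.random(64), weights)
label = int(np.argmax(scores))  # the network's "label for the image"
```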
In neuroevolution, you start by assigning random values to the weights between layers. This randomness means the network won’t be very good at its job. But from this sorry state, you then create a set of random mutations — offspring neural networks with slightly different weights — and evaluate their abilities. You keep the best ones, produce more offspring, and repeat. (More advanced neuroevolution strategies will also introduce mutations in the number and arrangement of neurons and connections.) Neuroevolution is a meta-algorithm, an algorithm for designing algorithms. And eventually, the algorithms get pretty good at their job.
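The keep-the-best-and-mutate loop is equally compact. Here is a minimal sketch of weight-only neuroevolution; the population size, mutation scale, and toy fitness function are placeholder assumptions (a real run would score task performance, and fancier methods would also mutate structure, as noted above).

```python
import numpy as np

rng = np.random.default_rng(0)

def evaluate(weights):
    # Stand-in fitness: a real system would measure task performance.
    # This toy score simply rewards weights close to zero.
    return -np.sum(weights ** 2)

POP_SIZE, N_WEIGHTS, N_PARENTS = 50, 100, 10

# Start from a "sorry state": every network gets random weights.
population = [rng.normal(size=N_WEIGHTS) for _ in range(POP_SIZE)]

for generation in range(100):
    # Evaluate everyone and keep the best performers as parents.
    population.sort(key=evaluate, reverse=True)
    parents = population[:N_PARENTS]

    # Produce offspring: copies of parents with slightly mutated weights.
    offspring = [parents[rng.integers(N_PARENTS)]
                 + rng.normal(scale=0.05, size=N_WEIGHTS)
                 for _ in range(POP_SIZE - N_PARENTS)]
    population = parents + offspring

best = max(population, key=evaluate)
```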
To test the steppingstone principle, Stanley and his student Joel Lehman tweaked the selection process. Instead of selecting the networks that performed best on a task, novelty search selected the networks whose behavior differed most from that of their closest behavioral neighbors. (In Picbreeder, people rewarded interestingness. Here, as a proxy for interestingness, novelty search rewarded novelty.)
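Concretely, an individual’s novelty in Lehman and Stanley’s method is its average distance to the most similar behaviors found so far. The sketch below assumes a robot’s behavior is summarized by its final position and uses Euclidean distance with an arbitrarily chosen k; the full method also maintains an archive of past behaviors, which this toy version omits.

```python
import numpy as np

def novelty(behavior, others, k=15):
    """Average distance from one behavior to the k most similar ones.

    `others` here is just the rest of the population; the full method
    also compares against an archive of behaviors from past generations.
    """
    dists = np.sort([np.linalg.norm(behavior - o) for o in others])
    return dists[:k].mean()

# Example: each robot's "behavior" is its final (x, y) position in the maze.
rng = np.random.default_rng(0)
behaviors = [rng.random(2) for _ in range(50)]
scores = [novelty(b, [o for o in behaviors if o is not b]) for b in behaviors]
# Selection keeps the highest-novelty individuals, not the best performers.
```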
In one test, they placed virtual wheeled robots in a maze and evolved the algorithms controlling them, hoping one would find a path to the exit. They ran the evolution from scratch 40 times. A comparison program, in which robots were selected for how close (as the crow flies) they came to the exit, evolved a winning robot only 3 out of 40 times. Novelty search, which completely ignored how close each bot was to the exit, succeeded 39 times. It worked because the bots managed to avoid dead ends. Rather than facing the exit and beating their heads against the wall, they explored unfamiliar territory, found workarounds, and won by accident. “Novelty search is important because it turned everything on its head,” said Julian Togelius, a computer scientist at New York University, “and basically asked what happens when we don’t have an objective.”
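The contrast between the two selection signals is easy to state in code. In the sketch below, with invented exit coordinates, the comparison program’s crow-flies score is deceptive because it ignores walls; the novelty score from the previous sketch never consults the exit at all.

```python
import numpy as np

EXIT = np.array([10.0, 10.0])  # hypothetical exit coordinates

def crow_flies_score(final_pos):
    """The comparison program's signal: straight-line closeness to the
    exit. It ignores walls, so a bot jammed into a dead end right beside
    the exit outscores one backing away toward the real route."""
    return -np.linalg.norm(final_pos - EXIT)

# Novelty search replaces this with the novelty() measure sketched above,
# which rewards reaching unfamiliar territory and never references EXIT.
```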