Contributed talk in Autonomous Evolution, Production and Learning in Robotic Eco-Systems 1, July 30, 2019, noon, in room USB.4.005
Reinforcement Learning Agents acquire Flocking and Symbiotic Behaviour in Simulated Ecosystems
Peter Sunehag, Siqi Liu, Guy Lever, Joel Leibo, Edward Hughes, Tom Eccles, Josh Merel, Nicolas Heess, Thore Graepel
In nature, group behaviours such as flocking as well as cross-species symbiotic partnerships are observed in vastly different forms and circumstances. We hypothesize that such strategies can arise in response to generic predator-prey pressures in a spatial environment with range-limited sensation and action. We evaluate whether these forms of coordination can emerge by independent multi-agent reinforcement learning in simple multiple-species ecosystems. Further, we assess how these patterns depend on the level of predator pressure and on the sensing range. In contrast to prior work, we avoid hand-crafted shaping rewards, specific actions, or dynamics that would directly encourage coordination across agents. Instead, we test whether coordination, which has only indirect benefit, emerges as a consequence of learning without being specifically encouraged. Our simulated ecosystems consist of a generic food chain involving three trophic levels: apex predator, mid-level predator, and prey. We conduct experiments on two different platforms: a 3D physics engine with tens of agents and a 2D grid world with up to thousands. The results clearly confirm our hypothesis and show substantial coordination both within and across species. To obtain these results, we leverage and adapt recent advances in deep reinforcement learning within an ecosystem training protocol featuring homogeneous groups of independent agents from different species (sets of policies), acting in many different random combinations in parallel habitats. The policies utilize neural network architectures that are invariant to agent individuality but not type (species) and that generalize across varying numbers of observed other agents.
While the emergence of complexity in artificial ecosystems has long been studied in the artificial life community, the focus has been more on individual complexity and genetic algorithms and less on the group complexity and reinforcement learning emphasized in this article.
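The abstract does not specify the network architecture, so the following is only a minimal sketch of one common way to obtain the stated properties: invariance to agent individuality (ordering), sensitivity to species, and a fixed-size encoding for any number of observed agents. Per-species mean pooling, the feature layout, the layer sizes, and all names (`encode_observation`, `make_weights`, `dense_tanh`) are assumptions made for illustration, not the authors' method.

```python
import math
import random

SPECIES = ("apex", "mid", "prey")  # the three trophic levels named in the abstract


def make_weights(n_in, n_out, rng):
    """Random dense-layer weights as a list of rows (illustrative only)."""
    return [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]


def dense_tanh(weights, x):
    """Apply one dense layer followed by tanh."""
    return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in weights]


def encode_observation(observed, weights_by_species, n_hidden):
    """Encode a variable-size set of observed agents as a fixed-size vector.

    observed: list of (species, features) pairs, in any order.
    Each agent is embedded with weights shared within its species; the
    embeddings are mean-pooled per species and the pools are concatenated.
    """
    encoding = []
    for species in SPECIES:
        embedded = [dense_tanh(weights_by_species[species], feats)
                    for sp, feats in observed if sp == species]
        if embedded:
            pooled = [sum(col) / len(embedded) for col in zip(*embedded)]
        else:
            pooled = [0.0] * n_hidden  # no agent of this species within range
        encoding.extend(pooled)
    return encoding


rng = random.Random(0)
N_FEATURES, N_HIDDEN = 2, 4  # assumed: relative (x, y) position; sizes arbitrary
weights = {sp: make_weights(N_FEATURES, N_HIDDEN, rng) for sp in SPECIES}

obs = [("prey", [0.5, -0.2]), ("apex", [1.0, 1.0]), ("prey", [-0.3, 0.8])]
shuffled = [obs[2], obs[0], obs[1]]  # same agents, different order
code_a = encode_observation(obs, weights, N_HIDDEN)
code_b = encode_observation(shuffled, weights, N_HIDDEN)
```

In this sketch `code_a` and `code_b` agree because pooling ignores ordering, while relabelling an agent's species changes the result because each species has its own weights and its own slot in the concatenated vector. Mean pooling is just one choice; a sum, a maximum, or an attention-based pooling would preserve the same invariance.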