In this paper, we investigate reinforcement learning (RL) in multi-agent systems (MAS) from an evolutionary dynamical perspective. In a MAS, the environment is typically non-stationary and the Markov property does not hold, which requires agents to be adaptive. RL is a natural approach to modeling the learning of individual agents. These learning algorithms are, however, known to be sensitive to the choice of parameter settings, even in single-agent systems. The issue is more pronounced in a MAS because of the changing interactions among the agents. It remains largely an open question for a developer of a MAS how to design the individual agents such that, through learning, the agents as a collective arrive at good solutions. We show that modeling RL in MAS from an evolutionary game-theoretic point of view is a new and potentially successful way to guide learning agents to the most suitable solution for the task at hand. We show how evolutionary dynamics (ED) from evolutionary game theory can help the developer of a MAS choose good parameter settings for the RL algorithms used. The ED essentially predict the equilibrium outcomes of a MAS in which the agents use individual RL algorithms. More specifically, we show how the ED predict the learning trajectories of Q-learners in iterated games. Moreover, we apply our results to (an extension of) the Collective Intelligence framework (COIN). COIN is a proven engineering approach for learning cooperative tasks in MAS. The utilities of the agents are re-engineered to contribute to the global utility. We show how the improved results for MAS RL in COIN, and in a developed extension, are predicted by the ED.
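To make the idea of dynamics that predict learning trajectories concrete, the following is a minimal sketch of the standard two-population replicator dynamics from evolutionary game theory, integrated with simple Euler steps. It is an illustration only, not the Q-learning-specific dynamics analysed in the paper: the function names, the step size, and the Prisoner's Dilemma payoff values are all assumptions chosen for this example.

```python
# Illustrative sketch: standard two-population replicator dynamics.
# All names and payoff values here are hypothetical, not taken from the paper.

def replicator_step(x, y, A, B, dt):
    """One Euler step for the row mix x and the column mix y.

    A[i][j] is the row player's payoff and B[i][j] the column player's
    payoff when row plays action i and column plays action j.
    """
    n, m = len(x), len(y)
    # Expected payoff of each row action against the column mix y.
    fx = [sum(A[i][j] * y[j] for j in range(m)) for i in range(n)]
    # Expected payoff of each column action against the row mix x.
    fy = [sum(B[i][j] * x[i] for i in range(n)) for j in range(m)]
    # Average payoff currently earned by each population.
    avg_x = sum(xi * fi for xi, fi in zip(x, fx))
    avg_y = sum(yj * fj for yj, fj in zip(y, fy))
    # Actions that beat the population average grow; the others shrink.
    new_x = [xi + dt * xi * (fi - avg_x) for xi, fi in zip(x, fx)]
    new_y = [yj + dt * yj * (fj - avg_y) for yj, fj in zip(y, fy)]
    return new_x, new_y


def trajectory(x, y, A, B, dt=0.01, steps=5000):
    """Return the list of (x, y) strategy pairs visited by the dynamics."""
    path = [(x, y)]
    for _ in range(steps):
        x, y = replicator_step(x, y, A, B, dt)
        path.append((x, y))
    return path


# Assumed Prisoner's Dilemma payoffs; action 0 = cooperate, 1 = defect.
A = [[3, 0], [5, 1]]  # row player
B = [[3, 5], [0, 1]]  # column player

# Both populations start out mostly cooperating.
path = trajectory([0.9, 0.1], [0.9, 0.1], A, B)
final_x, final_y = path[-1]
```

In this assumed game, defection strictly dominates, so both strategy mixes should drift toward the defect action. Plotting `path` in the plane spanned by the two cooperation probabilities gives the kind of trajectory picture against which the behaviour of learning agents can be compared.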
Tuyls, K. P., 't Hoen, P. J., & Vanschoenwinkel, B. (2006). An Evolutionary Dynamical Analysis of Multi-Agent Learning in Iterated Games. Autonomous Agents and Multi-Agent Systems, 12(1), 115-153. https://doi.org/10.1007/s10458-005-3783-9