Multiagent Online Learning in Time-Varying Games

B. Duvocelle, P. Mertikopoulos, M. Staudigl*, D. Vermeulen

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review


Abstract

We examine the long-run behavior of multiagent online learning in games that evolve over time. Specifically, we focus on a wide class of policies based on mirror descent, and we show that the induced sequence of play (a) converges to a Nash equilibrium in time-varying games that stabilize in the long run to a strictly monotone limit, and (b) stays asymptotically close to the evolving equilibrium of the sequence of stage games (assuming they are strongly monotone). Our results apply to both gradient-based and payoff-based feedback - the latter meaning that players only get to observe the payoffs of their chosen actions.
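To illustrate the kind of dynamics the abstract describes, here is a minimal numerical sketch (not the paper's construction): entropic mirror descent (multiplicative weights) on the probability simplex, run on a toy strongly monotone game with separable quadratic payoffs u_i(x_i) = -||x_i - c_i||², where the targets c1, c2 and the O(1/t) gradient perturbation are illustrative assumptions standing in for the time-varying stage games.

```python
import numpy as np

def md_step(x, grad, eta):
    """One entropic mirror-descent (multiplicative-weights) step on the simplex."""
    y = x * np.exp(eta * grad)  # exponentiated gradient ascent on the payoff
    return y / y.sum()          # renormalize back onto the simplex

rng = np.random.default_rng(0)
# Assumed toy equilibria: interior simplex points, so the Nash of each
# quadratic payoff u_i = -||x_i - c_i||^2 is exactly c_i.
c1, c2 = np.array([0.7, 0.3]), np.array([0.4, 0.6])
x, y = np.ones(2) / 2, np.ones(2) / 2  # initial mixed strategies
eta = 0.1
for t in range(1, 5001):
    # Payoff gradients perturbed by O(1/t) noise: the stage games vary over
    # time but stabilize to a strongly monotone limit, as in the abstract.
    gx = -2.0 * (x - c1) + rng.normal(scale=1.0 / t, size=2)
    gy = -2.0 * (y - c2) + rng.normal(scale=1.0 / t, size=2)
    x, y = md_step(x, gx, eta), md_step(y, gy, eta)
```

After the loop, each player's strategy sits near its equilibrium target, matching the convergence behavior claimed for stabilizing strictly monotone limits.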
Original language: English
Pages (from-to): 914-941
Number of pages: 29
Journal: Mathematics of Operations Research
Volume: 48
Issue number: 2
Early online date: 1 Jul 2022
Publication status: Published - May 2023

Keywords

  • dynamic regret
  • Nash equilibrium
  • mirror descent
  • time-varying games
  • stochastic approximation
  • optimization
  • dynamics
  • convergence
  • gradient
  • descent
  • play
  • form
