An automated measure of MDP similarity for transfer in reinforcement learning

Haitham Bou Ammar, Kurt Driessens, Eric Eaton, Matthew E. Taylor, Decebal Constantin Mocanu, Gerhard Weiss, Karl Tuyls

Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceeding › Academic

Abstract

Transfer learning can improve the reinforcement learning of a new task by allowing the agent to reuse knowledge acquired from other source tasks. Despite their success, transfer learning methods rely on having relevant source tasks; transfer from inappropriate tasks can inhibit performance on the new task. For fully autonomous transfer, it is critical to have a method for automatically choosing relevant source tasks, which requires a similarity measure between Markov Decision Processes (MDPs). This issue has received little attention, and is therefore still a largely open problem. This paper presents a data-driven automated similarity measure for MDPs. This novel measure is a significant step toward autonomous reinforcement learning transfer, allowing agents to: (1) characterize when transfer will be useful, and (2) automatically select tasks to use for transfer. The proposed measure is based on the reconstruction error of a restricted Boltzmann machine that attempts to model the behavioral dynamics of the two MDPs being compared. Empirical results illustrate that this measure is correlated with the performance of transfer and therefore can be used to identify similar source tasks for transfer learning.
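To illustrate the idea described in the abstract, the following is a minimal Python sketch of an RBM-reconstruction-error similarity score. It is not the authors' implementation: the function name rbm_similarity, the use of scikit-learn's BernoulliRBM, the single-Gibbs-step reconstruction, and all hyperparameters are assumptions for illustration only. The sketch assumes transition samples (state, action, next state) are available as numeric feature rows from each MDP; a lower error on the target samples is read as greater behavioral similarity.

import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.preprocessing import MinMaxScaler

def rbm_similarity(source_transitions, target_transitions,
                   n_hidden=32, n_iter=50, seed=0):
    """Illustrative (hypothetical) MDP similarity score: train an RBM on
    transition samples from the source MDP, then measure its reconstruction
    error on transition samples from the target MDP."""
    # Scale both sample sets to [0, 1], since BernoulliRBM expects
    # values in that range.
    scaler = MinMaxScaler().fit(np.vstack([source_transitions,
                                           target_transitions]))
    src = scaler.transform(source_transitions)
    tgt = scaler.transform(target_transitions)

    # Fit the RBM on the source MDP's transition data so that it models
    # that MDP's behavioral dynamics.
    rbm = BernoulliRBM(n_components=n_hidden, n_iter=n_iter,
                       learning_rate=0.05, random_state=seed)
    rbm.fit(src)

    # One Gibbs step yields a stochastic reconstruction of each target
    # sample; the mean squared error serves as the (dis)similarity score.
    reconstructed = rbm.gibbs(tgt)
    return np.mean((tgt - reconstructed) ** 2)

# Hypothetical usage, with arrays of shape
# (n_samples, state_dim + action_dim + state_dim):
#   score = rbm_similarity(source_data, target_data)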

Original language: English
Title of host publication: Machine Learning for Interactive Systems: Papers from the AAAI-14 Workshop
Publication status: Published - 2014

