In the field of relational reinforcement learning - a representational generalisation of reinforcement learning - the first-order representation of environments results in a potentially infinite number of possible states, requiring learning agents to use some form of abstraction to learn effectively. Instead of forming an abstraction over the state-action space, an alternative technique is to create behaviour directly through policy search. The algorithm presented in this paper, CERRLA, uses the cross-entropy method to learn behaviour directly in the form of decision lists of relational rules for solving problems in a range of different environments, without the need for expert guidance in the learning process. The behaviour produced by the algorithm is easy to understand and is biased towards compactness. The results obtained show that CERRLA is competitive both in standard testing environments and in Ms. Pac-Man and Carcassonne, two large and complex game environments.
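The cross-entropy method named in the abstract can be illustrated in miniature. The sketch below is an assumption-laden toy, not the paper's algorithm: CERRLA applies the method to distributions over relational rules in a decision list, whereas here a single continuous parameter stands in for a candidate policy and a toy score function stands in for episode return. Only the optimisation loop itself - sample candidates, keep an elite fraction, refit the sampling distribution - reflects the technique.

```python
import random
import statistics

def cross_entropy_method(score, mean=0.0, std=5.0, n_samples=50,
                         elite_frac=0.2, n_iters=30, seed=0):
    """Minimal cross-entropy optimisation over one parameter.

    Repeatedly samples candidates from a Gaussian, keeps the top
    elite_frac by score, and refits the Gaussian to those elites.
    """
    rng = random.Random(seed)
    n_elite = max(2, int(n_samples * elite_frac))
    for _ in range(n_iters):
        samples = [rng.gauss(mean, std) for _ in range(n_samples)]
        # Elite selection: the highest-scoring candidates this iteration.
        elites = sorted(samples, key=score, reverse=True)[:n_elite]
        # Refit the sampling distribution to the elites.
        mean = statistics.fmean(elites)
        std = statistics.stdev(elites) + 1e-6  # floor avoids premature collapse

    return mean

# Hypothetical objective standing in for the return of a candidate policy;
# its maximum is at x = 3.
best = cross_entropy_method(lambda x: -(x - 3.0) ** 2)
```

In CERRLA the analogue of the Gaussian update is re-weighting the probabilities of rules appearing in the decision list according to the performance of sampled policies.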
Title of host publication: Inductive Logic Programming
Subtitle of host publication: 23rd International Conference, ILP 2013, Rio de Janeiro, Brazil, August 28-30, 2013, Revised Selected Papers
Number of pages: 17
Publication status: Published - 2014
Series: Lecture Notes in Computer Science