Task Clustering in RL

We looked at clustering tasks for Multi-Task RL and proposed a new method.

For my Master's thesis I worked on Multi-Task Reinforcement Learning. The resulting publication is joint work with Oliver Richter and Roger Wattenhofer.

In Multi-Task RL, we want to train a single agent that is able to achieve multiple different goals. A common issue is that the different tasks interfere with each other during training, a phenomenon known as negative transfer.

We proposed a new way to mitigate this issue: cluster the tasks into sets of similar tasks and learn each set separately.

To do so, we maintain a set of policies and propose an EM-inspired approach that alternates between two steps, sketched in code after the list:

  • E-Step:
    • Evaluate all policies on all tasks
    • Assign each task to the policy that performed best on it
  • M-Step:
    • Train each policy on its assigned tasks
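
To make the loop concrete, here is a minimal sketch of the E/M iteration. The tasks, policies, and the helpers `evaluate_policy` and `train_policy` are hypothetical toy stand-ins (points and parameter vectors rather than actual environments, rollouts, and RL updates), chosen so the example runs end to end without an RL stack; this illustrates the clustering loop, not the implementation from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: 12 "tasks" drawn from 3 underlying clusters in 2D.
centers = np.array([[0.0, 0.0], [5.0, 5.0], [-5.0, 5.0]])
tasks = np.concatenate([c + rng.normal(scale=0.5, size=(4, 2)) for c in centers])

n_policies = 3
policies = rng.normal(size=(n_policies, 2))  # one parameter vector per policy


def evaluate_policy(policy, task):
    """Hypothetical stand-in for an evaluation rollout; higher is better."""
    return -np.linalg.norm(policy - task)


def train_policy(policy, assigned_tasks, lr=0.5):
    """Hypothetical stand-in for an RL update on the assigned tasks."""
    if len(assigned_tasks) == 0:
        return policy  # no assigned tasks: leave the policy unchanged
    # Step toward the assigned tasks, which increases the toy return above.
    return policy + lr * (np.mean(assigned_tasks, axis=0) - policy)


for iteration in range(20):
    # E-step: evaluate all policies on all tasks and assign each task
    # to the policy that achieves the highest return on it.
    returns = np.array([[evaluate_policy(p, t) for p in policies] for t in tasks])
    assignment = returns.argmax(axis=1)  # assignment[i] = best policy for task i

    # M-step: train each policy on the tasks currently assigned to it.
    for k in range(n_policies):
        policies[k] = train_policy(policies[k], tasks[assignment == k])

print("final task-to-policy assignment:", assignment)
```

In an actual Multi-Task RL setting, `evaluate_policy` would run evaluation episodes of the policy on the task, and `train_policy` would be an RL update (e.g., a policy-gradient step) restricted to the tasks assigned to that policy.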

We evaluated the approach on a broad set of environments:

  • Simple discrete toy tasks
  • Simple continuous control tasks
  • Complex Bipedal Walker control tasks
  • Atari games

We showed that our approach is able to meaningfully cluster similar tasks together and thereby improve sample efficiency, final performance, or both.