Task Clustering in RL | Johannes Ackermann

The paper resulting from this work has been accepted for publication at ECML-PKDD 2021. Find the links here.

For my Master’s thesis, I worked on Multi-Task Reinforcement Learning (RL). The resulting publication is joint work with Oliver Richter and Roger Wattenhofer.

In Multi-Task RL, we want to train an agent that is able to achieve multiple different goals. A common issue is that the different tasks interfere with each other during training, a phenomenon known as negative transfer.

We proposed a new way to avoid this issue by clustering tasks into sets of similar tasks.

To do so, we train a set of policies with an EM-inspired approach that alternates between two steps:

E-Step:
- Evaluate all policies on all tasks
- Assign each task to the policy that performed best on it
M-Step:
- Train each policy on its assigned tasks
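
The alternation between the two steps can be sketched on a toy problem. This is only an illustration under strong assumptions, not the implementation from the paper: each "task" is a hypothetical 1-D target value, each "policy" is a single scalar parameter, `evaluate` stands in for an RL rollout by returning the negative squared distance to the target, and the M-step is one gradient-ascent step instead of actual RL training. All function names are made up for this example.

```python
def evaluate(policy, task):
    # Hypothetical stand-in for an RL rollout: the "return" is higher
    # the closer the policy parameter is to the task's target value.
    return -(policy - task) ** 2


def e_step(policies, tasks):
    # Evaluate all policies on all tasks and assign each task to the
    # policy that performed best on it (ties go to the lower index).
    assignments = {i: [] for i in range(len(policies))}
    for task in tasks:
        returns = [evaluate(p, task) for p in policies]
        best = max(range(len(policies)), key=lambda i: returns[i])
        assignments[best].append(task)
    return assignments


def m_step(policies, assignments, lr=0.25):
    # "Train" each policy on its assigned tasks. Here: one gradient-ascent
    # step on the mean return; a real setup would run an RL algorithm.
    updated = []
    for i, p in enumerate(policies):
        assigned = assignments[i]
        if not assigned:
            updated.append(p)  # no tasks assigned, leave policy unchanged
            continue
        grad = -2 * sum(p - t for t in assigned) / len(assigned)
        updated.append(p + lr * grad)
    return updated


def run_clustering(tasks, policies, iterations=20):
    # Alternate E- and M-steps for a fixed number of iterations.
    assignments = {}
    for _ in range(iterations):
        assignments = e_step(policies, tasks)
        policies = m_step(policies, assignments)
    return policies, assignments


if __name__ == "__main__":
    tasks = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]  # two obvious groups of tasks
    policies, assignments = run_clustering(tasks, policies=[1.0, 4.0])
    print(policies)
    print(assignments)
```

In this toy setting the procedure behaves much like k-means: tasks with nearby targets end up assigned to the same policy, and each policy specializes on its cluster. With real RL tasks, evaluation is noisy and training is far more expensive, so the sketch only conveys the structure of the loop.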

We evaluated the approach on a broad set of environments:

- Simple discrete toy tasks
- Simple continuous control tasks
- Complex Bipedal Walker control tasks
- Atari games

We showed that our approach is able to meaningfully cluster tasks together and thereby improve sample complexity and/or final performance.