For my Master’s thesis, I worked on Multi-Task Reinforcement Learning (RL). The resulting publication is joint work with Oliver Richter and Roger Wattenhofer.
In Multi-Task RL, we want to train an agent that can achieve multiple different goals. A common issue is that the different tasks interfere with each other during training, a phenomenon known as negative transfer.
We proposed a new way to avoid this issue by clustering tasks into sets of similar tasks.
To do so, we maintain a set of policies and propose an EM-inspired approach that iterates over the following steps:
- Evaluate all policies on all tasks
- Assign each task to the policy that performed best on it
- Train each policy on its assigned tasks
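The loop above can be illustrated with a minimal toy sketch. This is not the implementation from the publication: here a "task" is just a target number, a "policy" is a single scalar parameter, the return is the negative squared distance between them, and "training" is one step toward the mean of the assigned targets. All function names (`evaluate`, `assign_tasks`, `train`, `cluster_tasks`) are hypothetical and chosen only for illustration.

```python
def evaluate(policy, task):
    """Toy return (assumption): higher when the policy parameter is close to the task target."""
    return -(policy - task) ** 2


def assign_tasks(policies, tasks):
    """Evaluate all policies on all tasks and assign each task to its best policy."""
    assignment = {i: [] for i in range(len(policies))}
    for task in tasks:
        returns = [evaluate(p, task) for p in policies]
        best = max(range(len(policies)), key=lambda i: returns[i])
        assignment[best].append(task)
    return assignment


def train(policy, assigned, lr=0.5):
    """Toy training step (assumption): move the policy toward the mean of its assigned tasks."""
    if not assigned:
        # A policy without tasks is simply left unchanged in this sketch.
        return policy
    target = sum(assigned) / len(assigned)
    return policy + lr * (target - policy)


def cluster_tasks(tasks, policies, iterations=20):
    """Alternate between assigning tasks to policies and training the policies."""
    policies = list(policies)
    for _ in range(iterations):
        assignment = assign_tasks(policies, tasks)
        policies = [train(p, assignment[i]) for i, p in enumerate(policies)]
    return policies, assign_tasks(policies, tasks)


# Two groups of similar tasks and two policies with slightly different starting points.
tasks = [-1.1, -1.0, -0.9, 0.9, 1.0, 1.1]
policies, assignment = cluster_tasks(tasks, policies=[-0.1, 0.1])
print(policies)
print(assignment)
```

In this toy setting the procedure reduces to something close to k-means: each policy specializes on one group of similar tasks. In a real RL setting, `evaluate` would roll out the policy in the task's environment and `train` would run an RL algorithm on the assigned tasks.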
We evaluated the approach on a broad set of environments:
- Simple discrete toy tasks
- Simple continuous control tasks
- Complex Bipedal Walker control tasks
- Atari games
We showed that our approach clusters tasks in a meaningful way and thereby improves sample complexity and/or final performance.