Awesome Papers: 2017-02-04

ACTOR-MIMIC: DEEP MULTITASK AND TRANSFER REINFORCEMENT LEARNING

Emilio Parisotto, Jimmy Ba, Ruslan Salakhutdinov

The ability to act in multiple environments and transfer previous knowledge to new situations can be considered a critical aspect of any intelligent agent. Towards this goal, we define a novel method of multitask and transfer learning that enables an autonomous agent to learn how to behave in multiple tasks simultaneously, and then generalize its knowledge to new domains. This method, termed “Actor-Mimic”, exploits the use of deep reinforcement learning and model compression techniques to train a single policy network that learns how to act in a set of distinct tasks by using the guidance of several expert teachers. We then show that the representations learnt by the deep policy network are capable of generalizing to new tasks with no prior expert guidance, speeding up learning in novel environments. Although our method can in general be applied to a wide range of problems, we use Atari games as a testing environment to demonstrate these methods.
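
As a hedged illustration of the core idea in the abstract, the sketch below implements the kind of policy-regression objective Actor-Mimic builds on: the single multitask (student) network is trained with a cross-entropy loss against a Boltzmann policy derived from each expert DQN's Q-values. PyTorch, the function and variable names, and the temperature value are assumptions made for illustration, and the paper's additional feature-regression term is omitted.

```python
# Minimal sketch of an Actor-Mimic-style policy-regression loss (assumes PyTorch).
# `amn_logits` are the student (AMN) action logits; `expert_q_values` come from a
# frozen expert DQN for one source game. Names and the temperature are
# illustrative, not taken from the authors' code.
import torch
import torch.nn.functional as F

def policy_regression_loss(amn_logits, expert_q_values, tau=0.1):
    """Cross-entropy between the AMN softmax policy and the expert's
    Boltzmann policy over its Q-values (temperature tau)."""
    expert_policy = F.softmax(expert_q_values / tau, dim=-1)  # teacher targets
    amn_log_policy = F.log_softmax(amn_logits, dim=-1)        # student log-probs
    return -(expert_policy * amn_log_policy).sum(dim=-1).mean()

# Toy usage on one batch of states sampled from a single source game.
batch, n_actions = 32, 6
amn_logits = torch.randn(batch, n_actions, requires_grad=True)
expert_q = torch.randn(batch, n_actions)   # stand-in for the expert DQN's output
loss = policy_regression_loss(amn_logits, expert_q)
loss.backward()
```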

In this paper, the authors define a new method, Actor-Mimic, for training a single deep policy network from a set of related source tasks. They show that a network trained with Actor-Mimic can reach expert-level performance on many games simultaneously, while using a single model of the same complexity as one expert network. Moreover, using Actor-Mimic as a multitask pre-training phase can significantly speed up learning on a set of target tasks. This suggests that the features learned on the source tasks generalize to new target tasks, provided the source and target tasks are sufficiently similar. A direction for future work is to develop a method for targeted knowledge transfer, which identifies the source tasks relevant to a given target task and transfers from those; such targeted transfer could potentially help in the cases of negative transfer observed in the experiments.
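
The transfer step described above can be sketched as follows: after multitask pre-training, the shared layers of the Actor-Mimic network (AMN) warm-start a fresh network for an unseen target game, and only the action head is re-initialized before standard DQN training continues. The class and attribute names (`AMN`, `features`, `head`, `init_target_network`) and the tiny linear feature stack are assumptions made to keep the sketch self-contained; the paper copies the DQN convolutional stack.

```python
# Hedged sketch of warm-starting a target-game network from a pretrained AMN
# (assumes PyTorch). All names here are illustrative.
import copy
import torch
import torch.nn as nn

class AMN(nn.Module):
    """Toy stand-in for the multitask Actor-Mimic network."""
    def __init__(self, n_actions):
        super().__init__()
        # The real AMN uses the DQN convolutional stack; a flattened linear
        # layer keeps this sketch short.
        self.features = nn.Sequential(nn.Flatten(), nn.Linear(84 * 84, 256), nn.ReLU())
        self.head = nn.Linear(256, n_actions)

    def forward(self, x):
        return self.head(self.features(x))

def init_target_network(pretrained_amn, n_target_actions):
    """Copy the pretrained feature layers; re-initialize only the action head."""
    target = AMN(n_target_actions)
    target.features = copy.deepcopy(pretrained_amn.features)
    return target

# Usage: pretrain `source_amn` with Actor-Mimic elsewhere, then warm-start
# a network for a new target game with a different action set.
source_amn = AMN(n_actions=6)
target_net = init_target_network(source_amn, n_target_actions=18)
out = target_net(torch.zeros(1, 1, 84, 84))   # forward-pass sanity check
```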


Download

Link to Baidu Cloud (百度云盘)


Acknowledgments

王琦
