Multiagent Bidirectionally-Coordinated Nets for Learning to Play StarCraft Combat Games

Peng Peng (1), Quan Yuan (1), Ying Wen (2), Yaodong Yang (2), Zhenkun Tang (1), Haitao Long (1), Jun Wang (2)

(1) Alibaba Group, (2) University College London

(arXiv:1703.10069, Submitted on 29 Mar 2017)

Real-world artificial intelligence (AI) applications often require multiple agents to work collaboratively. Efficient learning of inter-agent communication and coordination is an indispensable step towards general AI. In this paper, we take the StarCraft combat game as the test scenario, where the task is to coordinate multiple agents as a team to defeat their enemies. To maintain a scalable yet effective communication protocol, we introduce a multiagent bidirectionally-coordinated network (BiCNet [‘bIknet]) with a vectorised extension of the actor-critic formulation. We show that BiCNet can handle different types of combat under diverse terrains with arbitrary numbers of AI agents on both sides. Our analysis demonstrates that, without any supervision such as human demonstrations or labelled data, BiCNet can learn various types of coordination strategies that are similar to those of experienced game players. Moreover, BiCNet is easily adaptable to tasks with heterogeneous agents. In our experiments, we evaluate our approach against multiple baselines under different scenarios; it shows state-of-the-art performance and has potential value for large-scale real-world applications.
