Awesome Papers: 2016-12-3

Deep Convolutional Neural Network Design Patterns

Leslie N. Smith, Nicholay Topin

Recent research in the deep learning field has produced a plethora of new architectures. At the same time, a growing number of groups are applying deep learning to new applications. Some of these groups are likely to be composed of inexperienced deep learning practitioners who are baffled by the dizzying array of architecture choices and therefore opt to use an older architecture (e.g., AlexNet). Here we attempt to bridge this gap by mining the collective knowledge contained in recent deep learning research to discover underlying principles for designing neural network architectures. In addition, we describe several architectural innovations, including Fractal of FractalNet network, Stagewise Boosting Networks, and Taylor Series Networks (our Caffe code and prototxt files will be made publicly available). We hope others are inspired to build on our preliminary work.


TorchCraft: a Library for Machine Learning Research on Real-Time Strategy Games

Gabriel Synnaeve, Nantas Nardelli, Alex Auvolat, Soumith Chintala, Timothée Lacroix, Zeming Lin, Florian Richoux, Nicolas Usunier

We present TorchCraft, a library that enables deep learning research on Real-Time Strategy (RTS) games such as StarCraft: Brood War, by making it easier to control these games from a machine learning framework, here Torch. This white paper argues for using RTS games as a benchmark for AI research, and describes the design and components of TorchCraft.
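
The abstract describes TorchCraft as a bridge between the game and a learning framework. As a rough illustration of the sense-act loop such a bridge exposes (receive a frame of game state, send back per-unit commands), here is a small Python sketch; the types and client calls are hypothetical stand-ins, not the actual TorchCraft API, which is Lua/Torch-based.

```python
import math
from dataclasses import dataclass

# Illustrative sketch only: this is NOT the real TorchCraft API (the actual
# library is a Lua/Torch client that talks to a C++ BWAPI server). The types
# and the client calls in the comments below are hypothetical stand-ins, meant
# to show the loop such a bridge exposes to learning code: receive a frame of
# game state, compute per-unit commands, send them back.

@dataclass
class Unit:
    id: int
    x: float
    y: float

@dataclass
class Frame:
    my_units: list
    enemy_units: list
    game_ended: bool

def attack_closest_policy(frame):
    """Toy micromanagement policy: each friendly unit attacks the nearest enemy."""
    commands = []
    for u in frame.my_units:
        target = min(frame.enemy_units,
                     key=lambda e: math.hypot(e.x - u.x, e.y - u.y))
        commands.append(("attack_unit", u.id, target.id))  # hypothetical command format
    return commands

# Hypothetical driver loop (client.reset/step are placeholders, not TorchCraft calls):
#   frame = client.reset()
#   while not frame.game_ended:
#       frame = client.step(attack_closest_policy(frame))
```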


Adversarial Feature Learning

Jeff Donahue, Philipp Krähenbühl, Trevor Darrell

The ability of the Generative Adversarial Networks (GANs) framework to learn generative models mapping from simple latent distributions to arbitrarily complex data distributions has been demonstrated empirically, with compelling results showing generators learn to “linearize semantics” in the latent space of such models. Intuitively, such latent spaces may serve as useful feature representations for auxiliary problems where semantics are relevant. However, in their existing form, GANs have no means of learning the inverse mapping – projecting data back into the latent space. We propose Bidirectional Generative Adversarial Networks (BiGANs) as a means of learning this inverse mapping, and demonstrate that the resulting learned feature representation is useful for auxiliary supervised discrimination tasks, competitive with contemporary approaches to unsupervised and self-supervised feature learning.
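
As a concrete reading of the abstract, a BiGAN has three parts: a generator G: z → x, an encoder E: x → z (the inverse mapping), and a discriminator that must separate real pairs (x, E(x)) from generated pairs (G(z), z). The following PyTorch sketch shows one training step in that spirit; the layer sizes, optimizers, and other details are illustrative choices, not the paper's configuration.

```python
import torch
import torch.nn as nn

# Minimal BiGAN-style sketch: generator G, encoder E, and a joint discriminator
# D over (data, code) pairs. Sizes and optimizer settings are illustrative;
# x_real is assumed to be flattened and scaled to [-1, 1] to match the Tanh output.

z_dim, x_dim, h = 50, 784, 256

G = nn.Sequential(nn.Linear(z_dim, h), nn.ReLU(), nn.Linear(h, x_dim), nn.Tanh())
E = nn.Sequential(nn.Linear(x_dim, h), nn.ReLU(), nn.Linear(h, z_dim))
D = nn.Sequential(nn.Linear(x_dim + z_dim, h), nn.ReLU(), nn.Linear(h, 1))

bce = nn.BCEWithLogitsLoss()
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
opt_ge = torch.optim.Adam(list(G.parameters()) + list(E.parameters()), lr=2e-4)

def bigan_step(x_real):
    z = torch.randn(x_real.size(0), z_dim)

    # Discriminator update: real pairs (x, E(x)) vs. generated pairs (G(z), z).
    d_real = D(torch.cat([x_real, E(x_real).detach()], dim=1))
    d_fake = D(torch.cat([G(z).detach(), z], dim=1))
    loss_d = bce(d_real, torch.ones_like(d_real)) + bce(d_fake, torch.zeros_like(d_fake))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Generator and encoder update: swap the labels to fool the discriminator.
    d_real = D(torch.cat([x_real, E(x_real)], dim=1))
    d_fake = D(torch.cat([G(z), z], dim=1))
    loss_ge = bce(d_real, torch.zeros_like(d_real)) + bce(d_fake, torch.ones_like(d_fake))
    opt_ge.zero_grad(); loss_ge.backward(); opt_ge.step()
    return float(loss_d), float(loss_ge)
```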


RL²: Fast Reinforcement Learning via Slow Reinforcement Learning

Yan Duan, John Schulman, Xi Chen, Peter L. Bartlett, Ilya Sutskever, Pieter Abbeel

Deep reinforcement learning (deep RL) has been successful in learning sophisticated behaviors automatically; however, the learning process requires a huge number of trials. In contrast, animals can learn new tasks in just a few trials, benefiting from their prior knowledge about the world. This paper seeks to bridge this gap. Rather than designing a “fast” reinforcement learning algorithm, we propose to represent it as a recurrent neural network (RNN) and learn it from data. In our proposed method, RL², the algorithm is encoded in the weights of the RNN, which are learned slowly through a general-purpose (“slow”) RL algorithm. The RNN receives all information a typical RL algorithm would receive, including observations, actions, rewards, and termination flags; and it retains its state across episodes in a given Markov Decision Process (MDP). The activations of the RNN store the state of the “fast” RL algorithm on the current (previously unseen) MDP. We evaluate RL² experimentally on both small-scale and large-scale problems. On the small-scale side, we train it to solve randomly generated multi-arm bandit problems and finite MDPs. After RL² is trained, its performance on new MDPs is close to human-designed algorithms with optimality guarantees. On the large-scale side, we test RL² on a vision-based navigation task and show that it scales up to high-dimensional problems.
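
To make the setup in the abstract concrete, the sketch below rolls out an RL²-style agent on a multi-armed bandit: the RNN input at each step packs the previous action, reward, and termination flag, and the hidden state is carried across episodes of the same bandit so the network can adapt within a trial. The outer-loop ("slow") policy-gradient update is omitted, and all sizes are illustrative.

```python
import torch
import torch.nn as nn

# Interaction loop only: the "slow" outer-loop RL update of the GRU weights is
# not shown. Sizes and the trial structure are illustrative placeholders.

n_arms, hidden = 5, 64
policy = nn.GRUCell(input_size=n_arms + 2, hidden_size=hidden)  # one-hot prev action + reward + done
head = nn.Linear(hidden, n_arms)

def run_trial(arm_probs, episodes=10, horizon=1):
    """One trial = several episodes on the SAME randomly drawn bandit."""
    h = torch.zeros(1, hidden)               # hidden state persists across episodes
    prev_action = torch.zeros(1, n_arms)
    prev_reward = torch.zeros(1, 1)
    done_flag = torch.zeros(1, 1)
    total = 0.0
    for _ in range(episodes):
        for t in range(horizon):
            x = torch.cat([prev_action, prev_reward, done_flag], dim=1)
            h = policy(x, h)
            a = torch.distributions.Categorical(logits=head(h)).sample()   # shape (1,)
            r = float(torch.rand(1).item() < arm_probs[a.item()])          # Bernoulli reward
            total += r
            prev_action = torch.nn.functional.one_hot(a, n_arms).float()
            prev_reward = torch.tensor([[r]])
            done_flag = torch.tensor([[1.0 if t == horizon - 1 else 0.0]])
    return total
```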


Node-Adapt, Path-Adapt and Tree-Adapt: Model-Transfer Domain Adaptation for Random Forest

Azadeh S. Mozafari, David Vazquez, Mansour Jamzad, Antonio M. Lopez

Random Forest (RF) is a successful paradigm for learning classifiers due to its ability to learn from large feature spaces and seamlessly integrate multi-class classification, as well as the achieved accuracy and processing efficiency. However, as many other classifiers, RF requires domain adaptation (DA) provided that there is a mismatch between the training (source) and testing (target) domains which provokes classification degradation. Consequently, different RF-DA methods have been proposed, which not only require target-domain samples but revisiting the source-domain ones, too. As novelty, we propose three inherently different methods (Node-Adapt, Path-Adapt and Tree-Adapt) that only require the learned source-domain RF and a relatively few target-domain samples for DA, i.e. source-domain samples do not need to be available. To assess the performance of our proposals we focus on image-based object detection, using the pedestrian detection problem as challenging proof-of-concept. Moreover, we use the RF with expert nodes because it is a competitive patch-based pedestrian model. We test our Node-, Path- and Tree-Adapt methods in standard benchmarks, showing that DA is largely achieved.
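
The following sketch illustrates the general model-transfer setting only, not the paper's Node-, Path-, or Tree-Adapt procedures: keep the structure of a random forest trained on the source domain and re-estimate its leaf class distributions from a handful of labelled target-domain samples (scikit-learn, with integer-coded labels assumed).

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Generic illustration of model-transfer DA for a forest (NOT the paper's
# Node-/Path-/Tree-Adapt): the source-trained tree structures are kept, and
# only the leaf class posteriors are re-estimated from target-domain samples.

def adapt_leaf_distributions(forest, X_target, y_target, n_classes, smoothing=1.0):
    """Return, per tree, a (n_nodes, n_classes) table of adapted leaf posteriors.
    y_target is assumed to be integer-coded in 0..n_classes-1."""
    adapted = []
    for tree in forest.estimators_:
        leaves = tree.apply(X_target)                        # leaf index of each target sample
        counts = np.full((tree.tree_.node_count, n_classes), smoothing)
        for leaf, label in zip(leaves, y_target):
            counts[leaf, label] += 1.0
        adapted.append(counts / counts.sum(axis=1, keepdims=True))
    return adapted

def predict_adapted(forest, adapted, X):
    """Average the adapted leaf posteriors over all trees and take the argmax."""
    probs = np.zeros((X.shape[0], adapted[0].shape[1]))
    for tree, table in zip(forest.estimators_, adapted):
        probs += table[tree.apply(X)]
    return (probs / len(forest.estimators_)).argmax(axis=1)

# Usage sketch:
#   forest = RandomForestClassifier(n_estimators=100).fit(X_source, y_source)
#   tables = adapt_leaf_distributions(forest, X_target_few, y_target_few, n_classes=2)
#   y_pred = predict_adapted(forest, tables, X_test_target)
```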


Tuning Recurrent Neural Networks with Reinforcement Learning

Natasha Jaques, Shixiang Gu, Richard E. Turner, Douglas Eck

Sequence models can be trained using supervised learning and a next-step prediction objective. This approach, however, suffers from known failure modes. For example, it is notoriously difficult to ensure multi-step generated sequences have coherent global structure. Motivated by the fact that reinforcement learning (RL) can be used to impose arbitrary properties on generated data by choosing appropriate reward functions, in this paper we propose a novel approach for sequence training which combines Maximum Likelihood (ML) and RL training. We refine a sequence predictor by optimizing for some imposed reward functions, while maintaining good predictive properties learned from data. We propose efficient ways to solve this by augmenting deep Q-learning with a cross-entropy reward and deriving novel off-policy methods for RNNs from stochastic optimal control (SOC). We explore the usefulness of our approach in the context of music generation. An LSTM is trained on a large corpus of songs to predict the next note in a musical sequence. This Note-RNN is then refined using RL, where the reward function is a combination of rewards based on rules of music theory, as well as the output of another trained Note-RNN. We show that by combining ML and RL, this RL Tuner method can not only produce more pleasing melodies, but that it can significantly reduce unwanted behaviors and failure modes of the RNN.
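
The key ingredient the abstract describes is the reward: a mix of hand-written music-theory rules and the log-probability that the pretrained Note-RNN assigns to the chosen note. A minimal sketch of such a combined reward is below; the specific rules and the weight c are illustrative placeholders rather than the paper's exact design.

```python
# Sketch of the combined reward used to refine the Note-RNN with RL: a weighted
# sum of the pretrained Note-RNN's log-probability for the chosen note and a
# reward from music-theory rules. The rules and the weight c are toy placeholders.

C_MAJOR = {0, 2, 4, 5, 7, 9, 11}   # pitch classes of the C major scale

def theory_reward(prev_note, note):
    """Toy rules: stay in key, avoid large leaps and immediate repetition."""
    r = 1.0 if note % 12 in C_MAJOR else -1.0
    if prev_note is not None:
        if abs(note - prev_note) > 12:   # penalize leaps larger than an octave
            r -= 1.0
        if note == prev_note:            # discourage repeating the same note
            r -= 0.5
    return r

def combined_reward(note_rnn_log_prob, prev_note, note, c=0.5):
    """r = r_theory + c * log p_NoteRNN(note | history), fed to the Q-learning critic."""
    return theory_reward(prev_note, note) + c * note_rnn_log_prob

# Usage sketch: inside the Q-learning loop, after the agent picks `note`,
# query the frozen Note-RNN for log p(note | history) and pass it to
# combined_reward(...) as the immediate reward.
```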


Incremental Sequence Learning

Edwin D. de Jong

Deep learning research over the past years has shown that by increasing the scope or difficulty of the learning problem over time, increasingly complex learning problems can be addressed. We study incremental learning in the context of sequence learning, using generative RNNs in the form of multi-layer recurrent Mixture Density Networks. While the potential of incremental or curriculum learning to enhance learning is known, indiscriminate application of the principle does not necessarily lead to improvement, and it is essential therefore to know which forms of incremental or curriculum learning have a positive effect. This research contributes to that aim by comparing three instantiations of incremental or curriculum learning. We introduce Incremental Sequence Learning, a simple incremental approach to sequence learning. Incremental Sequence Learning starts out by using only the first few steps of each sequence as training data. Each time a performance criterion has been reached, the length of the parts of the sequences used for training is increased. We introduce and make available a novel sequence learning task and data set: predicting and classifying MNIST pen stroke sequences. We find that Incremental Sequence Learning greatly speeds up sequence learning and reaches the best test performance level of regular sequence learning 20 times faster, reduces the test error by 74%, and in general performs more robustly; it displays lower variance and achieves sustained progress after all three comparison methods have stopped improving. The other instantiations of curriculum learning do not result in any noticeable improvement. A trained sequence prediction model is also used in transfer learning to the task of sequence classification, where it is found that transfer learning realizes improved classification performance compared to methods that learn to classify from scratch.
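
The curriculum itself is simple enough to write down directly: train on the first few steps of every sequence, and lengthen the prefix each time a performance criterion is met. The skeleton below makes that schedule explicit; the starting length, increment, threshold, and the train_epoch/evaluate hooks are placeholders, not values from the paper.

```python
# Skeleton of the Incremental Sequence Learning schedule described in the
# abstract. train_epoch and evaluate are placeholders for the model-specific
# training and validation routines; all numeric settings are illustrative.

def incremental_sequence_learning(model, sequences, train_epoch, evaluate,
                                  start_len=2, step=2, loss_threshold=0.05,
                                  max_epochs=500):
    prefix_len = start_len
    max_len = max(len(s) for s in sequences)
    for epoch in range(max_epochs):
        truncated = [s[:prefix_len] for s in sequences]   # only the first steps are used
        train_epoch(model, truncated)
        loss = evaluate(model, truncated)
        if loss < loss_threshold and prefix_len < max_len:
            prefix_len = min(prefix_len + step, max_len)  # criterion met: reveal more of each sequence
    return model
```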


Heter-LP: A heterogeneous label propagation algorithm and its application in drug repositioning

Maryam Lotfi Shahreza, Nasser Ghadiri, Seyed Rasul Mossavi, Jaleh Varshosaz, James Green

Drug repositioning offers an effective solution to drug discovery, saving both time and resources by finding new indications for existing drugs. Typically, a drug takes effect via its protein targets in the cell. As a result, it is necessary for drug development studies to conduct an investigation into the interrelationships of drugs, protein targets, and diseases. Although previous studies have made a strong case for the effectiveness of integrative network-based methods for predicting these interrelationships, little progress has been achieved in this regard within drug repositioning research. Moreover, the interactions of new drugs and targets (lacking any known targets and drugs, respectively) cannot be accurately predicted by most established methods. In this paper, we propose a novel semi-supervised heterogeneous label propagation algorithm named Heter-LP, which applies both local as well as global network features for data integration. To predict drug-target, disease-target, and drug-disease associations, we use information about drugs, diseases, and targets as collected from multiple sources at different levels. Our algorithm integrates these various types of data into a heterogeneous network and implements a label propagation algorithm to find new interactions. Statistical analyses of 10-fold cross-validation results and experimental analysis support the effectiveness of the proposed algorithm.
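
Heter-LP integrates several similarity networks, but its basic operation is label propagation on a graph. The sketch below shows the standard single-graph update F ← αSF + (1 − α)Y with a symmetrically normalized similarity matrix; it is a generic building block, not the paper's full heterogeneous algorithm.

```python
import numpy as np

# Standard label propagation on a single similarity graph. Heterogeneous
# methods such as Heter-LP apply this kind of update across an integrated
# drug/target/disease network; this is only the generic building block.

def label_propagation(W, Y, alpha=0.8, iters=100):
    """W: (n, n) non-negative similarity matrix; Y: (n, c) initial labels/associations."""
    d = W.sum(axis=1)
    d[d == 0] = 1.0                            # avoid dividing by zero for isolated nodes
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    S = D_inv_sqrt @ W @ D_inv_sqrt            # symmetrically normalized graph
    F = Y.astype(float)
    for _ in range(iters):
        F = alpha * S @ F + (1.0 - alpha) * Y  # propagate, anchored to the known labels
    return F                                   # scores for candidate (e.g. drug-disease) links
```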


Learning Bound for Parameter Transfer Learning

Wataru Kumagai

We consider a transfer-learning problem by using the parameter transfer approach, where a suitable parameter of feature mapping is learned through one task and applied to another objective task. Then, we introduce the notion of the local stability and parameter transfer learnability of parametric feature mapping, and thereby derive a learning bound for parameter transfer algorithms. As an application of parameter transfer learning, we discuss the performance of sparse coding in self-taught learning. Although self-taught learning algorithms with plentiful unlabeled data often show excellent empirical performance, their theoretical analysis has not been studied. In this paper, we also provide the first theoretical learning bound for self-taught learning.
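
Without the paper's exact theorem, the general shape such results take can still be sketched: the target-task excess risk splits into an ordinary estimation term and a term controlled by how far the transferred feature parameter is from an ideal one, with the local stability assumption supplying the Lipschitz-type constant. The template below is purely illustrative, not the bound proved in the paper.

```latex
% Illustrative template only (not the paper's theorem). Here \hat{\theta} is the
% feature-map parameter transferred from the source task, \theta^* an ideal
% parameter, (\hat{w}, w^*) the learned and optimal predictors on top of the
% features, L a Lipschitz-type constant coming from the local stability
% assumption, and n the number of labelled target samples.
\[
  R\big(\hat{w}, \hat{\theta}\big) - R\big(w^*, \theta^*\big)
  \;\le\;
  \underbrace{O\!\Big(\sqrt{\tfrac{\log(1/\delta)}{n}}\Big)}_{\text{estimation on the target task}}
  \;+\;
  \underbrace{L\,\big\lVert \hat{\theta} - \theta^* \big\rVert}_{\text{parameter-transfer error via local stability}}
  \qquad \text{with probability at least } 1-\delta .
\]
```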


Adversarial Ladder Networks

Juan Maroñas Molano, Alberto Albiol Colomer, Roberto Paredes Palacios

The use of unsupervised data in addition to supervised data in training discriminative neural networks has improved the performance of this classification scheme. However, the best results were achieved with a training process that is divided into two parts: first an unsupervised pre-training step is done for initializing the weights of the network, and afterwards these weights are refined with the use of supervised data. On the other hand, adversarial noise has improved the results of classical supervised learning. Recently, a new neural network topology called the Ladder Network, whose key idea is based on some properties of hierarchical latent variable models, has been proposed as a technique to train a neural network using supervised and unsupervised data at the same time, in what is called semi-supervised learning. This technique has reached state-of-the-art classification performance. In this work we add adversarial noise to the ladder network and obtain state-of-the-art classification, along with several important conclusions on how adversarial noise can help, as well as new possible lines of investigation. We also propose an alternative way to add adversarial noise to unsupervised data.
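
A standard way to produce the adversarial noise mentioned here is the fast gradient sign method (FGSM): perturb each input a small step in the direction that increases the loss. The abstract does not state which perturbation the authors use, so the PyTorch sketch below is illustrative; model and loss_fn are placeholders for the classifier and its training loss.

```python
import torch

# FGSM perturbation, a standard way of creating adversarial noise for a
# supervised loss. Illustrative sketch only; the paper's exact noise injection
# into the ladder network may differ. `model` and `loss_fn` are placeholders.

def fgsm_perturb(model, loss_fn, x, y, epsilon=0.05):
    x = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x), y)
    loss.backward()
    # Move each input a small step in the direction that increases the loss.
    return (x + epsilon * x.grad.sign()).detach()

# For unlabeled data (the semi-supervised case), one natural option -- not
# necessarily the paper's proposed alternative -- is to perturb against an
# unsupervised objective such as the ladder network's denoising loss instead.
```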


Tools

Deep Learning from Scratch (1): The Perceptron

Link: Deep Learning from Scratch (1): The Perceptron

【New ideas in neural network research and applications】

《The role of novel ideas in neural network research & applications》

Link: The role of novel ideas in neural network research & applications

【New perspectives on adversarial training】

《New Perspectives on Adversarial Training (NIPS 2016 Adversarial Training Workshop)》by Ferenc Huszár

Link: New Perspectives on Adversarial Training

“GANs之父”Goodfellow 38分钟视频亲授:如何完善生成对抗网络?(上)

链接:如何完善生成对抗网络

《Fregata, a lightweight library for large-scale machine learning, goes open source: fast, with no parameter tuning required》

Link: Fregata, a lightweight library for large-scale machine learning, goes open source: fast, with no parameter tuning required


Other

An Overview of Recommendation Algorithms

In practice, we will probably need to test several algorithms to find the one that best suits our users; understanding the concepts behind these algorithms and how they work, and forming an intuitive picture of them, is very helpful.

Link: An Overview of Recommendation Algorithms

Research on Resource Allocation and Control Techniques Based on Cognitive Networks

W. Thomas defines a cognitive network as follows: a cognitive network is a network with a cognitive process that can perceive current network conditions and then plan, decide, and act on those conditions.

Link: Research on Resource Allocation and Control Techniques Based on Cognitive Networks

Research on Reliable Image Transmission Protocols over Tactical Data Links

As the most intuitive form of information, images are becoming increasingly important for carrying battlefield tactical information, e.g. radar imaging, infrared imagery, and visible-light imagery.

Link: Research on Reliable Image Transmission Protocols over Tactical Data Links
