Awesome Papers: 2017-01-2

Why and When Can Deep – but Not Shallow – Networks Avoid the Curse of Dimensionality: a Review

Tomaso Poggio, Hrushikesh Mhaskar, Lorenzo Rosasco, Brando Miranda, Qianli Liao

The paper reviews and extends an emerging body of theoretical results on deep learning, including the conditions under which it can be exponentially better than shallow learning. A class of deep convolutional networks represents an important special case of these conditions, though weight sharing is not the main reason for their exponential advantage. Implications of a few key theorems are discussed, together with new results, open problems, and conjectures.


Time Series Classification from Scratch with Deep Neural Networks: A Strong Baseline

Zhiguang Wang, Weizhong Yan, Tim Oates

We propose a simple but strong baseline for time series classification from scratch with deep neural networks. Our proposed baseline models are pure end-to-end, without any heavy preprocessing of the raw data or feature crafting. The fully convolutional network (FCN) achieves performance on par with, or better than, other state-of-the-art approaches. Our exploration of very deep neural networks with the ResNet structure achieves competitive performance under the same simple experimental settings. The simple MLP baseline is also comparable to 1NN-DTW, the previous gold-standard baseline. Our models provide a simple choice for real-world applications and a good starting point for future research. An overall analysis is provided to discuss the generalization of our models, learned features, network structures, and the classification semantics.
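
The FCN baseline is a stack of convolutional blocks followed by global average pooling and a linear classifier. Below is a minimal PyTorch sketch of such a baseline; the filter counts (128/256/128) and kernel sizes (8/5/3) follow the architecture commonly cited for this paper, while the padding choice, class count, and input shape are illustrative assumptions.

```python
# Minimal sketch of an FCN-style baseline for univariate time series
# classification. Block widths/kernels follow the paper's description;
# the training loop and hyperparameters are left out.
import torch
import torch.nn as nn

class FCNBaseline(nn.Module):
    def __init__(self, n_classes: int):
        super().__init__()
        def block(c_in, c_out, k):
            # Conv -> BatchNorm -> ReLU, with roughly length-preserving padding
            return nn.Sequential(
                nn.Conv1d(c_in, c_out, kernel_size=k, padding=k // 2),
                nn.BatchNorm1d(c_out),
                nn.ReLU(),
            )
        self.features = nn.Sequential(
            block(1, 128, 8),
            block(128, 256, 5),
            block(256, 128, 3),
        )
        self.classifier = nn.Linear(128, n_classes)

    def forward(self, x):  # x: (batch, 1, series_length)
        h = self.features(x)
        h = h.mean(dim=-1)          # global average pooling over time
        return self.classifier(h)   # raw logits; pair with CrossEntropyLoss

model = FCNBaseline(n_classes=5)
logits = model(torch.randn(8, 1, 96))  # e.g. a batch of 8 series of length 96
```

Global average pooling is what lets the same network handle series of different lengths.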


Memory Lens: How Much Memory Does an Agent Use?

Christoph Dann, Katja Hofmann, Sebastian Nowozin

We propose a new method to study the internal memory used by reinforcement learning policies. We estimate the amount of relevant past information by estimating the mutual information between behavior histories and the current action of an agent. We perform this estimation in the passive setting, that is, we do not intervene but merely observe the natural behavior of the agent. Moreover, we provide a theoretical justification for our approach by showing that it yields an implementation-independent lower bound on the minimal memory capacity of any agent that implements the observed policy. We demonstrate our approach by estimating the use of memory of DQN policies on concatenated Atari frames, revealing sharply different use of memory across 49 games. The study of memory as information that flows from the past to the current action opens avenues to understand and improve successful reinforcement learning algorithms.
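
The core quantity is a mutual information between (features of) the behavior history and the current action, estimated from passively logged trajectories. The following toy plug-in estimator over discrete history features and actions illustrates the idea; it is not the paper's exact estimator, and the example trajectory is invented.

```python
# Illustrative plug-in estimate of I(H_k; A) between a discretized history
# feature H_k and the agent's action A, from passively observed pairs.
from collections import Counter
import math

def mutual_information(pairs):
    """pairs: iterable of (history_feature, action), both hashable."""
    n = 0
    joint, ph, pa = Counter(), Counter(), Counter()
    for h, a in pairs:
        joint[(h, a)] += 1
        ph[h] += 1
        pa[a] += 1
        n += 1
    mi = 0.0
    for (h, a), c in joint.items():
        # p(h,a) * log( p(h,a) / (p(h) p(a)) ), with counts cancelled against n
        mi += (c / n) * math.log(c * n / (ph[h] * pa[a]))
    return mi  # in nats

# Example: the action perfectly tracks the previous observation,
# so roughly one bit (log 2 nats) of past information is used.
trajectory = [("left", 0), ("left", 0), ("right", 1), ("right", 1)] * 50
print(mutual_information(trajectory))  # ~0.693
```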


VIME: Variational Information Maximizing Exploration

Rein Houthooft, Xi Chen, Yan Duan, John Schulman, Filip De Turck, Pieter Abbeel

Scalable and effective exploration remains a key challenge in reinforcement learning (RL). While there are methods with optimality guarantees in the setting of discrete state and action spaces, these methods cannot be applied in high-dimensional deep RL scenarios. As such, most contemporary RL relies on simple heuristics such as epsilon-greedy exploration or adding Gaussian noise to the controls. This paper introduces Variational Information Maximizing Exploration (VIME), an exploration strategy based on maximization of information gain about the agent’s belief of environment dynamics. We propose a practical implementation, using variational inference in Bayesian neural networks which efficiently handles continuous state and action spaces. VIME modifies the MDP reward function, and can be applied with several different underlying RL algorithms. We demonstrate that VIME achieves significantly better performance compared to heuristic exploration methods across a variety of continuous control tasks and algorithms, including tasks with very sparse rewards.

Code: https://github.com/openai/vime
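
VIME's key move is to add an intrinsic bonus to the extrinsic reward: the information gain about the dynamics model's parameters, measured as a KL divergence between the variational posterior after and before seeing a transition. The sketch below uses a toy diagonal-Gaussian posterior with a closed-form KL; the Bayesian-neural-network posterior of the real method lives in the repo above, and `eta` and the numbers here are illustrative.

```python
# Toy sketch of VIME-style reward shaping with a diagonal-Gaussian posterior.
import numpy as np

def diag_gaussian_kl(mu_q, sig_q, mu_p, sig_p):
    """KL( N(mu_q, diag sig_q^2) || N(mu_p, diag sig_p^2) )."""
    return np.sum(
        np.log(sig_p / sig_q)
        + (sig_q**2 + (mu_q - mu_p) ** 2) / (2.0 * sig_p**2)
        - 0.5
    )

def augmented_reward(extrinsic_r, post_before, post_after, eta=0.1):
    # Intrinsic bonus = information gained about the dynamics parameters
    mu0, sig0 = post_before
    mu1, sig1 = post_after
    info_gain = diag_gaussian_kl(mu1, sig1, mu0, sig0)
    return extrinsic_r + eta * info_gain

before = (np.zeros(4), np.ones(4))
after = (0.1 * np.ones(4), 0.9 * np.ones(4))  # posterior moved after the transition
print(augmented_reward(0.0, before, after))   # positive bonus even with zero reward
```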


Generative Adversarial Imitation Learning

Jonathan Ho, Stefano Ermon

Consider learning a policy from example expert behavior, without interaction with the expert or access to reinforcement signal. One approach is to recover the expert’s cost function with inverse reinforcement learning, then extract a policy from that cost function with reinforcement learning. This approach is indirect and can be slow. We propose a new general framework for directly extracting a policy from data, as if it were obtained by reinforcement learning following inverse reinforcement learning. We show that a certain instantiation of our framework draws an analogy between imitation learning and generative adversarial networks, from which we derive a model-free imitation learning algorithm that obtains significant performance gains over existing model-free methods in imitating complex behaviors in large, high-dimensional environments.

Code: https://github.com/openai/imitation
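
The adversarial core of the method is a discriminator trained to separate expert (state, action) pairs from policy pairs, whose output then serves as a surrogate reward for a policy-gradient learner (TRPO in the paper). A minimal PyTorch sketch of just that core follows; network sizes are arbitrary, sign conventions for the reward vary across implementations, and the policy update itself is omitted.

```python
# Discriminator + surrogate reward, the GAN-like half of the algorithm.
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    def __init__(self, obs_dim, act_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, 1),
        )

    def forward(self, obs, act):
        return self.net(torch.cat([obs, act], dim=-1))  # logits

def discriminator_step(disc, opt, expert_sa, policy_sa):
    """One update: push expert pairs toward 1, policy pairs toward 0."""
    bce = nn.BCEWithLogitsLoss()
    loss = bce(disc(*expert_sa), torch.ones(len(expert_sa[0]), 1)) + \
           bce(disc(*policy_sa), torch.zeros(len(policy_sa[0]), 1))
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

def surrogate_reward(disc, obs, act):
    # Large where the discriminator thinks the pair looks expert-like;
    # used as the reward signal for the policy optimizer.
    with torch.no_grad():
        return -torch.log(1.0 - torch.sigmoid(disc(obs, act)) + 1e-8)
```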


TF.Learn: TensorFlow’s High-level Module for Distributed Machine Learning

Yuan Tang

TF.Learn is a high-level Python module for distributed machine learning inside TensorFlow. It provides an easy-to-use Scikit-learn-style interface to simplify the process of creating, configuring, training, evaluating, and experimenting with machine learning models. TF.Learn integrates a wide range of state-of-the-art machine learning algorithms built on top of TensorFlow's low-level APIs for small- to large-scale supervised and unsupervised problems. The module targets both non-specialists, who can bring machine learning into their work through a general-purpose high-level language, and researchers who want to implement, benchmark, and compare their new methods in a structured environment. Emphasis is put on ease of use, performance, documentation, and API consistency.
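
For flavor, here is a small usage sketch in the Scikit-learn style the paper describes, roughly as the API looked in the tf.contrib.learn module of TensorFlow releases contemporary with this paper (it later evolved into tf.estimator). The data is synthetic and the hyperparameters are arbitrary.

```python
# Scikit-learn-style fit/evaluate workflow with TF.Learn's DNNClassifier.
import numpy as np
from tensorflow.contrib import learn

x_train = np.random.rand(100, 4).astype(np.float32)   # 100 samples, 4 features
y_train = np.random.randint(0, 3, size=100)           # 3 classes

feature_columns = learn.infer_real_valued_columns_from_input(x_train)
classifier = learn.DNNClassifier(feature_columns=feature_columns,
                                 hidden_units=[16, 16], n_classes=3)
classifier.fit(x_train, y_train, steps=200)            # sklearn-style fit(...)
print(classifier.evaluate(x=x_train, y=y_train))       # {'accuracy': ..., 'loss': ...}
```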


End-to-End Deep Reinforcement Learning for Lane Keeping Assist

Ahmad El Sallab, Mohammed Abdou, Etienne Perot and Senthil Yogamani

Reinforcement learning is considered a strong AI paradigm for teaching machines through interaction with the environment and learning from their mistakes, but it has not yet been successfully used for automotive applications. There has recently been a revival of interest in the topic, however, driven by the ability of deep learning algorithms to learn good representations of the environment. Motivated by Google DeepMind's successful demonstrations of learning for games from Breakout to Go, we propose different methods for autonomous driving using deep reinforcement learning. This is of particular interest because autonomous driving is difficult to pose as a supervised learning problem: the car interacts strongly with its environment, including other vehicles, pedestrians, and roadworks. As this is a relatively new area of research for autonomous driving, we formulate two main categories of algorithms: 1) discrete actions and 2) continuous actions. For the discrete-action category we use the Deep Q-Network algorithm (DQN), while for the continuous-action category we use the Deep Deterministic Actor-Critic algorithm (DDAC). We also evaluate the performance of both categories on TORCS (The Open Racing Car Simulator), an open-source car racing simulator. Our simulation results demonstrate learning of autonomous maneuvering in scenarios with complex road curvatures and simple interaction with other vehicles. Finally, we explain how restrictions placed on the car during the learning phase affect the time it takes to converge.
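
For the discrete-action category, the workhorse is the standard DQN Bellman-target update over a small set of steering actions. The sketch below shows only that step and assumes a Q-network, a target network, and a replay batch exist elsewhere; it is a generic DQN fragment, not the authors' code.

```python
# One DQN loss computation: target = r + gamma * max_a' Q_target(s', a').
import torch

def dqn_loss(q_net, target_net, batch, gamma=0.99):
    """batch: (obs, action, reward, next_obs, done); action is a LongTensor,
    done is 1.0 for terminal transitions."""
    obs, action, reward, next_obs, done = batch
    with torch.no_grad():
        next_q = target_net(next_obs).max(dim=1).values   # max over actions
        target = reward + gamma * (1.0 - done) * next_q   # Bellman target
    q = q_net(obs).gather(1, action.unsqueeze(1)).squeeze(1)  # Q(s, a taken)
    return torch.nn.functional.mse_loss(q, target)
```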


Neuro-symbolic representation learning on biological knowledge graphs

Mona Alshahrani, Mohammed Asif Khan, Omar Maddouri, Akira R Kinjo, Núria Queralt-Rosinach, Robert Hoehndorf

Motivation: Biological data and knowledge bases increasingly rely on Semantic Web technologies and on knowledge graphs for data integration, retrieval, and federated queries. In recent years, feature learning methods applicable to graph-structured data have become available, but they have not yet been widely applied and evaluated on structured biological knowledge.

Results: We develop a novel method for feature learning on biological knowledge graphs. Our method combines symbolic methods, in particular knowledge representation using symbolic logic and automated reasoning, with neural networks to generate embeddings of nodes that encode related information within knowledge graphs. Through the use of symbolic logic, these embeddings contain both explicit and implicit information. We apply these embeddings to the prediction of edges in the knowledge graph, representing problems such as function prediction, finding candidate disease genes, protein-protein interactions, and drug-target relations, and demonstrate performance that matches and sometimes outperforms traditional approaches based on manually crafted features. Our method can be applied to any biological knowledge graph, and will thereby open up the growing number of Semantic Web based knowledge bases in biology to machine learning and data analytics.
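
One common way to realize feature learning on a knowledge graph of this kind is to generate random walks over the (ideally reasoner-completed) graph and feed them, as sentences of node and edge labels, to a skip-gram model. The sketch below is a generic DeepWalk-style illustration with invented toy triples, not the authors' exact pipeline, and assumes the gensim library.

```python
# Random walks over a toy knowledge graph, embedded with skip-gram.
import random
from gensim.models import Word2Vec

triples = [
    ("geneA", "associated_with", "disease1"),
    ("geneA", "interacts_with", "geneB"),
    ("geneB", "associated_with", "disease1"),
]
out_edges = {}
for s, p, o in triples:
    out_edges.setdefault(s, []).append((p, o))

def random_walk(start, length=8):
    walk, node = [start], start
    for _ in range(length):
        if node not in out_edges:
            break
        p, o = random.choice(out_edges[node])
        walk += [p, o]            # keep edge labels in the "sentence"
        node = o
    return walk

corpus = [random_walk(n) for n in list(out_edges) for _ in range(20)]
model = Word2Vec(corpus, vector_size=32, window=3, min_count=1, sg=1)
print(model.wv["geneA"][:5])      # node embedding, usable as input features
```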


DeepCancer: Detecting Cancer through Gene Expressions via Deep Generative Learning

Rajendra Rana Bhat, Vivek Viswanath, Xiaolin Li

Transcriptional profiling on microarrays to obtain gene expressions has been used to facilitate cancer diagnosis. We propose a deep generative machine learning architecture (called DeepCancer) that learns features from unlabeled microarray data. These models are used in conjunction with conventional classifiers that classify tissue samples as either cancerous or non-cancerous. The proposed model has been tested on two different clinical datasets. The evaluation demonstrates that the DeepCancer model achieves a very high precision score while significantly controlling the false-positive and false-negative rates.
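
The pipeline the abstract describes is two-stage: learn features from unlabeled expression profiles with an unsupervised model, then train a conventional classifier on those features. The schematic sketch below keeps that structure but substitutes a truncated SVD for the deep generative model so it stays self-contained; the data and labels are synthetic.

```python
# Two-stage pipeline: unsupervised feature learning, then a classifier.
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.linear_model import LogisticRegression

X = np.abs(np.random.randn(200, 500))   # 200 samples x 500 genes (toy data)
y = np.random.randint(0, 2, size=200)   # cancerous / non-cancerous labels

features = TruncatedSVD(n_components=32).fit_transform(X)   # unsupervised stage
clf = LogisticRegression(max_iter=1000).fit(features, y)    # supervised stage
print(clf.score(features, y))
```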


Reinforcement Learning With Temporal Logic Rewards

Xiao Li, Cristian-Ioan Vasile and Calin Belta

The reward function plays a critical role in reinforcement learning (RL). It is where designers specify the desired behavior and impose important constraints on the system. While most reward functions in the current RL literature are quite simple and suit relatively simple tasks, real-world applications typically involve tasks that are logically more complex. In this paper we take advantage of the expressive power of temporal logic (TL) to express complex rules the system should follow and to incorporate domain knowledge into learning. We propose a TL that we argue is suitable for robotic task specification in RL. We also present the concept of one-step robustness, and show that the sparse-reward problem that arises when a TL specification is used as the reward function can be alleviated while leaving the resulting policy invariant. Simulated manipulation tasks illustrate RL with the proposed TL and the one-step robustness.
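
To make the reward idea concrete: quantitative TL semantics assign each predicate a signed robustness value (how deeply it is satisfied or violated), and temporal operators combine these values over time, e.g. a max over the trace for "eventually". The toy example below uses these standard semantics for a simple reach specification; the paper's TL and its one-step robustness are more elaborate.

```python
# Robustness of "eventually x in [lo, hi]" over a trace, usable as a reward.
def rho_inside(x, lo, hi):
    # Positive iff x is inside [lo, hi]; magnitude = distance to the boundary.
    return min(x - lo, hi - x)

def rho_eventually(trace, predicate):
    # "Eventually phi" is satisfied if phi holds at some step: take the max.
    return max(predicate(x) for x in trace)

trace = [0.0, 0.4, 0.9, 1.3]
reward = rho_eventually(trace, lambda x: rho_inside(x, 1.0, 2.0))
print(reward)  # 0.3: the trajectory eventually gets 0.3 deep into [1, 2]
```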


Learning to Drive using Inverse Reinforcement Learning and Deep Q-Networks

Sahand Sharifzadeh, Ioannis Chiotellis, Rudolph Triebel, Daniel Cremers

We propose an inverse reinforcement learning (IRL) approach that uses Deep Q-Networks to extract rewards in problems with large state spaces. We evaluate the performance of this approach in a simulation-based autonomous driving scenario. The extracted rewards reflect the intuitive relation between the reward function and the readings of distance sensors mounted at different poses on the car. We also show that, after a few learning rounds, our simulated agent generates collision-free motions and performs human-like lane-change behaviour.
