Awesome Papers: 2016-09-1

Distributed Bayesian Learning with Stochastic Natural-gradient Expectation Propagation and the Posterior Server

Leonard Hasenclever, Stefan Webb, Thibaut Lienart, Yee Whye Teh, Sebastian Vollmer, Balaji Lakshminarayanan, Charles Blundell (Submitted on 31 Dec 2015 (v1), last revised 1 Sep 2016 (this version, v2)) This paper makes two contributions to Bayesian machine learning algorithms. Firstly, we propose stochastic natural gradient expectation propagation (SNEP), a novel alternative to expectation propagation (EP), a popular variational inference algorithm. SNEP is a black box variational algorithm, in that it does not require any simplifying assumptions on the distribution of interest, beyond the existence of some Monte Carlo sampler for estimating the moments of the EP tilted distributions. Further, as opposed to EP which has no guarantee of convergence, SNEP can be shown to be convergent, even when using Monte Carlo moment estimates. Secondly, we propose a novel architecture for distributed Bayesian learning which we call the posterior server. The posterior server allows scalable and robust Bayesian learning in cases where a dataset is stored in a distributed manner across a cluster, with each compute node containing a disjoint subset of data. An independent Monte Carlo sampler is run on each compute node, with direct access only to the local data subset, but which targets an approximation to the global posterior distribution given all data across the whole cluster. This is achieved by using a distributed asynchronous implementation of SNEP to pass messages across the cluster. We demonstrate SNEP and the posterior server on distributed Bayesian learning of logistic regression and neural networks.

随机自然梯度期望传播和后服务器的分布式贝叶斯学习

本文对贝叶斯机器学习算法做了两个创新。首先,我们提出了随机的自然梯度期望传播(SNEP),一个可替代期望传播(EP)的新方法,一种流行的变分推理算法。SNEP是一个黑盒子变分算法,因为它在利益分配上不需要任何的简化假设,除了一些蒙特卡罗抽样估计的极压倾斜分布的时刻的存在。此外,相对于EP没有保证收敛,即使使用Monte Carlo时刻估计,SNEP也可以被证明是收敛的。其次,我们提出了一种新的架构,分布式贝叶斯学习,我们称之为后验服务器。后服务器允许可扩展性和强大的贝叶斯学习的情况下,一个数据集以一个分布式的方式存储在一个集群,每个计算节点包含一个不相交的数据子集。一个独立的蒙特卡洛采样运行在每个计算节点上,并且只对本地数据的子集直接访问,但它的目标是一个近似给出整个集群所有数据的全局后验分布。这是通过使用一个分布式的异步SNEP在集群上传递消息来实现的。我们展示了SNEP和分布式贝叶斯Logistic回归以及神经网络学习上的后服务器。


A novel progressive learning technique for multi-class classification

Authors: Rajasekar Venkatesan, Meng Joo Er (Submitted on 1 Sep 2016) In this paper, a progressive learning technique for multi-class classification is proposed. This newly developed learning technique is independent of the number of class constraints and it can learn new classes while still retaining the knowledge of previous classes. Whenever a new class (non-native to the knowledge learnt thus far) is encountered, the neural network structure gets remodeled automatically by facilitating new neurons and interconnections, and the parameters are calculated in such a way that it retains the knowledge learnt thus far. This technique is suitable for real-world applications where the number of classes is often unknown and online learning from real-time data is required. The consistency and the complexity of the progressive learning technique are analyzed. Several standard datasets are used to evaluate the performance of the developed technique. A comparative study shows that the developed technique is superior.

一种新的多类分类的渐进式学习技术

本文提出了一种多类分类的渐进式学习技术。这种新开发的学习技术是独立的类约束的数量,它可以学习新的类同时还保留了以前的类的知识。每当遇到一个新的类(到目前为止所学到的知识的非母语)时,通过促进新神经元和相互连接神经网络结构自动得到改造,并以这样一种参数计算方式,它保留了迄今为止所学的知识。这种技术适用于现实世界的应用程序的,因为类的数量往往是未知的而且需要实时数据的在线学习。我们分析了这种渐进式学习技术的一致性和复杂性。几个标准的数据集是用于评估这种先进技术的性能。比较研究表明,这种先进技术是品质优良的的。


Employing traditional machine learning algorithms for big data streams analysis: the case of object trajectory prediction

Angelos Valsamis, Konstantinos Tserpes, Dimitrios Zissis, Dimosthenis Anagnostopoulos, Theodora Varvarigou (Submitted on 1 Sep 2016) In this paper, we model the trajectory of sea vessels and provide a service that predicts in near-real time the position of any given vessel in 4’, 10’, 20’ and 40’ time intervals. We explore the necessary tradeoffs between accuracy, performance and resource utilization are explored given the large volume and update rates of input data. We start with building models based on well-established machine learning algorithms using static datasets and multi-scan training approaches and identify the best candidate to be used in implementing a single-pass predictive approach, under real-time constraints. The results are measured in terms of accuracy and performance and are compared against the baseline kinematic equations. Results show that it is possible to efficiently model the trajectory of multiple vessels using a single model, which is trained and evaluated using an adequately large, static dataset, thus achieving a significant gain in terms of resource usage while not compromising accuracy.

用传统机器学习算法来分析大数据流——以目标轨迹预测为例

在本文中,我们模拟的海上船只的轨迹并提供一个服务,预测在近实时情况下任何给定船只在4’,10’,20’和40’的时间间隔下的位置。我们在精度和性能之间进行了必要的权衡,并探讨大量的输入数据和更新率的资源利用率。我们开始在已建好的机器学习算法基础上建造模型,使用静态数据集和多扫描训练方法,来确定在实时约束情况下被用来实施一个单一通过预测方法的最佳候选者。结果是根据测量的准确性和性能来得到的,并和基线运动学方程进行比较。结果表明,使用一个单一的模型来有效地模仿多个船只的轨迹是可能的,可以通过使用一个足够大的静态数据集来训练和评估,根据资源的使用情况而不是妥协的准确性,从而实现了显着的增益。


A Unified View of Multi-Label Performance Measures

Xi-Zhu Wu, Zhi-Hua Zhou (Submitted on 1 Sep 2016) Multi-label classification deals with the problem where each instance is associated with multiple class labels. Because evaluation in multi-label classification is more complicated than single-label setting, a number of performance measures have been proposed. It is noticed that an algorithm usually performs differently on different measures. Therefore, it is important to understand which algorithms perform well on which measure(s) and why. In this paper, we propose a unified margin view to revisit eleven performance measures in multi-label classification. In particular, we define label-wise margin and instance-wise margin, and prove that through maximizing these margins, different corresponding performance measures will be optimized. Based on the defined margins, a max-margin approach called LIMO is designed and empirical results verify our theoretical findings.

多标签性能措施的统一视图

多标签分类处理的问题其中每个实例都与多个类标签有关。由于多标签分类的评价比单标签设置更为复杂,大家已经提出了一些性能措施。我们注意到,一个算法在不同的措施经常会有不同的表现。因此,了解哪些算法在哪种措施下执行的最好以及为什么就变得尤为重要。在本文中,我们提出了一个统一的边缘视图,重新审视在多标签分类下的十一个性能表现。特别是,我们定义了明智标签的范围和明智实例的范围,并证明,通过最大限度地提高这些范围,不同的相应性能措施将被优化。基于定义的空间,接近最大范围的被称为LIMO,并且实证结果验证了我们的理论结果。


Importance weighting without importance weights: An efficient algorithm for combinatorial semi-bandits

Gergely Neu, Gábor Bartók We propose a sample-efficient alternative for importance weighting for situations where one only has sample access to the probability distribution that generates the observations. Our new method, called Geometric Resampling (GR), is described and analyzed in the context of online combinatorial optimization under semi-bandit feedback, where a learner sequentially selects its actions from a combinatorial decision set so as to minimize its cumulative loss. In particular, we show that the well-known Follow-the-Perturbed-Leader (FPL) prediction method coupled with Geometric Resampling yields the first computationally efficient reduction from offline to online optimization in this setting. We provide a thorough theoretical analysis for the resulting algorithm, showing that its performance is on par with previous, inefficient solutions. Our main contribution is showing that, despite the relatively large variance induced by the GR procedure, our performance guarantees hold with high probability rather than only in expectation. As a side result, we also improve the best known regret bounds for FPL in online combinatorial optimization with full feedback, closing the perceived performance gap between FPL and exponential weights in this sett。

无重要性权重的重要性权重:一种组合半土匪的有效算法

在只有唯一一个产生观察资料访概率分布的样本情况下,我们提出了一个重要权重的有效替代样本。我们的新方法,称为几何重采样(GR),描述和分析了在半匪反馈背景下的在线组合优化,一个学习者可以依次选择从组合决策集选择,以尽量减少其累积损失的行为。特别是我们表明,知名的后续扰动领袖(FPL)预测法结合了几何重采样率,在这种设置下是第一个有效减少从线下到线上优化的。我们对产生的算法提供了一个深入的理论分析,它的性能与之前的低效解决方案不相上下。我们的主要贡献是表明,尽管由GR程序引起比较大的差异,我们的性能保证有很高的概率,而不是仅仅的期望值。作为一个侧面结果,我们也提高了FPL在最知名的遗憾边界全反馈条件下的在线组合优化,关闭了FPL和指数权重在这个经纬下的感知性能缺口。


Business Process Deviance Mining: Review and Evaluation

Hoang Nguyen, Marlon Dumas, Marcello La Rosa, Fabrizio Maria Maggi, Suriadi Suriadi (Submitted on 29 Aug 2016) Business process deviance refers to the phenomenon whereby a subset of the executions of a business process deviate, in a negative or positive way, with respect to its expected or desirable outcomes. Deviant executions of a business process include those that violate compliance rules, or executions that undershoot or exceed performance targets. Deviance mining is concerned with uncovering the reasons for deviant executions by analyzing business process event logs. This article provides a systematic review and comparative evaluation of deviance mining approaches based on a family of data mining techniques known as sequence classification. Using real-life logs from multiple domains, we evaluate a range of feature types and classification methods in terms of their ability to accurately discriminate between normal and deviant executions of a process. We also analyze the interestingness of the rule sets extracted using different methods. We observe that feature sets extracted using pattern mining techniques only slightly outperform simpler feature sets based on counts of individual activity occurrences in a trace.

业务流程异常行为挖掘:回顾与评价

业务流程的异常行为指的是一个业务流程异常执行的一个子集的现象,以积极或者消极的方式,相对于其预期或期望的结果。异常执行业务流程包括违反遵守规则、或执行下冲或超过业绩目标。异常行为挖掘关注通过分析业务流程的事件日志来揭露异常执行的原因。本文提供了一个系统回顾和基于被称为序列分类的家庭数据异常挖掘方法的比较评价。使用从多个领域的现实生活中日志,我们评估了一系列的特征类型和分类方法,根据他们能准确地区分过程中正常和异常执行行为的能力。我们还采用不同的提取方法分析了一套规则的兴趣度。我们观察到,使用模式挖掘技术提取的特征集仅略优于基于追溯单个活动事件的简单特征集。


Modelling Cyber-Security Experts’ Decision Making Processes using Aggregation Operators

Simon Miller, Christian Wagner, Uwe Aickelin, Jonathan M. Garibaldi (Submitted on 30 Aug 2016) An important role carried out by cyber-security experts is the assessment of proposed computer systems, during their design stage. This task is fraught with difficulties and uncertainty, making the knowledge provided by human experts essential for successful assessment. Today, the increasing number of progressively complex systems has led to an urgent need to produce tools that support the expert-led process of system-security assessment. In this research, we use weighted averages (WAs) and ordered weighted averages (OWAs) with evolutionary algorithms (EAs) to create aggregation operators that model parts of the assessment process. We show how individual overall ratings for security components can be produced from ratings of their characteristics, and how these individual overall ratings can be aggregated to produce overall rankings of potential attacks on a system. As well as the identification of salient attacks and weak points in a prospective system, the proposed method also highlights which factors and security components contribute most to a component’s difficulty and attack ranking respectively. A real world scenario is used in which experts were asked to rank a set of technical attacks, and to answer a series of questions about the security components that are the subject of the attacks. The work shows how finding good aggregation operators, and identifying important components and factors of a cyber-security problem can be automated. The resulting operators have the potential for use as decision aids for systems designers and cyber-security experts, increasing the amount of assessment that can be achieved with the limited resources available.

使用聚合运算模型化网络安全专家的决策制定过程

由网络安全专家所扮演的一个重要角色是在他们的设计阶段,所提议的计算机系统的评估。这项任务充满了困难和不确定性,使由人类专家提供的知识成为成功评估的必需品。今天,越来越多日益复杂的系统已经迫切导致需要产生一个能够支持专家主导过程的系统安全评估工具。在本次研究中,我们采用加权平均(WAs)和有序加权平均(OWAS)以及进化算法(EAS)来创建模仿部分评估过程的聚合算法。我们展示了个人对安全组件的全面评级是如何从他们的特征评级来产生的,以及这些个人的整体评级是如何汇总,以产生潜在攻击系统的整体排名。以及在一个前瞻性的系统中显著攻击和弱点的识别,所提出的方法也突出了哪些因素和安全组件是对分别对组件难度和攻击排名贡献最大的。在真实世界的情况下专家被要求使用一组排序的技术攻击,并回答了一系列关于安全组件的问题,是攻击的主题。研究显示如何找到良好的聚合运算,并确定一个网络安全问题的重要组成部分和因素可以是自动化的。由此产生的算法有可能作为系统设计师和网络安全专家的决策辅助者,评估数量的增加可以通过有限的可用资源来实现。


Multi-Label Classification Method Based on Extreme Learning Machines

Rajasekar Venkatesan, Meng Joo Er (Submitted on 30 Aug 2016) In this paper, an Extreme Learning Machine (ELM) based technique for Multi-label classification problems is proposed and discussed. In multi-label classification, each of the input data samples belongs to one or more than one class labels. The traditional binary and multi-class classification problems are the subset of the multi-label problem with the number of labels corresponding to each sample limited to one. The proposed ELM based multi-label classification technique is evaluated with six different benchmark multi-label datasets from different domains such as multimedia, text and biology. A detailed comparison of the results is made by comparing the proposed method with the results from nine state of the arts techniques for five different evaluation metrics. The nine methods are chosen from different categories of multi-label methods. The comparative results shows that the proposed Extreme Learning Machine based multi-label classification technique is a better alternative than the existing state of the art methods for multi-label problems.

基于极限学习机器的多标签分类方法

在本文中,提出并讨论了一个基于多标签分类问题的技术的极限学习机器(ELM)。在多标签分类中,每个输入数据样本都属于一个或多个类标签。传统的二进制和多类分类问题是多标签问题的子集,对应于每一个样本的标签数量限制在一个。所提议的基于ELM的多标签分类技术从不同领域例如多媒体、文学和生物学的六个不同基准的多标签数据集进行评估。通过比较所提出的方法与九个国家的五个不同的评价指标得到了详细的比较结果。这九种方法是从不同类别的多标签方法中选择出来的。比较结果表明,所提出的基于极限学习机器的多标签分类技术对多标签问题来说是比已存在的最先进的艺术方法更好的替代品。


工具

Mastering Software Development in R

掌握使用R的软件开发

链接:Mastering Software Development in R

深入深度学习四步走(相关资源介绍)

《4 Steps for Learning Deep Learning - A handy list of resources to help you become a deep learning expert》by Vivek Kumar

链接:4 Steps for Learning Deep Learning

基于OCR任务的机器学习简单实践

《The Simple + Practical Path to Machine Learning Capability: A Common Benchmark Task》by Dan Kuster

链接:《The Simple + Practical Path to Machine Learning Capability: A Common Benchmark Task》

Phone

07318457661

Address

National University of Defense Tecnology
Changsha, Hunan 410073
China