A Novel Ensemble Method for Imbalanced Data Learning: Bagging of Extrapolation-SMOTE SVM

One-class classification (OCC) is a special case in machine learning in which only a subset of positive samples, together with unlabeled samples, is available for learning. Because the training data share these characteristics, this paper treats positive and unlabeled (PU) learning problems as equivalent to OCC problems. Related concepts such as anomaly detection, novelty detection, and concept learning are also closely connected to OCC in specific tasks. OCC differs significantly from the traditional supervised learning paradigm: it neglects the prior distribution of the classes and does not depend on negative examples. According to the learning objective, two kinds of OCC are distinguished, namely inductive OCC and transductive OCC. Inductive OCC aims at good performance on both the training dataset and unseen data, whereas transductive OCC focuses only on identifying the unlabeled samples in the training dataset. Since semi-supervised learning [5] also exploits unlabeled samples, OCC can be viewed as an extreme case of semi-supervised learning in which negative samples are absent.
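
The distinction between transductive and inductive OCC can be made concrete with a small sketch. The code below is only an illustration under stated assumptions, not the method of this paper: it swaps in a simple centroid-and-radius one-class model, and the names `fit_one_class`, `predict`, and `sample`, the 0.95 quantile, and the synthetic Gaussian data are all hypothetical choices. The model is fitted on labeled positives only, then applied once to the unlabeled training pool (transductive use) and once to unseen points (inductive use).

```python
import math
import random


def fit_one_class(positives, quantile=0.95):
    """Fit a centroid and a distance threshold from labeled positives only."""
    dim = len(positives[0])
    centroid = [sum(p[i] for p in positives) / len(positives) for i in range(dim)]
    dists = sorted(math.dist(p, centroid) for p in positives)
    # Radius: the distance that covers roughly `quantile` of the positives.
    idx = min(len(dists) - 1, int(quantile * len(dists)))
    return centroid, dists[idx]


def predict(model, points):
    """Return True for points inside the learned radius (predicted positive)."""
    centroid, radius = model
    return [math.dist(p, centroid) <= radius for p in points]


rng = random.Random(0)


def sample(center, n):
    """Draw n synthetic 2-D points around `center` (assumed toy data)."""
    return [(rng.gauss(center[0], 0.5), rng.gauss(center[1], 0.5)) for _ in range(n)]


# Training data: a few labeled positives plus an unlabeled pool that mixes
# hidden positives (around the origin) with hidden negatives (around (4, 4)).
labeled_pos = sample((0.0, 0.0), 50)
unlabeled = sample((0.0, 0.0), 30) + sample((4.0, 4.0), 30)

# No negative labels are used at any point during fitting.
model = fit_one_class(labeled_pos)

# Transductive use: label only the unlabeled samples of the training set.
transductive_labels = predict(model, unlabeled)

# Inductive use: apply the same model to points never seen during training.
unseen = [(0.1, -0.2), (4.2, 3.9)]
inductive_labels = predict(model, unseen)
```

In the transductive setting only `transductive_labels` matters, whereas the inductive setting also requires the fitted model to generalize to points such as those in `unseen`.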




National University of Defense Technology
Changsha, Hunan 410073