Global and Local (Glocal) Bagging Approach for Classifying Noisy Dataset
Received: October 15, 2008. Revised: December 20, 2008.
Peng Zhang, Zhiwang Zhang, Aihua Li, Yong Shi. Global and Local (Glocal) Bagging Approach for Classifying Noisy Dataset. International Journal of Software and Informatics, 2008, 2(2): 181–197.
Peng Zhang, Zhiwang Zhang, Aihua Li, Yong Shi
Fund: This work is supported by grants from the National Natural Science Foundation of China (#70621001, #70531040, #70501030, #10601064, #70472074), the National Natural Science Foundation of Beijing (#9073020), 973 Project #2004CB720103, and the Ministry of Science and Technol
Abstract: Learning from noisy data is a challenging task for data mining research. In this paper, we argue that for noisy data both the global bagging strategy and the local bagging strategy suffer from their own inherent disadvantages and thus cannot form accurate prediction models. Consequently, we present a Global and Local Bagging (called Glocal Bagging: GB) approach to tackle this problem. GB assigns weight values to the base classifiers under the consideration that: (1) for each test instance Ix, GB prefers bags close to Ix, which is the nature of the local learning strategy; (2) for base classifiers, GB assigns larger weight values to the ones with higher accuracy on the out-of-bag samples, which is the nature of the global learning strategy. Combining (1) and (2), GB assigns large weight values to the classifiers which are close to the current test instance Ix and have high out-of-bag accuracy. The diversity/accuracy analysis on synthetic datasets shows that GB improves the classifier ensemble's performance by increasing its base classifiers' accuracy. Moreover, the bias/variance analysis also shows that GB's accuracy improvement mainly comes from the reduction of the bias error. Experimental results on 25 UCI benchmark datasets show that when the datasets are noisy, GB is superior to other previously proposed bagging methods such as classical bagging, bragging, nice bagging, trimmed bagging and lazy bagging.
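The weighting idea described in the abstract can be sketched as follows. This is a hypothetical illustration, not the paper's exact formula: the inverse-distance local term, the use of bag centroids, and the multiplicative combination of the local and global terms are all assumptions made for the sketch.

```python
import numpy as np

def glocal_weights(test_x, bag_centroids, oob_accuracies):
    """Weight each base classifier by bag proximity (local term)
    times its out-of-bag accuracy (global term).

    Assumptions (not from the paper): each bootstrap bag is summarized
    by its centroid, and the two terms are combined by a product.
    """
    test_x = np.asarray(test_x, dtype=float)
    centroids = np.asarray(bag_centroids, dtype=float)
    # Local term: bags closer to the test instance get larger weights.
    dists = np.linalg.norm(centroids - test_x, axis=1)
    local = 1.0 / (1.0 + dists)
    # Global term: out-of-bag accuracy of each base classifier.
    glob = np.asarray(oob_accuracies, dtype=float)
    w = local * glob
    return w / w.sum()  # normalize to a voting distribution

def weighted_vote(predictions, weights):
    """Combine base-classifier class predictions by weighted majority vote."""
    predictions = np.asarray(predictions)
    classes = np.unique(predictions)
    scores = [weights[predictions == c].sum() for c in classes]
    return classes[int(np.argmax(scores))]

# Usage: a near, accurate bag outweighs a far, weaker one.
w = glocal_weights([0.0, 0.0],
                   bag_centroids=[[0.1, 0.0], [5.0, 5.0]],
                   oob_accuracies=[0.9, 0.6])
label = weighted_vote([1, 0], w)  # the close, accurate classifier wins
```

Here the first classifier's bag is both closer to the test instance and more accurate out-of-bag, so its vote dominates the ensemble decision.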
Keywords: bagging, ensemble learning, sampling




© Copyright by Institute of Software, the Chinese Academy of Sciences
