Adaptive active learning through k-nearest neighbor optimized local density clustering

Cited by: 2
Authors
Ji, Xia [1]
Ye, WanLi [1]
Li, XueJun [1]
Zhao, Peng [1]
Yao, Sheng [1]
Affiliations
[1] Anhui University, School of Computer Science and Technology, Hefei, People's Republic of China
Funding
Natural Science Foundation of Anhui Province
Keywords
Active learning; K-nearest neighbor; Density peak clustering; Adaptive instance selection; Classification
DOI
10.1007/s10489-022-04169-w
CLC number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Active learning iteratively constructs a refined training set so that an effective classifier can be trained with as few labeled instances as possible; in domains where labeling is expensive, it plays an important and irreplaceable role. The main challenge of active learning is to correctly identify the critical samples. One mainstream approach mines the latent structure of the data through clustering and then identifies key instances within the resulting clusters. However, existing methods all adopt deterministic selection strategies: the number of key samples depends only on the number of unlabeled samples to be classified, and the internal structure of the clusters is ignored. Our analysis and experiments show that such deterministic strategies waste a substantial number of labels, a serious problem that urgently needs to be addressed in active learning. To this end, we propose an adaptive active learning algorithm based on density clustering (AAKC). First, we introduce k-nearest neighbor information to redefine the local density of an instance, so that the new density clearly expresses the local structure around each sample. Second, we develop an adaptive key instance selection strategy based on this k-nearest neighbor density, which queries only as many instances as the structure of each cluster requires, thereby avoiding label waste. Experimental comparisons with other algorithms show that AAKC achieves better classification accuracy with fewer labels and exhibits excellent stability.
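For illustration only: the abstract does not give AAKC's exact density formula, so the minimal sketch below assumes one common k-nearest-neighbor local density, the inverse of an instance's mean distance to its k nearest neighbors. Instances in dense cluster cores score highest and become natural query candidates for a clustering-based selector; the function name knn_local_density and the choice of k are hypothetical.

import numpy as np
from sklearn.neighbors import NearestNeighbors

def knn_local_density(X, k=5):
    # Density of each instance as the inverse of its mean distance to its
    # k nearest neighbors: large values mark cluster cores. This formula
    # is an illustrative assumption, not the paper's exact definition.
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)  # +1: each point is its own nearest neighbor
    dists, _ = nn.kneighbors(X)
    mean_dist = dists[:, 1:].mean(axis=1)            # drop the zero self-distance
    return 1.0 / (mean_dist + 1e-12)                 # guard against division by zero

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])
    rho = knn_local_density(X, k=5)
    print("densest instances:", np.argsort(rho)[-3:])  # indices of cluster-core points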
Pages: 14892-14902
Number of pages: 11