A Novel Classification Method: Neighborhood-Based Positive Unlabeled Learning Using Decision Tree (NPULUD)

Cited by: 3
Authors
Ghasemkhani, Bita [1 ]
Balbal, Kadriye Filiz [2 ]
Birant, Kokten Ulas [3 ,4 ]
Birant, Derya [4 ]
Affiliations
[1] Dokuz Eylul Univ, Grad Sch Nat & Appl Sci, TR-35390 Izmir, Turkiye
[2] Dokuz Eylul Univ, Dept Comp Sci, TR-35390 Izmir, Turkiye
[3] Dokuz Eylul Univ, Informat Technol Res & Applicat Ctr DEBTAM, TR-35390 Izmir, Turkiye
[4] Dokuz Eylul Univ, Dept Comp Engn, TR-35390 Izmir, Turkiye
Keywords
artificial intelligence; machine learning; classification; positive unlabeled learning; decision tree; entropy measure; k-nearest neighbors; supervised learning; algorithm
DOI
10.3390/e26050403
CLC Classification Number
O4 [Physics];
Subject Classification Code
0702;
Abstract
In a standard binary supervised classification task, both negative and positive samples are required in the training dataset to construct a classification model. However, this condition is not met in certain applications where only one class of samples is obtainable. To overcome this problem, a different classification method, which learns from positive and unlabeled (PU) data, must be employed. In this study, a novel method is presented: neighborhood-based positive unlabeled learning using decision tree (NPULUD). NPULUD first uses a nearest-neighbor approach for the PU strategy and then employs a decision tree algorithm for the classification task, utilizing the entropy measure. Entropy plays a pivotal role in assessing the level of uncertainty in the training dataset as the decision tree is constructed for classification. We validated our method through experiments on 24 real-world datasets. The proposed method attained an average accuracy of 87.24%, while the traditional supervised learning approach obtained an average accuracy of 83.99% on the same datasets. Moreover, our method achieved a statistically significant improvement of 7.74%, on average, over its state-of-the-art peers.
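The two-stage idea summarized in the abstract can be illustrated with a minimal sketch. This is a hypothetical toy implementation, not the authors' NPULUD code: here, unlabeled points are pseudo-labeled by the average distance to their k nearest positive examples (points below the median distance are treated as positive, the rest as reliable negatives), and the entropy function is the measure a subsequent decision tree would use for split selection. The function names, the median cutoff, and the toy data are all illustrative assumptions.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a collection of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def knn_pseudo_label(positives, unlabeled, k=3):
    """Pseudo-label unlabeled points from positive examples only.

    Hypothetical heuristic (the paper's exact rule may differ): a point
    whose mean distance to its k nearest positives is at most the lower
    median of all such scores is labeled positive (1); the rest are
    treated as reliable negatives (0).
    """
    def mean_knn_dist(x):
        dists = sorted(math.dist(x, p) for p in positives)
        return sum(dists[:k]) / k

    scores = [mean_knn_dist(x) for x in unlabeled]
    cutoff = sorted(scores)[(len(scores) - 1) // 2]  # lower median
    return [1 if s <= cutoff else 0 for s in scores]

# Toy 2-D data: positive examples cluster near the origin.
positives = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2)]
unlabeled = [(0.1, 0.1), (5.0, 5.0), (0.0, 0.2), (6.0, 4.0)]

pseudo = knn_pseudo_label(positives, unlabeled, k=2)
print(pseudo)           # points near the positive cluster receive label 1
print(entropy(pseudo))  # 1.0 for a 50/50 label split
```

After this labeling step, a standard entropy-based decision tree would be trained on the now fully labeled data, which is the classification stage the abstract describes.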
Pages: 21