Multi-label Text Categorization using Error-correcting Output Coding with Weighted Probability

Cited by: 2
Authors
Balamurugan, V. [1]
Vedanarayanan, V. [1]
Nisha, A. Sahaya Anselin [1]
Narmadha, R. [1]
Amirthalakshmi, T. M. [2]
Affiliations
[1] Sathyabama Inst Sci & Technol, Dept ECE, Oldmamallapuram Rd, Chennai, Tamil Nadu, India
[2] SRM Inst Technol, Dept Elect & Commun Engn, Chennai, Tamil Nadu, India
Source
INTERNATIONAL JOURNAL OF ENGINEERING | 2022 / Vol. 35 / No. 08
Keywords
Text Categorization; Multi-label Classification; Multi-label Text Categorization; Error Correcting Output Coding; Posterior Probability; CLASSIFICATION;
DOI
10.5829/ije.2022.35.08b.08
Chinese Library Classification (CLC) number
T [Industrial Technology];
Subject classification code
08;
Abstract
In many real-world categorization problems, labeled data are hard to acquire even though unlabeled data are plentiful. It is therefore important to devise novel approaches that select the most valuable instances for labeling and yield a better classifier. Many existing techniques target binary categorization, but only a limited number of algorithms are designed to handle multi-label cases. Multi-label classification is more complex because a sample can belong to several labels from the set of available classes. Text data on the World Wide Web is an obvious example of this type of task. This paper develops a novel technique for multi-label text categorization by modifying the Error-Correcting Output Coding (ECOC) approach. A cluster of binary complementary classifiers is employed to make ECOC more effective for multi-class problems. In addition, a weighted posterior probability is computed to improve multi-label text classification performance. The proposed ECOC with weighted probability is evaluated using precision, recall, and F-measure, reaching a maximum precision of 0.897, a maximum recall of 0.896, and a maximum F-measure of 0.895.
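The abstract does not give the decoding rule, so the following is only a minimal sketch of how ECOC decoding with a weighted posterior probability could look, not the authors' method. It assumes a binary code matrix with one codeword per class, one posterior estimate per binary classifier (the probability that its bit equals 1), and one reliability weight per classifier. The function names, the weighting scheme, and the threshold are all hypothetical.

```python
def weighted_ecoc_scores(code_matrix, bit_probs, weights):
    """Score each class by the weighted agreement between its codeword
    and the posterior probabilities of the binary classifiers.

    code_matrix: list of codewords, one per class, entries 0 or 1 (assumed layout)
    bit_probs:   bit_probs[j] = estimated P(bit j == 1 | document)
    weights:     weights[j] = reliability of classifier j (hypothetical choice)
    """
    total_weight = sum(weights)
    scores = []
    for codeword in code_matrix:
        # If the codeword expects bit 1, use p; if it expects 0, use 1 - p.
        agreement = sum(
            w * (p if bit == 1 else 1.0 - p)
            for bit, p, w in zip(codeword, bit_probs, weights)
        )
        scores.append(agreement / total_weight)
    return scores


def predict_labels(code_matrix, bit_probs, weights, threshold=0.5):
    """Multi-label decision: keep every class whose score reaches the threshold."""
    scores = weighted_ecoc_scores(code_matrix, bit_probs, weights)
    return [label for label, score in enumerate(scores) if score >= threshold]


if __name__ == "__main__":
    # Three classes encoded with three binary classifiers (toy values).
    code_matrix = [[1, 0, 1], [0, 1, 1], [1, 1, 0]]
    bit_probs = [0.9, 0.2, 0.8]
    weights = [1.0, 1.0, 2.0]
    print(weighted_ecoc_scores(code_matrix, bit_probs, weights))
    print(predict_labels(code_matrix, bit_probs, weights, threshold=0.45))
```

Because several classes can clear the threshold at once, the same decoding step can return more than one label per document, which is what distinguishes this multi-label use from the usual single nearest-codeword decision in ECOC.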
Pages: 1516-1523
Page count: 8