Aggressive Sampling for Multi-class to Binary Reduction with Applications to Text Classification

Authors
Joshi, Bikash [1]
Amini, Massih-Reza [1]
Partalas, Ioannis [2]
Iutzeler, Franck [3]
Maximov, Yury [4, 5]
Affiliations
[1] Univ Grenoble Alps, LIG, Grenoble, France
[2] Expedia EWE, Geneva, Switzerland
[3] Univ Grenoble Alps, LJK, Grenoble, France
[4] Los Alamos Natl Lab, Los Alamos, NM USA
[5] Skolkovo IST, Moscow, Russia
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We address the problem of multi-class classification in the case where the number of classes is very large. We propose a double sampling strategy on top of a multi-class to binary reduction strategy, which transforms the original multi-class problem into a binary classification problem over pairs of examples. The aim of the sampling strategy is to overcome the curse of long-tailed class distributions exhibited in the majority of large-scale multi-class classification problems and to reduce the number of pairs of examples in the expanded data. We show that this strategy does not alter the consistency of the empirical risk minimization principle defined over the double sample reduction. Experiments are carried out on DMOZ and Wikipedia collections with 10,000 to 100,000 classes, where we show the efficiency of the proposed approach in terms of training and prediction time, memory consumption, and predictive performance with respect to state-of-the-art approaches.
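The double sampling idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `double_sample_pairs`, the per-class keep probabilities `keep_prob`, the parameter `n_adversarial`, and the ±1 labeling of ordered class pairs are all assumptions made for illustration. The sketch first subsamples examples with a class-dependent probability (to counter a long-tailed class distribution), then pairs each kept example with only a few randomly drawn wrong classes (to limit the number of pairs).

```python
import random


def double_sample_pairs(examples, labels, classes, keep_prob, n_adversarial, seed=0):
    """Hypothetical sketch of a double-sampled multi-class to binary reduction.

    Returns tuples (x, first_class, second_class, target), where target is +1
    if first_class is the true class of x and -1 if second_class is.
    """
    rng = random.Random(seed)
    pairs = []
    for x, y in zip(examples, labels):
        # Sampling step 1 (assumed form): keep the example with a
        # class-dependent probability, so frequent classes can be down-sampled.
        if rng.random() >= keep_prob[y]:
            continue
        # Sampling step 2 (assumed form): draw a few wrong classes instead of
        # pairing the true class with every other class.
        others = [c for c in classes if c != y]
        k = min(n_adversarial, len(others))
        for c in rng.sample(others, k):
            # Randomize the order so the binary problem sees both targets.
            if rng.random() < 0.5:
                pairs.append((x, y, c, +1))
            else:
                pairs.append((x, c, y, -1))
    return pairs


if __name__ == "__main__":
    docs = ["doc_a", "doc_b", "doc_c", "doc_d"]
    doc_labels = [0, 1, 2, 0]
    probs = {0: 0.5, 1: 1.0, 2: 1.0}  # down-sample the frequent class 0
    for pair in double_sample_pairs(docs, doc_labels, [0, 1, 2], probs, 1):
        print(pair)
```

Under these assumptions, a full expansion of n examples over K classes would produce n × (K − 1) pairs, whereas the sketch produces at most n × `n_adversarial` pairs.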
Pages: 10
Related papers
50 in total
  • [1] Binary and multi-class classification of Android applications using static features
    Dhalaria, Meghna
    Gandotra, Ekta
    INTERNATIONAL JOURNAL OF APPLIED MANAGEMENT SCIENCE, 2023, 15 (02) : 117 - 140
  • [2] Reduction Stumps for Multi-class Classification
    Mohr, Felix
    Wever, Marcel
    Huellermeier, Eyke
    ADVANCES IN INTELLIGENT DATA ANALYSIS XVII, IDA 2018, 2018, 11191 : 225 - 237
  • [3] Binary classification trees for multi-class classification problems
    Lee, JS
    Oh, LS
    SEVENTH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION, VOLS I AND II, PROCEEDINGS, 2003 : 770 - 774
  • [4] Boosting with Adaptive Sampling for Multi-class Classification
    Chen, Jianhua
    2015 IEEE 14TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA), 2015 : 667 - 672
  • [5] Binary and Multi-Class Malware Threads Classification
    Ahmed, Ismail Taha
    Jamil, Norziana
    Din, Marina Md.
    Hammad, Baraa Tareq
    APPLIED SCIENCES-BASEL, 2022, 12 (24):
  • [6] Boost Multi-class sLDA Model for Text Classification
    Jankowski, Maciej
    ARTIFICIAL INTELLIGENCE AND SOFT COMPUTING, ICAISC 2018, PT I, 2018, 10841 : 633 - 644
  • [7] Binary Stochastic Representations for Large Multi-class Classification
    Gerald, Thomas
    Baskiotis, Nicolas
    Denoyer, Ludovic
    NEURAL INFORMATION PROCESSING, ICONIP 2017, PT I, 2017, 10634 : 155 - 165
  • [8] Enhancing directed binary trees for multi-class classification
    Montanes, Elena
    Barranquero, Jose
    Diez, Jorge
    Jose del Coz, Juan
    INFORMATION SCIENCES, 2013, 223 : 42 - 55
  • [9] MULTI-CLASS LEAST SQUARES CLASSIFICATION AT BINARY-CLASSIFICATION COMPLEXITY
    Noumir, Zineb
    Honeine, Paul
    Richard, Cedric
    2011 IEEE STATISTICAL SIGNAL PROCESSING WORKSHOP (SSP), 2011 : 277 - 280
  • [10] Study on Multi-class Text Classification Based on Improved SVM
    Li, Qiong
    Chen, Li
    PRACTICAL APPLICATIONS OF INTELLIGENT SYSTEMS, ISKE 2013, 2014, 279 : 519 - 526