Research of Imbalanced Classification Based on Cascade Forest

Citations: 0
Authors
Shi, Minghua [1]
Lin, Fangxin [1]
Qian, Ying [1]
Dou, Liang [1]
Affiliation
[1] East China Normal Univ, Sch Comp Sci & Technol, Shanghai, Peoples R China
Source
PROCEEDINGS OF THE 2021 IEEE INTERNATIONAL CONFERENCE ON PROGRESS IN INFORMATICS AND COMPUTING (PIC) | 2021
Funding
National Natural Science Foundation of China;
Keywords
Cascade Forest; Deep Forest; Imbalanced Data; Binary Classification; Under Sampling;
DOI
10.1109/PIC53636.2021.9687091
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
With the rapid development of science, the quantity of data is increasing exponentially, and machine learning and data mining offer unprecedented opportunities to exploit it. Data classification is commonly used as a primary data processing method, but the diversity of data poses a great challenge. Among these challenges, problems caused by class imbalance are attracting growing attention, and a number of strategies and improvements to existing algorithms have been proposed. gcForest is an ensemble learning algorithm proposed by Professor Zhou Zhihua in 2017. Its advantages are few hyperparameters, suitability for small-scale data sets, and strong model expressiveness. However, the algorithm is not optimized for imbalanced data classification. Inspired by improvements made to other ensemble learning algorithms for imbalanced data classification, this paper applies a variety of under-sampling strategies to the cascade forest of gcForest. In experimental comparisons on a variety of typical imbalanced data sets, the resulting method achieves performance better than or similar to that of current advanced learning algorithms for imbalanced data.
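The abstract does not spell out which under-sampling strategies are used or how they are wired into the cascade, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes plain random under-sampling, a fixed number of cascade levels, two forests per level, and in-sample class probabilities as the augmented features (the original gcForest instead uses cross-validated class vectors and grows the cascade adaptively). The names `random_undersample`, `fit_cascade`, and `predict_cascade` are hypothetical helpers built on NumPy and scikit-learn.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split


def random_undersample(X, y, rng):
    """Draw, without replacement, as many samples from each class as the rarest class has."""
    classes, counts = np.unique(y, return_counts=True)
    n_keep = counts.min()
    idx = np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=n_keep, replace=False)
        for c in classes
    ])
    rng.shuffle(idx)
    return X[idx], y[idx]


def fit_cascade(X, y, n_levels=2, seed=0):
    """Fit a fixed-depth cascade; every forest is trained on its own balanced subsample."""
    rng = np.random.default_rng(seed)
    levels, features = [], X
    for level in range(n_levels):
        forests = []
        for j, Forest in enumerate((RandomForestClassifier, ExtraTreesClassifier)):
            Xb, yb = random_undersample(features, y, rng)
            model = Forest(n_estimators=50, random_state=seed + 2 * level + j)
            forests.append(model.fit(Xb, yb))
        levels.append(forests)
        # Simplification: in-sample probabilities, not cross-validated class vectors.
        probas = [f.predict_proba(features) for f in forests]
        features = np.hstack([X] + probas)  # raw features + class vectors
    return levels


def predict_cascade(levels, X):
    """Pass X through every level and average the last level's class probabilities."""
    features = X
    for forests in levels:
        probas = [f.predict_proba(features) for f in forests]
        features = np.hstack([X] + probas)
    avg = np.mean(probas, axis=0)
    return levels[-1][0].classes_[avg.argmax(axis=1)]


# Synthetic imbalanced binary problem (roughly 9:1), used only for illustration.
X, y = make_classification(n_samples=1000, n_features=20,
                           weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
cascade = fit_cascade(X_tr, y_tr)
print("minority-class recall:", recall_score(y_te, predict_cascade(cascade, X_te)))
```

Resampling separately for each forest means the forests within a level see different subsets of the majority class, which is one plausible way to offset the information discarded by under-sampling; other strategies could be swapped into `random_undersample` without changing the rest of the sketch.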
Pages: 29-33
Number of pages: 5