Optimal Boolean lattice-based algorithms for the U-curve optimization problem

被引:5
作者
Reis, Marcelo S. [1 ,2 ]
Estrela, Gustavo [1 ,2 ,3 ]
Ferreira, Carlos Eduardo [3 ]
Barrera, Junior [2 ,3 ]
机构
[1] Inst Butantan, Lab Especial Ciclo Celular, Sao Paulo, Brazil
[2] Inst Butantan, Ctr Toxins Immune Response & Cell Signaling CeTIC, Sao Paulo, Brazil
[3] Univ Sao Paulo, Inst Matemat & Estat, Sao Paulo, Brazil
基金
巴西圣保罗研究基金会;
关键词
Boolean lattice; Branch and bound; Combinatorial optimization; Feature selection; Machine learning; U-curve problem; STATISTICAL PATTERN-RECOGNITION; FLOATING SEARCH METHODS; DESIGN;
D O I
10.1016/j.ins.2018.08.060
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
The U-curve optimization problem is characterized by a decomposable in U-shaped curves cost function over the chains of a Boolean lattice. This problem can be applied to model the classical feature selection problem in Machine Learning. In this paper, we point out that the firstly proposed algorithm to tackle the U-curve problem, the RBM algorithm, is in fact suboptimal. We also present two new algorithms: UCS, which is actually optimal to tackle this problem; and UCSR, a variation of UCS that solves a special case of the U-curve problem and relies on a reduced, ordered binary decision diagram to control the search space. We provide results of two computational assays with these new algorithms: first, W-operator design for filtering of binary images; second, linear SVM design for classification of data sets from the UCI Machine Learning Repository. We show that, in these assays, UCS and UCSR outperformed an exhaustive search and also three widely used heuristics: the SFFS sequential selection, the BFS graph-based search, and the CHCGA genetic algorithm. Finally, we analyze the obtained results and point out improvements that might enhance the performance of these two novel algorithms. (C) 2018 Elsevier Inc. All rights reserved.
引用
收藏
页码:97 / 114
页数:18
相关论文
共 42 条
[1]   COMPARATIVE-ANALYSIS OF STATISTICAL PATTERN-RECOGNITION METHODS IN HIGH-DIMENSIONAL SETTINGS [J].
AEBERHARD, S ;
COOMANS, D ;
DEVEL, O .
PATTERN RECOGNITION, 1994, 27 (08) :1065-1077
[2]  
[Anonymous], 2009, ACM SIGKDD explorations newsletter, DOI 10.1145/1656274.1656278
[3]  
Barrera J., 2000, Fundamenta Informaticae, V41, P229
[4]   Constructing probabilistic genetic networks of Plasmodium falciparum from dynamical expression signals of the intraerythrocytic development cycle [J].
Barrera, Junior ;
Cesar, Roberto M., Jr. ;
Martins, David C., Jr. ;
Vencio, Ricardo Z. N. ;
Merino, Emilio F. ;
Yamamoto, Marcio M. ;
Leonardi, Florencia G. ;
Pereira, Carlos A. de B. ;
Del Portillo, Hernando A. .
METHODS OF MICROARRAY DATA ANALYSIS V, 2007, :11-+
[5]  
Boser B. E., 1992, Proceedings of the Fifth Annual ACM Workshop on Computational Learning Theory, P144, DOI 10.1145/130385.130401
[6]  
BRYANT RE, 1986, IEEE T COMPUT, V35, P677, DOI 10.1109/TC.1986.1676819
[7]   A survey on feature selection methods [J].
Chandrashekar, Girish ;
Sahin, Ferat .
COMPUTERS & ELECTRICAL ENGINEERING, 2014, 40 (01) :16-28
[8]   LIBSVM: A Library for Support Vector Machines [J].
Chang, Chih-Chung ;
Lin, Chih-Jen .
ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2011, 2 (03)
[9]   The Opportunity challenge: A benchmark database for on-body sensor-based activity recognition [J].
Chavarriaga, Ricardo ;
Sagha, Hesam ;
Calatroni, Alberto ;
Digumarti, Sundara Tejaswi ;
Troester, Gerhard ;
Millan, Jose del R. ;
Roggen, Daniel .
PATTERN RECOGNITION LETTERS, 2013, 34 (15) :2033-2042
[10]  
Dua D., 2017, Uci machine learning repository