FsNet: Feature Selection Network on High-dimensional Biological Data

Cited by: 14
Authors
Singh, Dinesh [1]
Climente-Gonzalez, Hector [2]
Petrovich, Mathis [3]
Kawakami, Eiryo [4]
Yamada, Makoto [2, 5, 6]
Affiliations
[1] Indian Inst Technol Mandi, Mandi, Himachal Prades, India
[2] RIKEN AIP, Tokyo, Japan
[3] Ecole Ponts ParisTech, Paris, France
[4] RIKEN Japan, Tokyo, Japan
[5] Kyoto Univ, Kyoto, Japan
[6] OIST Japan, Okinawa, Japan
Source
2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN | 2023
Keywords
Feature Selection; Deep Neural Network; High-dimensional Data;
DOI
10.1109/IJCNN54540.2023.10191985
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Biological data, including gene expression data, are generally high-dimensional and require efficient, generalizable, and scalable machine-learning methods to discover complex nonlinear patterns. Recent advances in machine learning can be attributed to deep neural networks (DNNs), which perform various tasks in terms of computer vision and natural language processing. However, standard DNNs are inappropriate for high-dimensional datasets generated in biology because they consider numerous parameters, which in turn require numerous samples. In this paper, we propose a DNN-based, nonlinear feature selection method, called the feature selection network (FsNet), for high-dimensional and small sample data. Specifically, FsNet comprises a selection layer that selects features and a reconstruction layer that stabilizes the training. Because a large number of parameters in the selection and reconstruction layers can easily result in overfitting under a limited number of samples, we utilized two tiny networks to predict the large virtual weight matrices of the selection and reconstruction layers. Experimental results on several real-world high-dimensional biological datasets demonstrate the efficacy of the proposed method.
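The abstract does not give the layer equations, so the following is only a minimal NumPy sketch of the idea that a tiny network can predict a large virtual weight matrix. It rests on several assumptions not stated above: each feature is described by a histogram embedding, the tiny network is a single linear map followed by `tanh`, and selection is a temperature-scaled softmax over features. All names (`feature_embeddings`, `predict_virtual_weights`, `soft_select`) and sizes are hypothetical, and the reconstruction layer, which could be parameterized in the same way, is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes only (not taken from the paper).
n, d, k, bins = 20, 2000, 8, 16   # samples, features, features to select, embedding size


def feature_embeddings(X, bins):
    """Assumed embedding: a normalized histogram of each feature's values across samples."""
    edges = np.linspace(X.min(), X.max(), bins + 1)
    counts = np.stack([np.histogram(X[:, j], bins=edges)[0] for j in range(X.shape[1])])
    return counts / X.shape[0]          # shape (d, bins)


def predict_virtual_weights(E, W, b):
    """Tiny network: maps each feature's embedding to its row of the (d, k) weight matrix."""
    return np.tanh(E @ W + b)


def soft_select(X, logits, tau):
    """Softmax over features (axis 0), so each output unit is a convex mix of inputs."""
    z = logits / tau
    z = z - z.max(axis=0, keepdims=True)   # numerical stability
    S = np.exp(z)
    S /= S.sum(axis=0, keepdims=True)
    return X @ S, S


X = rng.normal(size=(n, d))
E = feature_embeddings(X, bins)

# Trainable parameters of the tiny network: bins * k + k, independent of d.
W = rng.normal(scale=0.1, size=(bins, k))
b = np.zeros(k)

virtual = predict_virtual_weights(E, W, b)   # (d, k), never stored as free parameters
Z, S = soft_select(X, virtual, tau=0.1)      # (n, k) softly selected features
chosen = virtual.argmax(axis=0)              # hard selection: one feature index per unit

tiny_params = W.size + b.size
dense_params = d * k
print(f"tiny network: {tiny_params} parameters; dense selection layer: {dense_params}")
```

In this sketch the number of trainable parameters depends on the embedding size and on `k`, not on the number of input features `d`, which is the property the abstract relies on to limit overfitting when samples are scarce.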
Pages: 9
Related papers
29 in total
[1]  
[Anonymous], 2013, ICASSP
[2]  
Balasubramanian K., 2013, AISTATS
[3]  
Balin M. F., 2019, ICML
[4]  
Chen FH, 2007, PRINCIPLES OF TISSUE ENGINEERING, 3RD EDITION, P823, DOI 10.1016/B978-012370615-7/50059-7
[5]  
Chen WL, 2015, PR MACH LEARN RES, V37, P2285
[6]   Block HSIC Lasso: model-free biomarker detection for ultra-high dimensional data [J].
Climente-Gonzalez, Hector ;
Azencott, Chloe-Agathe ;
Kaski, Samuel ;
Yamada, Makoto .
BIOINFORMATICS, 2019, 35 (14) :I427-I435
[7]   Sure independence screening for ultrahigh dimensional feature space [J].
Fan, Jianqing ;
Lv, Jinchi .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 2008, 70 :849-883
[8]  
Guyon I., 2003, J MACH LEARN RES, V3, P1157, DOI 10.1162/153244303322753616
[9]  
Li Yixue, 2014, Genomics Proteomics & Bioinformatics, V12, P187, DOI 10.1016/j.gpb.2014.10.001
[10]  
Liao S., 2019, IJCAI