Biologically weighted LASSO: enhancing functional interpretability in gene expression data analysis

Cited by: 2
Authors
Mongardi, Sofia [1]
Cascianelli, Silvia [1]
Masseroli, Marco [1]
Affiliations
[1] Politecn Milan, Dipartimento Elettron Informaz & Bioingn DEIB, Via Ponzio 34-5, I-20133 Milan, Italy
Keywords
SELECTION; KNOWLEDGE; ONTOLOGY; TOOL
DOI
10.1093/bioinformatics/btae605
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Motivation: Feature selection approaches are widely used in gene expression data analysis to identify the most relevant features and boost performance in regression and classification tasks. However, such algorithms consider only each feature's quantitative contribution to the task, possibly limiting the biological interpretability of the results. Feature-related prior knowledge, such as functional annotations and pathway information, can be incorporated into feature selection algorithms to potentially improve model performance and interpretability.

Results: We propose an embedded integrative approach to feature selection that combines weighted LASSO feature selection and prior biological knowledge in a single step, by means of a novel score of biological relevance that summarizes information extracted from popular biological knowledge bases. Findings from the performed experiments indicate that our proposed approach is able to identify the most predictive genes while simultaneously enhancing the biological interpretability of the results compared to the standard LASSO regularized model.
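
To make the idea of a weighted LASSO concrete, the following is a minimal sketch, not the authors' implementation. It minimizes (1/2n)·||y − Xβ||² + λ·Σ w_j·|β_j| by coordinate descent, so a gene with a smaller penalty weight w_j is easier to select. The abstract does not define the biological relevance score, so the scores in [0, 1], the helper `relevance_to_penalty`, its `floor` parameter, and the solver itself are all assumptions made for illustration.

```python
import numpy as np


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the LASSO proximal step)."""
    return np.sign(value) * max(abs(value) - threshold, 0.0)


def relevance_to_penalty(scores, floor=0.1):
    """Hypothetical mapping from relevance scores in [0, 1] to penalty weights.

    A score of 1 gives the smallest weight (floor); a score of 0 gives weight 1.
    This is an illustrative choice, not the score defined in the paper.
    """
    scores = np.clip(np.asarray(scores, dtype=float), 0.0, 1.0)
    return 1.0 - (1.0 - floor) * scores


def weighted_lasso(X, y, penalty_weights, lam, n_iter=200):
    """Coordinate descent for (1/2n)||y - X b||^2 + lam * sum_j w_j |b_j|."""
    X = np.asarray(X, dtype=float)
    n_samples, n_features = X.shape
    beta = np.zeros(n_features)
    residual = np.asarray(y, dtype=float).copy()  # equals y - X @ beta at beta = 0
    col_norms = (X ** 2).sum(axis=0) / n_samples
    for _ in range(n_iter):
        for j in range(n_features):
            if col_norms[j] == 0.0:
                continue  # a constant-zero column carries no information
            residual += X[:, j] * beta[j]          # remove feature j's contribution
            rho = X[:, j] @ residual / n_samples   # partial correlation with residual
            beta[j] = soft_threshold(rho, lam * penalty_weights[j]) / col_norms[j]
            residual -= X[:, j] * beta[j]          # add the updated contribution back
    return beta


# Toy usage on synthetic data: six "genes", two of which drive the response.
rng = np.random.default_rng(0)
X_demo = rng.standard_normal((100, 6))
X_demo = (X_demo - X_demo.mean(axis=0)) / X_demo.std(axis=0)
y_demo = 2.0 * X_demo[:, 0] + 1.5 * X_demo[:, 1] + 0.1 * rng.standard_normal(100)
y_demo = y_demo - y_demo.mean()

demo_scores = np.array([0.9, 0.0, 0.9, 0.0, 0.5, 0.5])  # made-up relevance scores
demo_weights = relevance_to_penalty(demo_scores)
demo_beta = weighted_lasso(X_demo, y_demo, demo_weights, lam=0.2)
selected_genes = np.flatnonzero(demo_beta)
```

Setting every weight to 1 recovers the standard LASSO, which is the baseline the abstract compares against.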
Pages: 8