Bayesian nonnegative matrix factorization in an incremental manner for data representation

Cited by: 1
Authors
Yang, Lijun [1]
Yan, Lulu [2]
Yang, Xiaohui [1]
Xin, Xin [1]
Xue, Liugen [1]
Affiliations
[1] Henan Univ, Henan Engn Res Ctr Artificial Intelligence Theory, Kaifeng 475000, Peoples R China
[2] Sun Yat Sen Univ, Sch Math, Guangzhou 510080, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Nonnegative matrix factorization; Incremental learning; Truncated Gaussian prior; Inverse space sparse representation based classification; Tumor recognition; FACE RECOGNITION; CLASSIFICATION; TUMOR; PREDICTION; SELECTION; ALGORITHM;
DOI
10.1007/s10489-022-03522-3
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Nonnegative matrix factorization (NMF) is a novel paradigm for feature representation and dimensionality reduction. However, its performance is limited by two critical problems. First, the original NMF does not consider the distribution of the data and parameters, which leads to inaccurate representations. Second, its computational complexity is high in online processing. Bayesian approaches have been proposed to address the first problem. However, most existing Bayesian NMF models use an exponential prior, which only guarantees the nonnegativity of the parameters without fully exploiting prior information about them. We therefore construct a new Bayesian NMF model based on a Gaussian likelihood and a truncated Gaussian prior, called the truncated Gaussian-based NMF (TG-NMF) model, in which the truncated Gaussian prior prevents overfitting while ensuring nonnegativity. Furthermore, Bayesian inference-based incremental learning is introduced to reduce the high computational complexity of TG-NMF; the resulting model is called TG-INMF. We adopt variational Bayesian inference to estimate all parameters of TG-NMF and TG-INMF. Experiments on tumor recognition from genetic data demonstrate that our models are competitive with existing methods for classification problems.
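To make the modeling idea in the abstract concrete, the following is a minimal sketch, not the authors' algorithm. It assumes a zero-mean Gaussian prior truncated at zero on both factors and computes a MAP point estimate with multiplicative updates, whereas the paper estimates parameters with variational Bayesian inference. Under that assumption the truncated Gaussian prior acts as an L2 penalty with a nonnegativity constraint, which is one way to see how it can limit overfitting. The names `map_tg_nmf`, `objective`, and `lam` (the ratio of noise variance to prior variance) are hypothetical.

```python
import numpy as np


def objective(X, W, H, lam):
    """Negative log-posterior up to constants (assumed zero-mean truncated Gaussian priors)."""
    fit = 0.5 * np.sum((X - W @ H) ** 2)          # Gaussian likelihood term
    prior = 0.5 * lam * (np.sum(W ** 2) + np.sum(H ** 2))  # truncated Gaussian prior term
    return fit + prior


def map_tg_nmf(X, rank, lam=0.1, n_iter=200, seed=0, eps=1e-12):
    """MAP sketch for X ~ W @ H with W, H >= 0.

    This is a point estimate via multiplicative updates, swapped in for
    the variational Bayesian inference described in the abstract.
    X must be elementwise nonnegative.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    # Strictly positive initialization keeps the multiplicative updates well defined.
    W = rng.random((n, rank)) + 0.1
    H = rng.random((rank, m)) + 0.1
    for _ in range(n_iter):
        # Each update rescales entries by a nonnegative ratio, so signs never flip.
        H *= (W.T @ X) / (W.T @ W @ H + lam * H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + lam * W + eps)
    return W, H


# Synthetic nonnegative data of rank 3 (illustrative only).
rng = np.random.default_rng(1)
X = rng.random((20, 3)) @ rng.random((3, 15))
W, H = map_tg_nmf(X, rank=3, lam=0.1, n_iter=200)
print("objective:", objective(X, W, H, 0.1))
```

Larger values of `lam` correspond to a tighter prior and shrink the factors more strongly; `lam = 0` reduces the updates to the standard unregularized multiplicative rules.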
Pages: 9580-9597
Number of pages: 18