Model population analysis in chemometrics

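Times cited: 40
Authors
Deng, Bai-Chuan [1]
Yun, Yong-Huan [1]
Liang, Yi-Zeng [1]
Affiliations
[1] Central South University, College of Chemistry and Chemical Engineering, Changsha 410083, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords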
Model population analysis; Chemometrics; Variable selection; Model evaluation; Outlier detection; Applicability domain; UNINFORMATIVE VARIABLE ELIMINATION; WAVELENGTH INTERVAL SELECTION; MULTIVARIATE CALIBRATION; APPLICABILITY DOMAIN; OUTLIER DETECTION; RANDOM FROG; STRATEGY; ENSEMBLE; CLASSIFICATION; PERSPECTIVE;
DOI
10.1016/j.chemolab.2015.08.018
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Model population analysis (MPA) is a general framework for designing new types of chemometrics algorithms, and it has attracted increasing interest in the chemometrics community in recent years. The goal of MPA is to extract statistical information from the model, towards a better understanding of chemical data. The two key elements of MPA are random sampling and statistical analysis. The core idea of MPA is quite universal, with potential applications in fields such as chemoinformatics, biostatistics and bioinformatics. In this article, we review the development of MPA in chemometrics. We first present the key elements of MPA. We then discuss applications of MPA in chemometrics, including variable selection, model evaluation, outlier detection and applicability domain definition. Finally, we outline potential application areas of MPA for future research. (C) 2015 Elsevier B.V. All rights reserved.
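The two elements named in the abstract, random sampling and statistical analysis, can be illustrated with a minimal sketch. It is not the authors' algorithm: the function name `mpa_coefficient_stability`, its parameters, and the synthetic data are hypothetical, and ordinary least squares stands in for whatever regression method a real chemometrics study would use. The sketch draws many random sub-datasets, fits one sub-model per draw, and then summarizes the population of regression coefficients with a mean-to-standard-deviation ratio, in the spirit of the uninformative variable elimination methods listed in the keywords.

```python
import numpy as np


def mpa_coefficient_stability(X, y, n_models=500, sample_ratio=0.8, seed=0):
    """Hypothetical MPA-style sketch: returns one stability value per variable.

    Assumptions (not from the source): ordinary least squares sub-models,
    Monte Carlo sampling of samples without replacement, and
    mean/std of the coefficients as the statistic of interest.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_vars = X.shape
    n_sub = int(sample_ratio * n_samples)

    # Step 1: random sampling. Each row of `coefs` holds one sub-model.
    coefs = np.empty((n_models, n_vars))
    for i in range(n_models):
        idx = rng.choice(n_samples, size=n_sub, replace=False)
        X_sub = X[idx] - X[idx].mean(axis=0)  # centre the sub-dataset
        y_sub = y[idx] - y[idx].mean()
        coefs[i] = np.linalg.lstsq(X_sub, y_sub, rcond=None)[0]

    # Step 2: statistical analysis of the population of sub-models.
    return coefs.mean(axis=0) / coefs.std(axis=0, ddof=1)


# Synthetic example: only the first two of six variables carry signal.
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(60, 6))
y_demo = 3.0 * X_demo[:, 0] - 2.0 * X_demo[:, 1] + rng.normal(scale=0.5, size=60)

stability = mpa_coefficient_stability(X_demo, y_demo)
for j, s in enumerate(stability):
    print(f"variable {j}: stability = {s:.2f}")
```

Under these assumptions, variables whose coefficients keep the same sign and size across sub-models get a stability value of large magnitude, while variables that only fit noise stay close to zero. The same sample-then-summarize pattern could be applied to other outputs of the sub-models, such as prediction errors per sample, but that extension is not shown here.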
Pages: 166-176
Number of pages: 11