Support vector machines for predictive modeling in heterogeneous catalysis: A comprehensive introduction and overfitting investigation based on two real applications

Cited by: 72
Authors
Baumes, L. A. [1]
Serra, J. M. [1]
Serna, P. [1]
Corma, A. [1]
Affiliations
[1] Univ Politecn Valencia, CSIC, Inst Tecnol Quim, Valencia 46022, Spain
Source
JOURNAL OF COMBINATORIAL CHEMISTRY | 2006 / Vol. 8 / Issue 04
Keywords
ARTIFICIAL NEURAL-NETWORKS; OXIDATIVE DEHYDROGENATION; METHANOL SYNTHESIS; OPTIMIZATION; DESIGN; LIBRARIES; DISCOVERY; ETHYLENE; PROPANE; SEARCH;
DOI
10.1021/cc050093m
Chinese Library Classification number
O69 [Applied Chemistry];
Discipline classification code
081704;
Abstract
This work provides an introduction to support vector machines (SVMs) for predictive modeling in heterogeneous catalysis, describing the methodology step by step and highlighting the points that make the technique an attractive approach. We first investigate linear SVMs, working in detail through a simple example based on experimental data from a study that aimed to optimize olefin epoxidation catalysts using high-throughput experimentation. This case study was chosen to illustrate SVM features visually, since few catalytic variables are investigated. It is shown how SVMs transform the original data into another representation space of higher dimensionality. The concepts of Vapnik-Chervonenkis dimension and structural risk minimization are introduced. The SVM methodology is evaluated with a second catalytic application, namely light paraffin isomerization. Finally, we discuss why SVMs are a strategic method compared to other machine learning techniques, such as neural networks or induction trees, and why emphasis is put on the problem of overfitting.
Pages: 583-596
Number of pages: 14