Modern data mining tools in descriptive sensory analysis: A case study with a Random forest approach

Cited by: 45
Authors
Granitto, P. M.
Gasperi, F.
Biasioli, F.
Trainotti, E.
Furlanello, C.
Affiliations
[1] Agrifood Qual Dept, IASMA Res Ctr, I-38010 San Michele all Adige, TN, Italy
[2] Ctr Ric Sci & Tecnol, ITC IRSTh, I-38050 Povo, TN, Italy
Keywords
Random forest; discriminant analysis; cheese; sensory attributes; variable selection
DOI
10.1016/j.foodqual.2006.11.001
Chinese Library Classification (CLC) number
TS2 [Food industry]
Discipline classification code
0832
Abstract
In this paper we introduce random forest (RF) as a new modeling technique in the field of sensory analysis. As a case study, we apply RF to the predictive discrimination of six typical cheeses of the Trentino province (North Italy) from data obtained by quantitative descriptive analysis. The corresponding sensory profiling was carried out by eight trained assessors using a developed language containing 35 attributes. We compare RF's discrimination capabilities with those of linear discriminant analysis (LDA) and discriminant partial least squares (dPLS). The RF models prove more accurate, with smaller prediction errors than LDA and dPLS. RF also offers the possibility of graphically analyzing the developed models with multi-dimensional scaling plots based on an internal measure of similarity between samples. We compare these plots with similar ones derived from principal component analysis and LDA, finding that the same qualitative information can be extracted from all methods. The RF model also gives an estimate of the relative importance of each sensory attribute for the discriminant function. We couple this measure with an appropriate experimental setup in order to obtain an unbiased and stable method for variable selection. This method compares favorably with sequential selection based on LDA models. (c) 2006 Elsevier Ltd. All rights reserved.
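The workflow summarized in the abstract (RF versus LDA accuracy, attribute importance, and a sample-to-sample similarity measure) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic (6 classes, 35 attributes, generated with scikit-learn), not the cheese panel data; the importance shown is scikit-learn's impurity-based `feature_importances_`, which may differ from the measure used in the paper; and the proximity matrix is the usual "fraction of trees in which two samples share a leaf" definition, which may not match the paper's exact setup.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the sensory data (assumption): 6 classes, 35 attributes.
X, y = make_classification(
    n_samples=240, n_features=35, n_informative=10, n_redundant=5,
    n_classes=6, n_clusters_per_class=1, random_state=0,
)

rf = RandomForestClassifier(n_estimators=200, random_state=0)
lda = LinearDiscriminantAnalysis()

# Cross-validated accuracy of each classifier (5 stratified folds).
rf_acc = cross_val_score(rf, X, y, cv=5).mean()
lda_acc = cross_val_score(lda, X, y, cv=5).mean()
print(f"RF accuracy:  {rf_acc:.3f}")
print(f"LDA accuracy: {lda_acc:.3f}")

# Fit on all data to inspect the attribute ranking (impurity-based importance).
rf.fit(X, y)
importances = rf.feature_importances_
ranking = np.argsort(importances)[::-1]  # most important attribute first
print("Top 5 attribute indices:", ranking[:5])

# RF proximity: fraction of trees in which two samples land in the same leaf.
leaves = rf.apply(X)  # shape (n_samples, n_trees)
proximity = (leaves[:, None, :] == leaves[None, :, :]).mean(axis=2)
```

Note that ranking attributes on the full data set, as done here for brevity, is exactly the kind of shortcut that can introduce selection bias; an unbiased procedure would repeat the ranking inside each resampling split. A matrix such as `1 - proximity` could then be passed to a multi-dimensional scaling routine to obtain the kind of plot the abstract mentions.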
Pages: 681-689
Number of pages: 9