Quantitative methods of standardization in cluster analysis: finding groups in data

Cited by: 9
Authors
Nogueira, Andre Luiz [1]
Munita, Casimiro S. [2]
Affiliations
[1] Fed Inst Sergipe, IFS SE, Rod Lourival Batista S-N, BR-49400000 Lagarto, SE, Brazil
[2] Nucl & Energy Res Inst, IPEN CNEN SP, Av Prof Lineu Prestes, BR-05508000 Sao Paulo, SP, Brazil
Keywords
Archaeometry; Instrumental neutron activation analysis; Clustering analysis; Validation indexes; VALIDATION; PROVENANCE; INAA;
DOI
10.1007/s10967-020-07186-6
Chinese Library Classification number
O65 [Analytical Chemistry]
Discipline classification codes
070302; 081704
Abstract
This paper evaluates the impact of three standardization methods (z-score, log10 and improved min-max) on determining the number of clusters in a dataset of 146 archaeological ceramic fragments, for which the mass fractions of chemical elements were determined by instrumental neutron activation analysis (INAA). The standardized data showed a tendency towards clustering that the non-standardized data did not. The standardization methods indicated the presence of three groups within the database. Quality evaluation of these clusters by means of internal validation indexes showed that the best overall performance was obtained with the log10 transformation. This transformation also performed well in terms of compactness, while the improved min-max method performed better in terms of separability.
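As an illustration of the kind of preprocessing the abstract compares, the following is a minimal sketch of three column-wise transformations in plain Python. It rests on several assumptions: the function names are hypothetical, the sample values are made up and are not taken from the paper's dataset, and the `min_max` function is the ordinary min-max rescaling, since this record does not define the "improved min-max" variant the authors use.

```python
import math
import statistics


def z_score(column):
    """Center a column on its mean and scale by its sample standard deviation."""
    mean = statistics.fmean(column)
    sd = statistics.stdev(column)  # sample standard deviation (n - 1 denominator)
    return [(x - mean) / sd for x in column]


def log10_transform(column):
    """Apply a base-10 logarithm; every value must be strictly positive."""
    return [math.log10(x) for x in column]


def min_max(column):
    """Ordinary min-max rescaling to [0, 1].

    Assumption: this is the standard form, not the paper's 'improved' variant,
    which is not specified in this record.
    """
    lo, hi = min(column), max(column)
    return [(x - lo) / (hi - lo) for x in column]


# Made-up mass fractions for one element across four samples (illustrative only).
sample_column = [12000.0, 15000.0, 18000.0, 30000.0]

standardized = {
    "z-score": z_score(sample_column),
    "log10": log10_transform(sample_column),
    "min-max": min_max(sample_column),
}

for name, values in standardized.items():
    print(name, [round(v, 3) for v in values])
```

In a workflow like the one described, each element's column would be transformed this way before clustering, so that elements measured at very different concentration levels do not dominate the distance calculation.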
Pages: 719-724
Page count: 6