A comparative study of point-to-point algorithms for matching spectra

Cited by: 54
Authors
Li, Jianfeng
Hibbert, D. Brynn [1]
Fuller, Stephen
Vaughn, Gary
Affiliations
[1] Univ New S Wales, Sch Chem, Sydney, NSW, Australia
[2] Dept Environm & Conservat, Environm Forens & Analyt Sci, Lidcombe, NSW, Australia
Keywords
matching spectra; correlation coefficient; Euclidean cosine; similarity index
DOI
10.1016/j.chemolab.2005.05.015
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Matching spectra is necessary for database searches, assessing the source of an unknown sample, structure elucidation, and classification of spectra. A direct method of matching is to compare, point by point, two digitized spectra, the outcome being a parameter that quantifies the degree of similarity or dissimilarity between the spectra. Examples studied here are correlation coefficient squared and Euclidean cosine squared, both applied to the raw spectra and first-difference values of absorbance. It is shown that spectra do not fulfill the requirements for a normal statistical interpretation of the correlation coefficient; in particular, they are not normally distributed variables. It is therefore not correct to use a Student's t-test to calculate the probability of the null hypothesis that two spectra are not correlated on the basis of a correlation coefficient between them. We have investigated the effect on the similarity indices of systematically changing the mean and standard deviation of a single Gaussian peak relative to a reference Gaussian peak, of changing one peak, and of changing many peaks, in a simulated 10-peak spectrum. Squared Euclidean cosine is least sensitive to changes and the first-difference methods are most sensitive to changes in mean and standard deviation of peaks. A shift of the center of a peak has a greater effect on the indices than increases in peak width, but a decrease in peak width does lead to significant changes in the indices. We recommend that if these indices are to be used to match spectra, appropriate windows should be chosen to avoid dilution by regions with no significant change. (c) 2005 Elsevier B.V. All rights reserved.
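The four indices named in the abstract can be illustrated with a minimal sketch. It assumes the textbook definitions (squared Pearson correlation coefficient, and squared cosine of the angle between the two spectra treated as vectors), each applied to the raw values and to first differences. The function names, the wavelength grid, and the peak parameters are hypothetical choices for illustration and are not taken from the paper.

```python
import numpy as np

def gaussian_peak(x, mean, sd, height=1.0):
    """Single Gaussian peak sampled on grid x (illustrative spectrum)."""
    return height * np.exp(-0.5 * ((x - mean) / sd) ** 2)

def r_squared(a, b):
    """Squared Pearson correlation coefficient between two digitized spectra."""
    r = np.corrcoef(a, b)[0, 1]
    return r ** 2

def cos_squared(a, b):
    """Squared cosine of the angle between two spectra treated as vectors."""
    return np.dot(a, b) ** 2 / (np.dot(a, a) * np.dot(b, b))

def similarity_indices(a, b):
    """Both indices on raw values and on first differences."""
    da, db = np.diff(a), np.diff(b)  # point-to-point first differences
    return {
        "r2_raw": r_squared(a, b),
        "cos2_raw": cos_squared(a, b),
        "r2_diff": r_squared(da, db),
        "cos2_diff": cos_squared(da, db),
    }

# Assumed grid and peak parameters, chosen only for demonstration.
x = np.linspace(0.0, 100.0, 1001)
reference = gaussian_peak(x, mean=50.0, sd=5.0)
shifted = gaussian_peak(x, mean=52.0, sd=5.0)  # peak centre moved by 2 units

for name, value in similarity_indices(reference, shifted).items():
    print(f"{name}: {value:.4f}")
```

Restricting `x` to a window around the peak before calling `similarity_indices` is one way to try the windowing recommendation at the end of the abstract, since long flat baseline regions contribute little to either index.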
Pages: 50-58
Page count: 9