SIMILARITY SEARCHING AND CLUSTERING OF CHEMICAL-STRUCTURE DATABASES USING MOLECULAR PROPERTY DATA

Cited by: 121
Authors
DOWNS, GM
WILLETT, P
FISANICK, W
Affiliations
[1] UNIV SHEFFIELD,DEPT INFORMAT STUDIES,SHEFFIELD S10 2TN,S YORKSHIRE,ENGLAND
[2] CHEM ABSTRACTS SERV INC,COLUMBUS,OH 43210
Source
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES | 1994 / Vol. 34 / No. 05
DOI
10.1021/ci00021a011
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
Previous work on the clustering of chemical-structure databases has focused on the use of intermolecular similarity measures that are based on structural features of various kinds. In this paper, we report nearest-neighbor searching and clustering experiments with a set of 5982 molecules, each of which is characterized by 13 calculated global molecular properties. The nearest-neighbor algorithm is an upper-bound procedure that uses the triangle inequality to minimize the number of distance calculations that need to be carried out when searching for nearest neighbors in metric spaces. Our experiments suggest that it performs well when small numbers of nearest neighbors are required, but that the basic "brute-force" procedure is best when large numbers are needed, such as when clustering is to be carried out. The clustering methods tested are the Ward and group-average hierarchic agglomerative methods, the minimum-diameter polythetic hierarchic divisive method, and the Jarvis-Patrick nearest-neighbor method. Our experiments suggest that the first three methods, which gave similar results, are the best methods for clustering molecules characterized by property data. The Jarvis-Patrick method, which has been extensively used for clustering molecules characterized by structural fragments, was not as effective as these other methods.
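The abstract's idea of using the triangle inequality to skip distance calculations can be illustrated with a minimal sketch. This is not the paper's upper-bound procedure, whose details are not given in this record; it is a generic single-pivot pruning scheme, and the function names (`euclidean`, `brute_force_knn`, `pivot_knn`), the choice of Euclidean distance, and the use of one pivot are all assumptions made for illustration. For any pivot p, |d(q, p) - d(x, p)| is a lower bound on d(q, x), so a candidate whose bound is no better than the current k-th best distance cannot enter the result.

```python
import heapq
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length property vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def brute_force_knn(data, query, k):
    """Baseline: compute every distance, then keep the k smallest."""
    dists = sorted((euclidean(query, x), i) for i, x in enumerate(data))
    return dists[:k]


def pivot_knn(data, pivot_dists, pivot, query, k):
    """k nearest neighbors with triangle-inequality pruning (illustrative).

    pivot_dists[i] must hold the precomputed distance d(data[i], pivot).
    Returns (sorted list of (distance, index), number of distances computed).
    """
    dq = euclidean(query, pivot)
    calcs = 1
    # Visit candidates in order of increasing lower bound |d(q,p) - d(x,p)|.
    order = sorted(range(len(data)), key=lambda i: abs(dq - pivot_dists[i]))
    heap = []  # max-heap on distance, stored as (-distance, index)
    for i in order:
        lower = abs(dq - pivot_dists[i])
        if len(heap) == k and lower >= -heap[0][0]:
            # Every remaining candidate has a lower bound at least this
            # large, so none can beat the current k-th best distance.
            break
        d = euclidean(query, data[i])
        calcs += 1
        if len(heap) < k:
            heapq.heappush(heap, (-d, i))
        elif d < -heap[0][0]:
            heapq.heapreplace(heap, (-d, i))
    return sorted((-nd, i) for nd, i in heap), calcs
```

The pruning pays off most when k is small, because the k-th best distance then shrinks quickly and cuts off the scan early; as k grows, fewer candidates can be excluded and the bookkeeping overhead approaches or exceeds the cost of the brute-force scan, which is consistent with the trade-off the abstract reports.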
Pages: 1094 - 1102
Page count: 9
Related papers
50 in total
  • [21] Mixed text and structure searching of chemical databases.
    Delany, J
    Bradshaw, J
    Ford, M
    Lipkin, M
    Lippi, F
    Salt, D
    Sayle, R
    ABSTRACTS OF PAPERS OF THE AMERICAN CHEMICAL SOCIETY, 1999, 217 : U558 - U558
  • [23] An Improvable Structure for Similarity Searching in Metric Spaces: Application on Image databases
    Hanyf, Y.
    Silkan, H.
    Labani, H.
    2016 13TH INTERNATIONAL CONFERENCE ON COMPUTER GRAPHICS, IMAGING AND VISUALIZATION (CGIV), 2016, : 67 - 72
  • [24] STRUCTURE SEARCHING IN CHEMICAL DATABASES BY DIRECT LOOKUP METHODS
    CHRISTIE, BD
    LELAND, BA
    NOURSE, JG
    JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 1993, 33 (04): : 545 - 547
  • [25] Searching molecular structure databases with tandem mass spectra using CSI:FingerID
    Duehrkop, Kai
    Shen, Huibin
    Meusel, Marvin
    Rousu, Juho
    Boecker, Sebastian
    PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2015, 112 (41) : 12580 - 12585
  • [26] LATEST DEVELOPMENT IN ONLINE CHEMICAL-STRUCTURE SEARCH-SYSTEMS - DRAWING MOLECULES ON THE COMPUTER-SCREEN AND SEARCHING COMPUTER DATABASES
    SAXEN, R
    KEMISK TIDSKRIFT, 1987, 99 (13): : 16 - 17
  • [27] CHEMICAL-STRUCTURE ANALYSIS FROM SPECTRAL DATA USING PATTERN-RECOGNITION TECHNIQUES AND A MOLECULAR-STRUCTURE GENERATOR
    LIDDELL, RW
    JURS, PC
    ABSTRACTS OF PAPERS OF THE AMERICAN CHEMICAL SOCIETY, 1975, 170 (AUG24): : 58 - 58
  • [28] SEARCHING AND CLUSTERING OF DATABASES USING THE ICL DISTRIBUTED ARRAY PROCESSOR
    POGUE, CA
    RASMUSSEN, EM
    WILLETT, P
    PARALLEL COMPUTING, 1988, 8 (1-3) : 399 - 407
  • [29] SOME HEURISTICS FOR NEAREST-NEIGHBOR SEARCHING IN CHEMICAL-STRUCTURE FILES
    WILLETT, P
    JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 1983, 23 (01): : 22 - 25
  • [30] SELECTION OF DESCRIPTORS ACCORDING TO DISCRIMINATION AND REDUNDANCY - APPLICATION TO CHEMICAL-STRUCTURE SEARCHING
    HODES, L
    JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 1976, 16 (02): : 88 - 93