Comprehensive comparison of in silico MS/MS fragmentation tools of the CASMI contest: database boosting is needed to achieve 93% accuracy

Cited by: 84
Authors
Blazenovic, Ivana [1, 2, 3]
Kind, Tobias [3]
Torbasinovic, Hrvoje [4]
Obrenovic, Slobodan [4]
Mehta, Sajjan S. [3]
Tsugawa, Hiroshi [5]
Wermuth, Tobias [3]
Schauer, Nicolas [2]
Jahn, Martina [1]
Biedendieck, Rebekka [1]
Jahn, Dieter [1]
Fiehn, Oliver [3, 6]
Affiliations
[1] Tech Univ Braunschweig, Inst Microbiol, Braunschweig, Germany
[2] Metabol Discoveries GmbH, Potsdam, Germany
[3] UC Davis Genome Ctr, NIH West Coast Metabol Ctr, Room 1313, 451 Hlth Sci Dr, Davis, CA 95616 USA
[4] Inovatus Ltd, Zagreb, Croatia
[5] RIKEN, Ctr Sustainable Resource Sci, Yokohama, Kanagawa, Japan
[6] King Abdulaziz Univ, Dept Biochem, Fac Sci, Jeddah, Saudi Arabia
Funding
National Science Foundation (USA);
Keywords
Compound identification; Mass spectrometry; Structure elucidation; In silico fragmentation; MS/MS; Metabolomics; MOLECULAR-STRUCTURE DATABASES; METABOLITE IDENTIFICATION; MASS-SPECTRA; ELUCIDATION; ANNOTATION; CHEMISTRY; PRODUCTS;
DOI
10.1186/s13321-017-0219-x
Chinese Library Classification (CLC) number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
In mass spectrometry-based untargeted metabolomics, rarely more than 30% of the compounds are identified. Without the true identity of these molecules it is impossible to draw conclusions about the biological mechanisms, pathway relationships and provenance of compounds. The only way at present to address this discrepancy is to use in silico fragmentation software to identify unknown compounds by comparing and ranking theoretical MS/MS fragmentations from target structures to experimental tandem mass spectra (MS/MS). We compared the performance of four publicly available in silico fragmentation algorithms (MetFragCL, CFM-ID, MAGMa+ and MS-FINDER) that participated in the 2016 CASMI challenge. We found that optimizing the use of metadata, weighting factors and the manner of combining different tools eventually defined the ultimate outcomes of each method. We comprehensively analysed how outcomes of different tools could be combined and reached a final success rate of 93% for the training data, and 87% for the challenge data, using a combination of MAGMa+, CFM-ID and compound importance information along with MS/MS matching. Matching MS/MS spectra against the MS/MS libraries without using any in silico tool yielded 60% correct hits, showing that the use of in silico methods is still important.
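The abstract describes combining the outputs of several in silico tools with weighting factors and database-derived metadata. The following is a minimal sketch of that general idea only, not the authors' actual scoring scheme: the function names, the max-normalization step, the weights, the candidate IDs and the scores are all hypothetical and chosen purely for illustration.

```python
from collections import defaultdict


def normalize(scores):
    """Scale one tool's raw candidate scores to [0, 1] by dividing by the maximum."""
    top = max(scores.values(), default=0.0)
    if top <= 0:
        return {cand: 0.0 for cand in scores}
    return {cand: s / top for cand, s in scores.items()}


def combine_rankings(tool_scores, weights, db_hits=None, db_weight=0.0):
    """Rank candidates by a weighted sum of normalized per-tool scores.

    tool_scores: {tool_name: {candidate_id: raw_score}}
    weights:     {tool_name: weight} (hypothetical values that would need tuning)
    db_hits:     optional {candidate_id: value in [0, 1]}, e.g. presence in a
                 compound database or spectral library (an assumed encoding)
    db_weight:   weight of the database term ("boost")
    """
    db_hits = db_hits or {}
    total = defaultdict(float)
    for tool, scores in tool_scores.items():
        w = weights.get(tool, 0.0)
        for cand, s in normalize(scores).items():
            total[cand] += w * s
    # Add the metadata boost to every candidate proposed by at least one tool.
    for cand in total:
        total[cand] += db_weight * db_hits.get(cand, 0.0)
    # Highest combined score first; ties are broken by candidate ID.
    return sorted(total, key=lambda c: (-total[c], c))


# Hypothetical scores from two unnamed tools for candidates X, Y and Z.
example_scores = {
    "tool_a": {"X": 10.0, "Y": 8.0},
    "tool_b": {"X": 0.2, "Y": 0.4, "Z": 0.1},
}
example_weights = {"tool_a": 1.0, "tool_b": 1.0}

print(combine_rankings(example_scores, example_weights))
print(combine_rankings(example_scores, example_weights,
                       db_hits={"X": 1.0}, db_weight=0.5))
```

In this toy input the tools alone rank Y first, while adding the database term moves X to the top, which illustrates how metadata can change the outcome of a purely fragmentation-based ranking.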
Pages: 12