Glycoinformatics in the Artificial Intelligence Era

Cited by: 47
Authors
Bojar, Daniel [4, 5]
Lisacek, Frederique [1, 2, 3]
Affiliations
[1] Swiss Inst Bioinformat, Proteome Informat Grp, CH-1227 Geneva, Switzerland
[2] Univ Geneva, Comp Sci Dept, CH-1227 Geneva, Switzerland
[3] Univ Geneva, Sect Biol, CH-1227 Geneva, Switzerland
[4] Univ Gothenburg, Dept Chem & Mol Biol, S-41390 Gothenburg, Sweden
[5] Univ Gothenburg, Wallenberg Ctr Mol & Translational Med, S-41390 Gothenburg, Sweden
Funding
Swiss National Science Foundation
Keywords
LEARNING-BASED APPROACH; PROTEIN IDENTIFICATION; COMPUTATIONAL TOOLS; SECONDARY STRUCTURE; N-GLYCOSYLATION; NEURAL-NETWORKS; PREDICTION; GLYCOMICS; SEQUENCE; DATABASE;
DOI
10.1021/acs.chemrev.2c00110
Chinese Library Classification
O6 [Chemistry]
Discipline classification code
0703
Abstract
Artificial intelligence (AI) methods have been, and are increasingly being, integrated into prediction software in bioinformatics and in its glycoscience branch, known as glycoinformatics. Although AI techniques have evolved over the past decades, their applications in glycoscience are not yet widespread. This limited use is partly explained by the peculiarities of glyco-data, which are notoriously hard to produce and analyze. Nonetheless, the accumulation of glycomics, glycoproteomics, and glycan-binding data has reached a point where even the most recent deep learning methods can provide predictors with good performance. We discuss the historical development of the application of various AI methods in the broader field of glycoinformatics. A particular focus is placed on the challenges of glyco-data handling, contextualized by lessons learnt from related disciplines. Ending with a discussion of state-of-the-art deep learning approaches in glycoinformatics, we also envision the future of the field, including developments that need to occur in order to truly unleash the capabilities of glycoscience in the systems biology era.
Pages: 15971-15988
Page count: 18
Related papers
163 in total
[1]   Chemical Approaches To Perturb, Profile, and Perceive Glycans [J].
Agard, Nicholas J. ;
Bertozzi, Carolyn R. .
ACCOUNTS OF CHEMICAL RESEARCH, 2009, 42 (06) :788-797
[2]   CarbArrayART: a new software tool for carbohydrate microarray data storage, processing, presentation, and reporting [J].
Akune, Yukie ;
Arpinar, Sena ;
Silva, Lisete M. ;
Palma, Angelina S. ;
Tajadura-Ortega, Virginia ;
Aoki-Kinoshita, Kiyoko F. ;
Ranzinger, Rene ;
Liu, Yan ;
Feizi, Ten .
GLYCOBIOLOGY, 2022, 32 (07) :552-555
[3]   Unified rational protein engineering with sequence-based deep representation learning [J].
Alley, Ethan C. ;
Khimulya, Grigory ;
Biswas, Surojit ;
AlQuraishi, Mohammed ;
Church, George M. .
NATURE METHODS, 2019, 16 (12) :1315-+
[4]   GlyConnect: Glycoproteomics Goes Visual, Interactive, and Analytical [J].
Alocci, Davide ;
Mariethoz, Julien ;
Gastaldello, Alessandra ;
Gasteiger, Elisabeth ;
Karlsson, Niclas G. ;
Kolarich, Daniel ;
Packer, Nicolle H. ;
Lisacek, Frederique .
JOURNAL OF PROTEOME RESEARCH, 2019, 18 (02) :664-677
[5]   GlyTouCan 1.0-The international glycan structure repository [J].
Aoki-Kinoshita, Kiyoko ;
Agravat, Sanjay ;
Aoki, Nobuyuki P. ;
Arpinar, Sena ;
Cummings, Richard D. ;
Fujita, Akihiro ;
Fujita, Noriaki ;
Hart, Gerald M. ;
Haslam, Stuart M. ;
Kawasaki, Toshisuke ;
Matsubara, Masaaki ;
Moreman, Kelley W. ;
Okuda, Shujiro ;
Pierce, Michael ;
Ranzinger, Rene ;
Shikanai, Toshihide ;
Shinmachi, Daisuke ;
Solovieva, Elena ;
Suzuki, Yoshinori ;
Tsuchiya, Shinichiro ;
Yamada, Issaku ;
York, William S. ;
Zaia, Joseph ;
Narimatsu, Hisashi .
NUCLEIC ACIDS RESEARCH, 2016, 44 (D1) :D1237-D1242
[6]   GlycoBioinformatics [J].
Aoki-Kinoshita, Kiyoko F. ;
Lisacek, Frederique ;
Karlsson, Niclas ;
Kolarich, Daniel ;
Packer, Nicolle H. .
BEILSTEIN JOURNAL OF ORGANIC CHEMISTRY, 2021, 17 :2726-2728
[7]   Stereoelectronic effects in stabilizing protein-N-glycan interactions revealed by experiment and machine learning [J].
Ardejani, Maziar S. ;
Noodleman, Louis ;
Powers, Evan T. ;
Kelly, Jeffery W. .
NATURE CHEMISTRY, 2021, 13 (05) :480-+
[8]   SignalP 5.0 improves signal peptide predictions using deep neural networks [J].
Armenteros, Jose Juan Almagro ;
Tsirigos, Konstantinos D. ;
Sonderby, Casper Kaae ;
Petersen, Thomas Nordahl ;
Winther, Ole ;
Brunak, Soren ;
von Heijne, Gunnar ;
Nielsen, Henrik .
NATURE BIOTECHNOLOGY, 2019, 37 (04) :420-+
[9]   The case for post-predictional modifications in the AlphaFold Protein Structure Database [J].
Bagdonas, Haroldas ;
Fogarty, Carl A. ;
Fadda, Elisa ;
Agirre, Jon .
NATURE STRUCTURAL & MOLECULAR BIOLOGY, 2021, 28 (11) :869-870
[10]   The SWISS-PROT protein sequence data bank and its supplement TrEMBL in 1999 [J].
Bairoch, A ;
Apweiler, R .
NUCLEIC ACIDS RESEARCH, 1999, 27 (01) :49-54