Multimodal representation learning for predicting molecule-disease relations

Cited by: 14
Authors
Wen, Jun [1 ,2 ]
Zhang, Xiang [1 ]
Rush, Everett
Panickan, Vidul A. [1 ,2 ]
Li, Xingyu [1 ]
Cai, Tianrun [2 ,4 ,6 ]
Zhou, Doudou [5 ]
Ho, Yuk-Lam [2 ]
Costa, Lauren [2 ]
Begoli, Edmon [3 ]
Hong, Chuan [2 ]
Gaziano, J. Michael [1 ,2 ,7 ]
Cho, Kelly [1 ,2 ,7 ]
Lu, Junwei [2 ,8 ]
Liao, Katherine P. [1 ,2 ,8 ]
Zitnik, Marinka [1 ,9 ,10 ]
Cai, Tianxi [1 ,2 ,4 ]
Affiliations
[1] Harvard Med Sch, Dept Biomed Informat, Boston, MA 02115 USA
[2] VA Boston Healthcare Syst, Boston, MA 02130 USA
[3] Oak Ridge Natl Lab, DOE, Oak Ridge, TN 37831 USA
[4] Mass Gen Brigham, Boston, MA 02130 USA
[5] Univ Calif, Dept Stat, Davis, CA 95616 USA
[6] Duke Univ, Dept Biostat & Bioinformat, Durham, NC 27708 USA
[7] Brigham & Womens Hosp, Boston, MA 02115 USA
[8] Harvard TH Chan Sch Publ Hlth, Dept Biostat, Boston, MA 02115 USA
[9] Broad Inst & Harvard, Cambridge, MA 02142 USA
[10] Harvard Data Sci Initiat, Cambridge, MA 02138 USA
Funding
National Institutes of Health (NIH)
Keywords
SARS-CoV-2
DOI
10.1093/bioinformatics/btad085
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Predicting molecule-disease indications and side effects is important for drug development and pharmacovigilance. Comprehensively mining molecule-molecule, molecule-disease and disease-disease semantic dependencies can potentially improve prediction performance.

Methods: We introduce a Multi-Modal REpresentation Mapping Approach to Predicting molecular-disease relations (M2REMAP) by incorporating clinical semantics learned from electronic health records (EHR) of 12.6 million patients. Specifically, M2REMAP first learns a multimodal molecule representation that synthesizes chemical property and clinical semantic information by mapping molecule chemicals via a deep neural network onto the clinical semantic embedding space shared by drugs, diseases and other common clinical concepts. To infer molecule-disease relations, M2REMAP combines multimodal molecule representation and disease semantic embedding to jointly infer indications and side effects.

Results: We extensively evaluate M2REMAP on molecule indications, side effects and interactions. Results show that incorporating EHR embeddings improves performance significantly, for example, attaining an improvement over the baseline models by 23.6% in PRC-AUC on indications and 23.9% on side effects. Further, M2REMAP overcomes the limitation of existing methods and effectively predicts drugs for novel diseases and emerging pathogens.
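
Illustrative sketch (not part of the published record): the two steps described under Methods, mapping chemical features into a clinical embedding space and then relating the mapped molecule to disease embeddings, can be pictured with a toy example. Everything here is an assumption made for illustration: a linear map fitted by gradient descent stands in for the deep neural network, cosine similarity stands in for the paper's relation-inference model, and the feature vectors, embeddings and names such as `fit_projection` and `disease_x` are hypothetical.

```python
import math


def project(features, weights):
    """Map a chemical feature vector into the embedding space.

    A single linear layer is used here as a stand-in for the deep network.
    """
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def fit_projection(chem_features, ehr_embeddings, lr=0.1, epochs=500):
    """Fit the linear map by gradient descent on squared error.

    Training pairs are drugs that have both chemical features and an
    EHR-derived embedding (toy values below, not real data).
    """
    dim_in = len(chem_features[0])
    dim_out = len(ehr_embeddings[0])
    weights = [[0.0] * dim_in for _ in range(dim_out)]
    for _ in range(epochs):
        for x, y in zip(chem_features, ehr_embeddings):
            pred = project(x, weights)
            for i in range(dim_out):
                err = pred[i] - y[i]
                for j in range(dim_in):
                    weights[i][j] -= lr * err * x[j]
    return weights


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_diseases(molecule_vec, disease_embeddings):
    """Order disease names by similarity to the mapped molecule vector."""
    scores = {name: cosine(molecule_vec, vec)
              for name, vec in disease_embeddings.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: three known drugs with 3-d chemical features
# and 2-d clinical embeddings.
known_chem = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
known_emb = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = fit_projection(known_chem, known_emb)

# A new molecule with no EHR history is placed in the same space
# using its chemical features alone.
new_molecule = [0.9, 0.1, 0.0]
mapped = project(new_molecule, weights)

# Hypothetical disease embeddings living in the same 2-d space.
disease_embeddings = {"disease_x": [1.0, 0.1], "disease_y": [0.0, 1.0]}
ranked = rank_diseases(mapped, disease_embeddings)
print(ranked)
```

The point of the sketch is only that a molecule absent from the EHR can still be compared with diseases once its chemical features are projected into the shared space; the actual architecture, training objective and scoring model are described in the paper itself.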
Pages: 9