From language models to large-scale food and biomedical knowledge graphs

Cited by: 5
Authors
Cenikj, Gjorgjina [1, 2]
Strojnik, Lidija [1]
Angelski, Risto [3]
Ogrinc, Nives [1]
Seljak, Barbara Korousic [1]
Eftimov, Tome [1]
Affiliations
[1] Jozef Stefan Inst, Ljubljana 1000, Slovenia
[2] Jozef Stefan Int Postgrad Sch, Ljubljana 1000, Slovenia
[3] Clin Doctor 24 Hours, Ljubljana 1000, Slovenia
Keywords
DISEASE; TEXT;
DOI
10.1038/s41598-023-34981-4
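Chinese Library Classification (CLC)
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Discipline classification code
07; 0710; 09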
Abstract
Knowledge about the interactions between dietary and biomedical factors is scattered across countless research articles in unstructured form (e.g., text and images) and requires automatic structuring so that it can be provided to medical professionals in a suitable format. Various biomedical knowledge graphs exist; however, they require further extension with relations between food and biomedical entities. In this study, we evaluate the performance of three state-of-the-art relation-mining pipelines (FooDis, FoodChem and ChemDis), which extract relations between food, chemical and disease entities from textual data. We perform two case studies in which relations were automatically extracted by the pipelines and validated by domain experts. The results show that the pipelines can extract relations with an average precision of around 70%, making new discoveries available to domain experts with reduced human effort, since the experts need only evaluate the results instead of finding and reading all new scientific papers.
Pages: 14