Exploring a deep learning neural architecture for closed Literature-based discovery

Cited by: 2
Authors
Cuffy, Clint [1]
McInnes, Bridget T. [1]
Affiliations
[1] Virginia Commonwealth Univ, 401 S Main St, Richmond, VA 23284 USA
Keywords
Natural language processing; Literature-based discovery; Literature-related discovery; Neural networks; Deep learning; Knowledge discovery; POTENTIAL TREATMENTS; FISH-OIL; SYSTEM; LRD;
DOI
10.1016/j.jbi.2023.104362
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Scientific literature presents a wealth of information yet to be explored. As the number of researchers and publications increases each year, specialized fields of research are becoming more prevalent. This trend further separates interdisciplinary publications and makes keeping up to date with the literature a laborious task. Literature-based discovery (LBD) aims to mitigate these concerns by promoting information sharing among non-interacting literatures while extracting potentially meaningful information. Furthermore, recent advances in neural network architectures and data representation techniques have driven state-of-the-art performance in many downstream tasks. However, neural network-based methods for LBD remain underexplored. We introduce and explore a deep learning neural network-based approach for LBD. Additionally, we investigate various approaches to representing terms as concepts and analyze the effect of feature scaling the input representations to our model. We compare the evaluation performance of our method on five hallmarks of cancer datasets used for closed discovery. Our results show that the representation chosen as input to our model affects evaluation performance. We found that feature scaling our input representations increases evaluation performance and decreases the number of epochs needed to achieve model generalization. We also explore two approaches to representing model output. We found that reducing the model's output to capture a subset of concepts improved evaluation performance at the cost of model generalizability. We also compare the efficacy of our method on the five hallmarks of cancer datasets to a set of randomly chosen relations between concepts. These experiments confirm our method's suitability for LBD.
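The abstract reports that feature scaling the input representations improved evaluation performance, but it does not say which scaling method was used. As a minimal sketch only, assuming per-feature min-max scaling to [0, 1], the idea can be illustrated as follows. The function name `min_max_scale` and the toy `concept_vectors` are hypothetical and are not taken from the paper.

```python
from typing import List


def min_max_scale(vectors: List[List[float]]) -> List[List[float]]:
    """Rescale each feature (column) to [0, 1] across all vectors.

    Assumption: min-max scaling stands in for the paper's unspecified
    feature scaling method. Constant features are mapped to 0.0.
    """
    if not vectors:
        return []
    columns = list(zip(*vectors))          # transpose rows into columns
    lows = [min(col) for col in columns]   # per-feature minimum
    highs = [max(col) for col in columns]  # per-feature maximum
    scaled = []
    for vec in vectors:
        row = []
        for value, low, high in zip(vec, lows, highs):
            span = high - low
            # Avoid dividing by zero when a feature never varies.
            row.append((value - low) / span if span else 0.0)
        scaled.append(row)
    return scaled


# Hypothetical toy concept representations (one row per concept).
concept_vectors = [
    [2.0, -1.0, 5.0],
    [4.0, 0.0, 5.0],
    [6.0, 1.0, 5.0],
]
scaled = min_max_scale(concept_vectors)
```

Scaling of this kind puts every input dimension on a comparable range, which is one common reason a network can reach generalization in fewer epochs.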
Pages: 14