Transformer-based approach to variable typing

Cited by: 0
Authors
Rey, Charles Arthel [1]
Danguilan, Jose Lorenzo [1]
Mendoza, Karl Patrick [1]
Remolona, Miguel Francisco [1]
Affiliations
[1] Univ Philippines Diliman, Dept Chem Engn, Chem Engn Intelligence Learning Lab, Quezon City 1101, Philippines
Keywords
Natural language processing; Transformers; Entity recognition; Relation extraction; Variable typing; Machine learning; Mathematical knowledge; Named entity recognition
DOI
10.1016/j.heliyon.2023.e20505
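Chinese Library Classification (CLC)
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Subject classification codes
07; 0710; 09
Abstract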
The upsurge of multifarious endeavors across scientific fields has propelled Big Data in the scientific domain. Despite advances in management systems, researchers find that mathematical knowledge remains one of the most challenging kinds of knowledge to manage because of its inherent heterogeneity. One novel recourse being explored is variable typing, where current works remain preliminary and thus leave wide room for contribution. In this study, an initial attempt to implement the end-to-end Entity Recognition (ER) and Relation Extraction (RE) approach to variable typing was made using the BERT (Bidirectional Encoder Representations from Transformers) model. A micro-dataset was developed for this process. According to our findings, the ER model and RE model, respectively, have Precision of 0.8142 and 0.4919, Recall of 0.7816 and 0.6030, and F1-Scores of 0.7975 and 0.5418. Despite the limited dataset, the models performed on par with values in the literature. This work also discusses the factors affecting this BERT-based
Pages: 12