ARElight: Context Sampling of Large Texts for Deep Learning Relation Extraction

Citations: 0
Authors
Rusnachenko, Nicolay [1]
Liang, Huizhi [1]
Kalameyets, Maksim [1]
Shi, Lei [1]
Affiliations
[1] Newcastle Univ, Sch Comp, Newcastle Upon Tyne, Tyne & Wear, England
Source
ADVANCES IN INFORMATION RETRIEVAL, ECIR 2024, PT V | 2024 / Vol. 14612
Funding
UK Research and Innovation (UKRI);
Keywords
Data Processing Pipeline; Information Retrieval; Visualisation;
DOI
10.1007/978-3-031-56069-9_23
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
The escalating volume of textual data necessitates adept and scalable Information Extraction (IE) systems in the field of Natural Language Processing (NLP) to analyse massive text collections in a detailed manner. While most deep learning systems are designed to handle textual information as it is, the gap in the existence of the interface between a document and the annotation of its parts is still poorly covered. Concurrently, one of the major limitations of most deep-learning models is a constrained input size caused by architectural and computational specifics. To address this, we introduce ARElight (https://github.com/nicolay-r/ARElight), a system designed to efficiently manage and extract information from sequences of large documents by dividing them into segments with mentioned object pairs. Through a pipeline comprising modules for text sampling, inference, optional graph operations, and visualisation, the proposed system transforms large volumes of text in a structured manner. Practical applications of ARElight are demonstrated across diverse use cases, including literature processing and social network analysis.
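The idea of dividing a large document into segments that each mention an object pair can be illustrated with a minimal sketch. This is not ARElight's API or the authors' implementation: the function name sample_contexts, the max_tokens budget, the regex sentence splitter, and the fixed list of entity strings are all assumptions made for illustration. A real system would presumably use a named-entity recogniser rather than string matching.

```python
import re
from itertools import combinations


def sample_contexts(text, entities, max_tokens=50):
    """Return (sentence, entity_a, entity_b) triples for entity pairs
    that co-occur in one sentence. Hypothetical helper, not ARElight code."""
    # Naive sentence split on ., ! or ? followed by whitespace (an assumption).
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    contexts = []
    for sentence in sentences:
        # Simplification: skip sentences over the token budget, standing in
        # for the constrained input size of a downstream model.
        if len(sentence.split()) > max_tokens:
            continue
        # Whole-word string matching stands in for entity recognition.
        found = [
            e for e in entities
            if re.search(r"\b" + re.escape(e) + r"\b", sentence)
        ]
        # Every unordered pair of mentioned entities yields one context.
        for a, b in combinations(found, 2):
            contexts.append((sentence, a, b))
    return contexts


if __name__ == "__main__":
    doc = "Alice met Bob in Paris. Carol stayed home. Bob called Carol."
    for ctx in sample_contexts(doc, ["Alice", "Bob", "Carol", "Paris"]):
        print(ctx)
```

Each returned triple is small enough to pass to a relation classifier on its own, which is what allows the document as a whole to be arbitrarily long.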
Pages: 229-235
Page count: 7