TourismNER: A Tourism Named Entity Recognition method based on entity boundary joint prediction

Cited by: 0

Authors:

Gao, Kai [1]

Zhou, Jiahao [1,2]

Chi, Yunxian [1]

Wen, Yimin [2,3]

Affiliations:

[1] Hebei Univ Sci & Technol, Sch Informat Sci & Engn, Shijiazhuang 050018, Peoples R China

[2] Guilin Tourism Univ, Guangxi Key Lab Culture & Tourism Smart Technol, Guilin 541006, Peoples R China

[3] Guilin Univ Elect Technol, Guangxi Key Lab Image & Graph Intelligent Proc, Guilin 541004, Peoples R China

Source:

INTELLIGENT SYSTEMS WITH APPLICATIONS | 2025 / Vol. 25

Keywords:

Natural Language Processing; Tourism Named Entity Recognition; Entity boundary recognition; Joint prediction

DOI:

10.1016/j.iswa.2025.200475

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Tourism named entity recognition is indispensable to tourism information extraction and plays a crucial role in constructing tourism knowledge graphs and improving tourism question answering systems. The difficulty of the task lies in the complex nested structure of tourism entities and in their long names. To address these problems, we propose a tourism named entity recognition model that jointly predicts entity boundaries. A data preprocessing training strategy strengthens the model's ability to recognize the boundaries of tourism named entities. The model uses a pre-trained BERT model together with BiLSTM encoding to enrich its contextual representations, a combined Biaffine and MLP predictor to improve boundary recognition, and label smoothing cross entropy to smooth the target labels during training. Experiments are conducted on three datasets of different granularities. The experimental results show that the method achieves higher accuracy and F1 than the best baseline model, demonstrating the effectiveness and generality of the proposed approach.
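Two of the components named in the abstract, label smoothing cross entropy and biaffine span scoring, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the uniform smoothing scheme, the default `epsilon=0.1`, and the exact biaffine form (bilinear term plus linear term plus bias) are assumptions chosen for illustration, and the abstract does not specify how the Biaffine and MLP outputs are combined.

```python
import math


def label_smoothing_cross_entropy(logits, target, epsilon=0.1):
    """Cross entropy against a uniformly smoothed target (illustrative).

    Assumed scheme: the true class receives 1 - epsilon + epsilon / K and
    every other class receives epsilon / K, where K is the class count.
    """
    k = len(logits)
    # Numerically stable log-softmax over the class logits.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    log_probs = [x - log_z for x in logits]
    # Build the smoothed target distribution.
    smooth = [epsilon / k] * k
    smooth[target] += 1.0 - epsilon
    return -sum(q * lp for q, lp in zip(smooth, log_probs))


def biaffine_score(h_start, h_end, U, w, b):
    """Score one (start, end) token pair for one label (illustrative).

    Assumed form: h_start^T U h_end + w . [h_start; h_end] + b, with
    h_start and h_end given as plain Python lists.
    """
    bilinear = sum(
        h_start[p] * U[p][q] * h_end[q]
        for p in range(len(h_start))
        for q in range(len(h_end))
    )
    linear = sum(wi * xi for wi, xi in zip(w, h_start + h_end))
    return bilinear + linear + b
```

With `epsilon=0` the loss reduces to ordinary cross entropy, so the smoothing strength can be tuned without changing the rest of a training loop.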


Pages: 7
