CNERVis: a visual diagnosis tool for Chinese named entity recognition

Cited by: 2

Authors:

Lo, Pei-Shan [1]

Wu, Jian-Lin [1]

Deng, Syu-Ting [1]

Wang, Ko-Chih [1]

Affiliations:

[1] Natl Taiwan Normal Univ, Dept Comp Sci & Informat Engn, Taipei, Taiwan

Source:

JOURNAL OF VISUALIZATION | 2022 / Vol. 25 / Issue 03

Keywords:

visual analytics; Chinese named entity recognition; natural language processing; BiLSTM; sequence labeling;

DOI:

10.1007/s12650-021-00799-3

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Subject classification codes:

081203 ; 0835 ;

Abstract:

Named entity recognition (NER) is a crucial initial task that identifies both the spans and the types of named entities in order to extract specific information, such as organizations, persons, locations, and times. Deep learning approaches, which capture contextual features, now achieve state-of-the-art performance on the NER task. However, the complex structure of deep learning models makes them black boxes and limits researchers' ability to improve them. Unlike languages written in the Latin alphabet, Chinese (like other languages such as Korean and Japanese) does not have explicit word boundaries. Therefore, preliminary steps such as word segmentation (WS) and part-of-speech (POS) tagging are needed before the Chinese NER task. The correctness of these preliminary steps strongly influences the final NER prediction, so investigating model behavior on the Chinese NER task is more complicated and challenging. In this paper, we present CNERVis, a visual analysis tool that allows users to interactively inspect the WS-POS-NER pipeline and understand how and why an NER prediction is made. CNERVis also allows users to load large amounts of test data and explore critical instances, which facilitates the analysis of large datasets. The tool's usability and effectiveness are demonstrated through case studies.


Pages: 653-669

Page count: 17
