Named Entity Recognition Utilized to Enhance Text Classification While Preserving Privacy

Cited by: 4
Author
Kutbi, Mohammed [1]
Affiliation
[1] Saudi Elect Univ, Comp Sci Dept, Jeddah 23445, Saudi Arabia
Keywords
Named entities; preprocessing; text classification; privacy;
DOI
10.1109/ACCESS.2023.3325895
CLC number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Recent developments in Natural Language Processing (NLP) techniques have encouraged NLP-based applications in various fields, including business, law, and health. An important step in every NLP project is text preprocessing, which modifies text data before it is used in a machine learning model. Text preprocessing usually includes cleaning, filtering, removing, and replacing parts of the text to increase model accuracy and robustness, reduce data size, or preserve privacy. A named entity recognizer (NER) is an NLP tool that finds named entities in text, such as names, organizations, addresses, numbers, and dates. In this work, we create a preprocessing approach that uses NER to find named entities and then replaces them with their type, i.e., location, person, or organization name, to improve accuracy and preserve privacy, instead of removing them or letting them become noise in the data. Experiments on the text classification task using our approach were conducted on several datasets, some of which were collected in-house. The experiments indicate that this approach enhances classifier accuracy and reduces the dimensionality of the feature representation while also preserving privacy.
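The replacement step described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the names `replace_entities`, `toy_ner`, and `GAZETTEER` are hypothetical, and the dictionary lookup merely stands in for a real NER model so that the example runs with the standard library alone. The example documents are invented for illustration.

```python
from typing import Dict, List, Tuple

# (start offset, end offset, entity type); an assumed span format.
Span = Tuple[int, int, str]

# Hypothetical lookup table standing in for a trained NER model.
GAZETTEER: Dict[str, str] = {
    "Alice Smith": "PERSON",
    "Bob Jones": "PERSON",
    "Acme Corp": "ORGANIZATION",
    "Globex": "ORGANIZATION",
    "Jeddah": "LOCATION",
    "Riyadh": "LOCATION",
}


def toy_ner(text: str, gazetteer: Dict[str, str]) -> List[Span]:
    """Return a span for every occurrence of a gazetteer entry in the text."""
    spans: List[Span] = []
    for surface, label in gazetteer.items():
        start = text.find(surface)
        while start != -1:
            spans.append((start, start + len(surface), label))
            start = text.find(surface, start + len(surface))
    return spans


def replace_entities(text: str, spans: List[Span]) -> str:
    """Replace each entity span with its type label, left to right."""
    pieces: List[str] = []
    cursor = 0
    for start, end, label in sorted(spans):
        if start < cursor:
            continue  # skip a span that overlaps one already replaced
        pieces.append(text[cursor:start])
        pieces.append(label)
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


docs = [
    "Alice Smith joined Acme Corp in Jeddah.",
    "Bob Jones joined Globex in Riyadh.",
]
masked = [replace_entities(d, toy_ner(d, GAZETTEER)) for d in docs]

# Whitespace-token vocabulary before and after replacement.
vocab_before = {tok for d in docs for tok in d.split()}
vocab_after = {tok for d in masked for tok in d.split()}
```

In this toy case both sentences collapse to the same masked text, so the vocabulary shrinks and the personal names no longer reach the classifier. With a real recognizer, the spans would come from the model's predictions rather than a lookup table, and the quality of the result would depend on its recall.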
Pages: 117576-117581
Number of pages: 6