Named Entity Recognition Utilized to Enhance Text Classification While Preserving Privacy

Cited by: 4
Author
Kutbi, Mohammed [1]
Affiliation
[1] Saudi Elect Univ, Comp Sci Dept, Jeddah 23445, Saudi Arabia
Keywords
Named entities; preprocessing; text classification; privacy
DOI
10.1109/ACCESS.2023.3325895
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Recent developments in Natural Language Processing (NLP) techniques have encouraged NLP-based applications in various fields, including business, law, and health. An important step in any NLP project is text preprocessing, which modifies text data before it is used in a machine learning model. Text preprocessing usually includes cleaning, filtering, removing, and replacing some text to increase model accuracy and robustness, reduce data size, or preserve privacy. A named entity recognizer (NER) is an NLP tool that finds named entities in text, such as names, organizations, addresses, numbers, and dates. In this work, we create a preprocessing approach that uses NER to find named entities and then replaces them with their type (such as location, person, or organization name), instead of removing them or leaving them as noise in the data, in order to improve accuracy and preserve privacy. Experiments on a text classification task using this approach were conducted on several datasets, some of which were collected in-house. The experiments indicate that the approach enhances classifier accuracy and reduces the dimensionality of the feature representation while also preserving privacy.
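The replacement step described in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the gazetteer lookup `toy_ner` is a hypothetical stand-in for a real NER model, and the names `replace_entities`, `vocabulary`, and the sample sentences are assumptions made for illustration. The sketch only shows how substituting each entity with its type label hides the original names and shrinks the vocabulary a classifier would see.

```python
import re
from typing import List, Set, Tuple

Span = Tuple[int, int, str]  # (start offset, end offset, entity type)

# Toy gazetteer standing in for a real NER model (assumption, illustration only).
TOY_GAZETTEER = {
    "Alice Smith": "PERSON",
    "Bob Jones": "PERSON",
    "Jeddah": "LOCATION",
    "Acme Corp": "ORGANIZATION",
}


def toy_ner(text: str) -> List[Span]:
    """Return sorted (start, end, type) spans for gazetteer entries found in text."""
    spans = []
    for name, label in TOY_GAZETTEER.items():
        for match in re.finditer(re.escape(name), text):
            spans.append((match.start(), match.end(), label))
    return sorted(spans)


def replace_entities(text: str, spans: List[Span]) -> str:
    """Replace every entity span in text with its entity type label."""
    pieces = []
    cursor = 0
    for start, end, label in sorted(spans):
        if start < cursor:
            # Skip a span that overlaps one already replaced.
            continue
        pieces.append(text[cursor:start])
        pieces.append(label)
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


def vocabulary(docs: List[str]) -> Set[str]:
    """Collect the set of whitespace-separated tokens across all documents."""
    return {token for doc in docs for token in doc.split()}


docs = [
    "Alice Smith joined Acme Corp in Jeddah.",
    "Bob Jones joined Acme Corp in Jeddah.",
]
masked = [replace_entities(doc, toy_ner(doc)) for doc in docs]

print(masked[0])                                    # text with entities replaced by type
print(len(vocabulary(docs)), len(vocabulary(masked)))  # vocabulary size before and after
```

In this toy input both sentences collapse to the same masked string, so the personal names no longer appear in the features and the token vocabulary is smaller than in the raw text. With a real NER model the spans would come from the model's predictions instead of a fixed lookup table.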
Pages: 117576-117581
Page count: 6