Data-Driven Regular Expressions Evolution for Medical Text Classification Using Genetic Programming

Cited by: 0
Authors
Liu, Jiandong [1]
Bai, Ruibin [1]
Lu, Zheng [1]
Ge, Peiming [2]
Aickelin, Uwe [3]
Liu, Daoyun [2]
Affiliations
[1] Univ Nottingham Ningbo China, Sch Comp Sci, Ningbo, Peoples R China
[2] Ping An Hlth Cloud Co Ltd China, Techonol Dept, Shanghai, Peoples R China
[3] Univ Melbourne, Sch Comp & Informat Syst, Melbourne, Vic, Australia
Source
2020 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC) | 2020
Keywords
text classification; genetic programming; co-occurrence matrix; EXPERT-SYSTEM;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In medical fields, text classification is one of the most important tasks that can significantly reduce human workload through structured information digitization and intelligent decision support. Despite the popularity of learning-based text classification techniques, it is hard for humans to understand or manually fine-tune the classification for better precision and recall, due to the black-box nature of learning. This study proposes a novel regular expression-based text classification method that uses genetic programming (GP) to evolve regular expressions that can satisfactorily classify a given medical text inquiry. Given a seed population of regular expressions (randomly initialized or manually constructed by experts), our method evolves a population of regular expressions, using a novel regular expression syntax and a series of carefully chosen reproduction operators. Our method is evaluated with real-life medical text inquiries from an online healthcare provider and shows promising performance. More importantly, our method generates classifiers that can be fully understood, checked and updated by medical doctors, which is fundamentally crucial for medical-related practices.
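The general idea in the abstract (evolve a population of regular expressions against labelled text) can be illustrated with a minimal sketch. It is not the authors' method: the paper's regular expression syntax and reproduction operators are not given in this record, so the keyword-alternation encoding, the F1 fitness, the `mutate` and `crossover` operators, and the toy `DATA` and `VOCAB` below are all assumptions made for illustration.

```python
import random
import re

# Toy labelled inquiries (invented for illustration): True = "cough-related".
DATA = [
    ("dry cough for three days", True),
    ("coughing at night with phlegm", True),
    ("sore throat and phlegm", True),
    ("knee pain after running", False),
    ("skin rash on the arm", False),
    ("headache and dizziness", False),
]
# Hypothetical keyword vocabulary that individuals are built from.
VOCAB = ["cough", "phlegm", "throat", "pain", "rash", "headache", "night"]


def to_regex(keywords):
    """Render an individual (a non-empty set of keywords) as an alternation."""
    return "(?:" + "|".join(re.escape(k) for k in sorted(keywords)) + ")"


def fitness(keywords):
    """F1 score of the rendered regex on DATA (an assumed fitness measure)."""
    pattern = re.compile(to_regex(keywords))
    tp = fp = fn = 0
    for text, label in DATA:
        hit = pattern.search(text) is not None
        tp += hit and label
        fp += hit and not label
        fn += (not hit) and label
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


def mutate(keywords, rng):
    """Add or drop one keyword, keeping the set non-empty."""
    child = set(keywords)
    if len(child) > 1 and rng.random() < 0.5:
        child.remove(rng.choice(sorted(child)))
    else:
        child.add(rng.choice(VOCAB))
    return frozenset(child)


def crossover(a, b, rng):
    """Keep each keyword from either parent with probability 0.5."""
    pool = sorted(a | b)
    child = {k for k in pool if rng.random() < 0.5}
    return frozenset(child or {rng.choice(pool)})


def evolve(generations=20, size=12, seed=0):
    """Truncation selection with elitism: the top half always survives."""
    rng = random.Random(seed)
    population = [frozenset(rng.sample(VOCAB, 2)) for _ in range(size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        elite = ranked[: size // 2]
        children = [
            mutate(crossover(rng.choice(elite), rng.choice(elite), rng), rng)
            for _ in range(size - len(elite))
        ]
        population = elite + children
    return max(population, key=fitness)


if __name__ == "__main__":
    best = evolve()
    print(to_regex(best), fitness(best))
```

Because each individual renders to a plain alternation of keywords, a domain expert could read and edit the result directly, which is the interpretability property the abstract emphasises. A seed population written by experts could replace the random initialisation in `evolve`.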
Pages: 8
Related papers
50 in total
  • [1] Data-Driven Identification of Crane Dynamics Using Regularized Genetic Programming
    Kusznir, Tom
    Smoczek, Jaroslaw
    Karwat, Boleslaw
    APPLIED SCIENCES-BASEL, 2024, 14 (08):
  • [2] Medical Data Classification Using Genetic Programming: A Systematic Literature Review
    Maurya, Pratibha
    Kushwaha, Arati
    Prakash, Om
    EXPERT SYSTEMS, 2025, 42 (03)
  • [3] Upgrades of Genetic Programming for Data-Driven Modeling of Time Series
    Murari, A.
    Peluso, E.
    Spolladore, L.
    Rossi, R.
    Gelfusa, M.
    EVOLUTIONARY COMPUTATION, 2023, 31 (04) : 401 - 432
  • [4] Data-driven approach to learning salience models of indoor landmarks by using genetic programming
    Hu, Xuke
    Ding, Lei
    Shang, Jianga
    Fan, Hongchao
    Novack, Tessio
    Noskov, Alexey
    Zipf, Alexander
    INTERNATIONAL JOURNAL OF DIGITAL EARTH, 2020, 13 (11) : 1230 - 1257
  • [5] Data-driven Modelling of Dynamical Systems Using Tree Adjoining Grammar and Genetic Programming
    Khandelwal, Dhruv
    Schoukens, Maarten
    Toth, Roland
    2019 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC), 2019, : 2673 - 2680
  • [6] Data-driven Feature Selection Methods for Text Classification: an Empirical Evaluation
    Fragoso, Rogerio C. P.
    Pinheiro, Roberto H. W.
    Cavalcanti, George D. C.
    JOURNAL OF UNIVERSAL COMPUTER SCIENCE, 2019, 25 (04) : 334 - 360
  • [7] Establishing a data-driven strength model for β-tin by performing symbolic regression using genetic programming
    Zapiain, David Montes de Oca
    Lane, J. Matthew D.
    Carroll, Jay D.
    Casias, Zachary
    Battaile, Corbett C.
    Fensin, Saryu
    Lim, Hojun
    COMPUTATIONAL MATERIALS SCIENCE, 2023, 218
  • [8] A novel fitness function in genetic programming for medical data classification
    Kumar, Arvind
    Sinha, Nishant
    Bhardwaj, Arpit
    JOURNAL OF BIOMEDICAL INFORMATICS, 2020, 112
  • [9] Creating deep neural networks for text classification tasks using grammar genetic programming
    Magalhaes, Dimmy
    Lima, Ricardo H. R.
    Pozo, Aurora
    APPLIED SOFT COMPUTING, 2023, 135
  • [10] Automatic Generation of Regular Expressions from Examples with Genetic Programming
    Bartoli, Alberto
    Davanzo, Giorgio
    De Lorenzo, Andrea
    Mauri, Marco
    Medvet, Eric
    Sorio, Enrico
    PROCEEDINGS OF THE FOURTEENTH INTERNATIONAL CONFERENCE ON GENETIC AND EVOLUTIONARY COMPUTATION COMPANION (GECCO'12), 2012, : 1477 - 1478