Medical data mining in sentiment analysis based on optimized swarm search feature selection

Authors
Daohui Zeng
Jidong Peng
Simon Fong
Yining Qiu
Raymond Wong
Affiliations
[1] First Affiliated Hospital of Guangzhou University of TCM, Department of Computer and Information Science
[2] Ganzhou People’s Hospital, School of Computer Science and Engineering
[3] University of Macau
[4] University of New South Wales
Source
Australasian Physical & Engineering Sciences in Medicine | 2018 / Vol. 41
Keywords
Medical text mining; Optimized swarm search-based feature selection; Sentiment prediction; Clustering-by-coefficient-of-variation
DOI
Not available
Abstract
In this paper, we propose a novel technique termed optimized swarm search-based feature selection (OS-FS), a swarm-type search function that selects an ideal subset of features to enhance classification accuracy. Sentiment prediction is becoming an increasingly crucial machine learning technique for gaining insights from unstructured medical texts. Owing to its robustness and accuracy, it has recently gained popularity in the medical industry. Medical text mining is well known as a fundamental data analytic technique for sentiment prediction. A popular preprocessing step in text mining transforms medical text strings into word vectors, which form a high-dimensional sparse matrix. However, such a sparse matrix poses problems for the induction of an accurate sentiment prediction model. Feature selection, which finds a subset of the original features in the sparse matrix, is a commonly used dimensionality reduction technique that can improve the accuracy of the prediction model. The swarm search in our proposed OS-FS can be optimized by a new feature evaluation technique called clustering-by-coefficient-of-variation. We implement this method on a case scenario of 279 medical articles related to ‘meaningful use functionalities on health care quality, safety, and efficiency’, drawn from a systematic review of previous medical IT literature. For this medical text mining task, multiple sentiment classes (positive, mixed-positive, neutral, and negative) are recognized from the document contents. Our experimental results demonstrate the superiority of OS-FS over traditional feature selection methods in the literature.
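The abstract does not spell out how OS-FS or clustering-by-coefficient-of-variation works, so the following is only a minimal sketch of the two ingredients it names: turning text strings into word vectors, and computing a coefficient of variation (standard deviation divided by mean) for each resulting feature. The function names, the whitespace tokenizer, and the toy corpus are all assumptions made for illustration; the swarm search and the clustering step are not attempted here.

```python
from collections import Counter
from statistics import mean, pstdev


def build_word_vectors(documents):
    """Turn raw text strings into a document-term count matrix (bag of words)."""
    # Assumption: a naive lowercase whitespace tokenizer stands in for real preprocessing.
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({token for tokens in tokenized for token in tokens})
    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Most entries are zero, which is what makes the real matrix sparse.
        matrix.append([counts[term] for term in vocabulary])
    return vocabulary, matrix


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (0.0 if the mean is 0)."""
    avg = mean(values)
    return pstdev(values) / avg if avg else 0.0


def cv_per_feature(vocabulary, matrix):
    """Compute the coefficient of variation of every column (term) in the matrix."""
    columns = zip(*matrix)
    return {term: coefficient_of_variation(col) for term, col in zip(vocabulary, columns)}


# Hypothetical toy corpus, not data from the paper.
docs = [
    "ehr improved safety",
    "ehr reduced efficiency",
    "ehr improved quality",
]
vocab, X = build_word_vectors(docs)
cv = cv_per_feature(vocab, X)
```

In this toy corpus, a term that appears once in every document has a coefficient of variation of zero, while a term that appears in only one document has a high one. How such values would be clustered and fed into the swarm search is specific to the paper and is not reproduced here.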
Pages: 1087 - 1100
Page count: 13