A comparison of rule-based and machine learning approaches for classifying patient portal messages

Cited by: 42
Authors
Cronin, Robert M. [1, 2, 3]
Fabbri, Daniel [1, 4]
Denny, Joshua C. [1, 2]
Rosenbloom, S. Trent [1, 2, 3]
Jackson, Gretchen Purcell [1, 3, 5]
Affiliations
[1] Vanderbilt Univ, Med Ctr, Dept Biomed Informat, 2525 West End Blvd,Suite 1475, Nashville, TN 37232 USA
[2] Vanderbilt Univ, Med Ctr, Dept Med, Nashville, TN 37232 USA
[3] Vanderbilt Univ, Med Ctr, Dept Pediat, Nashville, TN 37232 USA
[4] Vanderbilt Univ, Dept Comp Sci, Nashville, TN 37232 USA
[5] Vanderbilt Univ, Med Ctr, Dept Pediat Surg, Nashville, TN 37232 USA
Keywords
Patient portal; Text classification; Natural language processing; Machine learning; Secure messages; Primary care; Health care; Interventions; Communication; Information; Computer
DOI
10.1016/j.ijmedinf.2017.06.004
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Objective: Secure messaging through patient portals is an increasingly popular way for consumers to interact with healthcare providers. The increasing burden of secure messaging can affect clinic staffing and workflows. Manual management of portal messages is costly and time consuming. Automated classification of portal messages could potentially expedite message triage and delivery of care.
Materials and methods: We developed automated patient portal message classifiers with rule-based and machine learning techniques using bag of words and natural language processing (NLP) approaches. To evaluate classifier performance, we used a gold standard of 3253 portal messages manually categorized using a taxonomy of communication types (i.e., main categories of informational, medical, logistical, social, and other communications, and subcategories including prescriptions, appointments, problems, tests, follow-up, contact information, and acknowledgement). We evaluated our classifiers' accuracies in identifying individual communication types within portal messages with the area under the receiver operating characteristic curve (AUC). Portal messages often contain more than one type of communication. To predict all communication types within single messages, we used the Jaccard Index. We extracted the variables of importance for the random forest classifiers.
Results: The best performing approaches to classification for the major communication types were: logistic regression for medical communications (AUC: 0.899); basic (rule-based) for informational communications (AUC: 0.842); and random forests for social communications and logistical communications (AUCs: 0.875 and 0.925, respectively). Among classifiers for individual communication subtypes, the best performing was random forests for Logistical-Contact Information (AUC: 0.963). The Jaccard Indices by approach were: basic classifier, 0.674; Naive Bayes, 0.799; random forests, 0.859; and logistic regression, 0.861. For medical communications, the most predictive variables were NLP concepts (e.g., Temporal Concept, which maps to 'morning', 'evening' and Idea or Concept which maps to 'appointment' and 'refill'). For logistical communications, the most predictive variables contained similar numbers of NLP variables and words (e.g., Telephone mapping to 'phone', 'insurance'). For social and informational communications, the most predictive variables were words (e.g., social: 'thanks', 'much', informational: 'question', 'mean').
Conclusions: This study applies automated classification methods to the content of patient portal messages and evaluates the application of NLP techniques on consumer communications in patient portal messages. We demonstrated that random forest and logistic regression approaches accurately classified the content of portal messages, although the best approach to classification varied by communication type. Words were the most predictive variables for classification of most communication types, although NLP variables were most predictive for medical communication types. As adoption of patient portals increases, automated techniques could assist in understanding and managing growing volumes of messages. Further work is needed to improve classification performance to potentially support message triage and answering.
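The abstract scores multi-label predictions with the Jaccard Index but does not spell out how it was computed. The following is a minimal sketch, assuming the index is taken per message as the size of the intersection of the gold and predicted label sets divided by the size of their union, and then averaged over messages. The function names, the example labels, and the convention for two empty sets are illustrative assumptions, not details from the study.

```python
from typing import Sequence, Set


def jaccard_index(true_labels: Set[str], predicted_labels: Set[str]) -> float:
    """Return |A intersect B| / |A union B| for one message.

    Assumption: two empty label sets count as a perfect match (1.0).
    """
    union = true_labels | predicted_labels
    if not union:
        return 1.0
    return len(true_labels & predicted_labels) / len(union)


def mean_jaccard(gold: Sequence[Set[str]], predicted: Sequence[Set[str]]) -> float:
    """Average the per-message Jaccard Index over a collection of messages."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("at least one message is required")
    total = sum(jaccard_index(g, p) for g, p in zip(gold, predicted))
    return total / len(gold)


# Hypothetical gold-standard and predicted communication types for three messages.
gold = [{"medical", "logistical"}, {"social"}, {"informational", "medical"}]
pred = [{"medical"}, {"social"}, {"informational", "logistical"}]

score = mean_jaccard(gold, pred)
print(f"Mean Jaccard Index: {score:.3f}")
```

A message with a partially correct label set earns partial credit under this measure, which is why it suits messages that carry more than one communication type.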
Pages: 110-120
Number of pages: 11