A comparison of rule-based and machine learning approaches for classifying patient portal messages

Cited by: 42
Authors
Cronin, Robert M. [1, 2, 3]
Fabbri, Daniel [1, 4]
Denny, Joshua C. [1, 2]
Rosenbloom, S. Trent [1, 2, 3]
Jackson, Gretchen Purcell [1, 3, 5]
Affiliations
[1] Vanderbilt Univ, Med Ctr, Dept Biomed Informat, 2525 West End Blvd,Suite 1475, Nashville, TN 37232 USA
[2] Vanderbilt Univ, Med Ctr, Dept Med, Nashville, TN 37232 USA
[3] Vanderbilt Univ, Med Ctr, Dept Pediat, Nashville, TN 37232 USA
[4] Vanderbilt Univ, Dept Comp Sci, Nashville, TN 37232 USA
[5] Vanderbilt Univ, Med Ctr, Dept Pediat Surg, Nashville, TN 37232 USA
Keywords
Patient portal; Text classification; Natural language processing; Machine learning; Secure messages; Primary-care; Health-care; Interventions; Communication; Information; Computer
DOI
10.1016/j.ijmedinf.2017.06.004
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Objective: Secure messaging through patient portals is an increasingly popular way that consumers interact with healthcare providers. The increasing burden of secure messaging can affect clinic staffing and workflows. Manual management of portal messages is costly and time consuming. Automated classification of portal messages could potentially expedite message triage and delivery of care.

Materials and methods: We developed automated patient portal message classifiers with rule-based and machine learning techniques using bag of words and natural language processing (NLP) approaches. To evaluate classifier performance, we used a gold standard of 3253 portal messages manually categorized using a taxonomy of communication types (i.e., main categories of informational, medical, logistical, social, and other communications, and subcategories including prescriptions, appointments, problems, tests, follow-up, contact information, and acknowledgement). We evaluated our classifiers' accuracies in identifying individual communication types within portal messages with area under the receiver-operator curve (AUC). Portal messages often contain more than one type of communication. To predict all communication types within single messages, we used the Jaccard Index. We extracted the variables of importance for the random forest classifiers.

Results: The best performing approaches to classification for the major communication types were: logistic regression for medical communications (AUC: 0.899); basic (rule-based) for informational communications (AUC: 0.842); and random forests for social communications and logistical communications (AUCs: 0.875 and 0.925, respectively). The best performing classification approach of classifiers for individual communication subtypes was random forests for Logistical-Contact Information (AUC: 0.963). The Jaccard Indices by approach were: basic classifier, Jaccard Index: 0.674; Naive Bayes, Jaccard Index: 0.799; random forests, Jaccard Index: 0.859; and logistic regression, Jaccard Index: 0.861. For medical communications, the most predictive variables were NLP concepts (e.g., Temporal Concept, which maps to 'morning', 'evening' and Idea or Concept which maps to 'appointment' and 'refill'). For logistical communications, the most predictive variables contained similar numbers of NLP variables and words (e.g., Telephone mapping to 'phone', 'insurance'). For social and informational communications, the most predictive variables were words (e.g., social: 'thanks', 'much', informational: 'question', 'mean').

Conclusions: This study applies automated classification methods to the content of patient portal messages and evaluates the application of NLP techniques on consumer communications in patient portal messages. We demonstrated that random forest and logistic regression approaches accurately classified the content of portal messages, although the best approach to classification varied by communication type. Words were the most predictive variables for classification of most communication types, although NLP variables were most predictive for medical communication types. As adoption of patient portals increases, automated techniques could assist in understanding and managing growing volumes of messages. Further work is needed to improve classification performance to potentially support message triage and answering.
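
The abstract scores multi-label predictions with the Jaccard Index but does not say how it was aggregated across messages. The sketch below shows one common reading: compute the index per message as the size of the intersection of gold and predicted label sets divided by the size of their union, then average over messages. The function names, the handling of two empty label sets, and the example labels are illustrative assumptions and are not taken from the study.

```python
def jaccard_index(true_labels, predicted_labels):
    """Jaccard index between two label sets: |A & B| / |A | B|."""
    true_set, pred_set = set(true_labels), set(predicted_labels)
    union = true_set | pred_set
    if not union:
        # Assumption: two empty label sets are treated as a perfect match.
        return 1.0
    return len(true_set & pred_set) / len(union)


def mean_jaccard(gold, predicted):
    """Average the per-message Jaccard index over a collection of messages."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("at least one message is required")
    scores = [jaccard_index(g, p) for g, p in zip(gold, predicted)]
    return sum(scores) / len(scores)


# Hypothetical gold-standard and predicted communication types for three messages.
gold = [
    {"medical", "logistical"},
    {"social"},
    {"informational", "medical"},
]
predicted = [
    {"medical"},
    {"social"},
    {"informational", "logistical"},
]

score = mean_jaccard(gold, predicted)
print(f"Mean Jaccard index: {score:.3f}")
```

Under this reading, a message predicted with one of its two true types scores 0.5, so the index rewards partial agreement rather than requiring an exact match of the full label set.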
Pages: 110-120
Page count: 11