Automated classification of lay health articles using natural language processing: a case study on pregnancy health and postpartum depression

Cited by: 2
Authors
Patra, Braja Gopal [1]
Sun, Zhaoyi [1]
Cheng, Zilin [1]
Kumar, Praneet Kasi Reddy Jagadeesh [1]
Altammami, Abdullah [1]
Liu, Yiyang [1]
Joly, Rochelle [2]
Jedlicka, Caroline [3, 4]
Delgado, Diana [4]
Pathak, Jyotishman [1]
Peng, Yifan [1]
Zhang, Yiye [1]
Affiliations
[1] Weill Cornell Med, Dept Populat Hlth Sci, New York, NY 10065 USA
[2] Weill Cornell Med, Dept Obstet & Gynecol, New York, NY USA
[3] CUNY, Kingsborough Community Coll, New York, NY USA
[4] Weill Cornell Med, Samuel J Wood Lib & CV Starr Biomed Informat Ctr, New York, NY USA
Source
FRONTIERS IN PSYCHIATRY | 2023 / Volume 14
Keywords
online health information; health communication; natural language processing; pregnancy; postpartum depression; INTERNET; STRESS;
DOI
10.3389/fpsyt.2023.1258887
Chinese Library Classification number
R749 [Psychiatry];
Discipline classification code
100205;
Abstract
Objective: Evidence suggests that high-quality health education and effective communication within the framework of social support hold significant potential in preventing postpartum depression. Yet developing trustworthy and engaging health education and communication materials requires extensive expertise and substantial resources. In light of this, we propose an approach that uses natural language processing (NLP) to classify publicly accessible lay articles by their relevance to pregnancy and by their subject matter within pregnancy and mental health.

Materials and methods: We manually reviewed online lay articles from credible and medically validated sources to create a gold standard corpus. This manual review categorized the articles by their pertinence to pregnancy and related subtopics. To streamline and scale the classification of relevance and topics, we employed machine learning and NLP models, including Random Forest, Bidirectional Encoder Representations from Transformers (BERT), and a Generative Pre-trained Transformer model (gpt-3.5-turbo).

Results: The gold standard corpus included 392 pregnancy-related articles. Our manual review categorized the reading materials according to lifestyle factors associated with postpartum depression: diet, exercise, mental health, and health literacy. A BERT-based model performed best (F1 = 0.974) in an end-to-end classification of relevance and topics. In a two-step approach, given articles already classified as pregnancy-related, gpt-3.5-turbo performed best (F1 = 0.972) in classifying the above topics.

Discussion: Using NLP, we can guide patients to high-quality lay reading materials as cost-effective, readily available sources of health education and communication. This approach allows us to scale information delivery while tailoring it to individuals, enhancing the relevance and impact of the materials provided.
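The abstract reports model quality as F1 scores over topic labels. As a rough illustration of how such a score can be computed, the following minimal sketch calculates per-class and macro-averaged F1 in plain Python. The toy labels, the "not relevant" class, and the choice of macro averaging are assumptions made for illustration only; the record does not state which averaging scheme the authors used.

```python
from collections import Counter


def f1_per_class(y_true, y_pred):
    """Return {label: F1} for parallel lists of true and predicted labels."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label receives a false positive
            fn[truth] += 1  # true label receives a false negative
    scores = {}
    for label in set(y_true) | set(y_pred):
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the denominator is 0
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Hypothetical toy data: four topics from the abstract plus an assumed
# "not relevant" class for articles unrelated to pregnancy.
y_true = ["diet", "exercise", "mental health", "health literacy", "diet", "not relevant"]
y_pred = ["diet", "exercise", "mental health", "diet", "diet", "not relevant"]

per_class = f1_per_class(y_true, y_pred)
overall = macro_f1(y_true, y_pred)
```

In this toy case one "health literacy" article is mislabeled as "diet", which lowers the F1 of both classes and therefore the macro average.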
Pages: 7
Related papers
50 in total
  • [41] Using Natural Language Processing on Electronic Health Records to Enhance Detection and Prediction of Psychosis Risk
    Irving, Jessica
    Patel, Rashmi
    Oliver, Dominic
    Colling, Craig
    Pritchard, Megan
    Broadbent, Matthew
    Baldwin, Helen
    Stahl, Daniel
    Stewart, Robert
    Fusar-Poli, Paolo
    SCHIZOPHRENIA BULLETIN, 2021, 47 (02) : 405 - 414
  • [42] Using Natural Language Processing and Machine Learning to Identify Opioids in Electronic Health Record Data
    McDermott, Sean P.
    Wasan, Ajay D.
    JOURNAL OF PAIN RESEARCH, 2023, 16 : 2133 - 2140
  • [43] Identification of recurrent atrial fibrillation using natural language processing applied to electronic health records
    Zheng, Chengyi
    Lee, Ming-sum
    Bansal, Nisha
    Go, Alan S.
    Chen, Cheng
    Harrison, Teresa N.
    Fan, Dongjie
    Allen, Amanda
    Garcia, Elisha
    Lidgard, Ben
    Singer, Daniel
    An, Jaejin
    EUROPEAN HEART JOURNAL-QUALITY OF CARE AND CLINICAL OUTCOMES, 2024, 10 (01) : 77 - 88
  • [44] Prediction of severe chest injury using natural language processing from the electronic health record
    Kulshrestha, Sujay
    Dligach, Dmitriy
    Joyce, Cara
    Baker, Marshall S.
    Gonzalez, Richard
    O'Rourke, Ann P.
    Glazer, Joshua M.
    Stey, Anne
    Kruser, Jacqueline M.
    Churpek, Matthew M.
    Afshar, Majid
    INJURY-INTERNATIONAL JOURNAL OF THE CARE OF THE INJURED, 2021, 52 (02) : 205 - 212
  • [45] Automated Customer Complaint Processing for Water Utilities Based on Natural Language Processing-Case Study of a Dutch Water Utility
    Tian, Xin
    Vertommen, Ina
    Tsiami, Lydia
    van Thienen, Peter
    Paraskevopoulos, Sotirios
    WATER, 2022, 14 (04)
  • [46] Automated risk assessment of newly detected atrial fibrillation poststroke from electronic health record data using machine learning and natural language processing
    Sung, Sheng-Feng
    Sung, Kuan-Lin
    Pan, Ru-Chiou
    Lee, Pei-Ju
    Hu, Ya-Han
    FRONTIERS IN CARDIOVASCULAR MEDICINE, 2022, 9
  • [47] Evidence-based clinical engineering: Health information technology adverse events identification and classification with natural language processing
    Luschi, Alessio
    Nesi, Paolo
    Iadanza, Ernesto
    HELIYON, 2023, 9 (11)
  • [48] The ENACT network is acting on housing instability and the unhoused using the open health natural language processing toolkit
    Harris, Daniel R.
    Fu, Sunyang
    Wen, Andrew
    Corbeau, Alexandria
    Henderson, Darren
    Hilsman, Jordan
    Oniani, David
    Wang, Yanshan
    JOURNAL OF CLINICAL AND TRANSLATIONAL SCIENCE, 2024, 8 (01)
  • [49] Reconciling Allergy Information in the Electronic Health Record After a Drug Challenge Using Natural Language Processing
    Lo, Ying-Chih
    Varghese, Sheril
    Blackley, Suzanne
    Seger, Diane L.
    Blumenthal, Kimberly G.
    Goss, Foster R.
    Zhou, Li
    FRONTIERS IN ALLERGY, 2022, 3
  • [50] Natural language processing (NLP) tools in extracting biomedical concepts from research articles: a case study on autism spectrum disorder
    Peng, Jacqueline
    Zhao, Mengge
    Havrilla, James
    Liu, Cong
    Weng, Chunhua
    Guthrie, Whitney
    Schultz, Robert
    Wang, Kai
    Zhou, Yunyun
    BMC MEDICAL INFORMATICS AND DECISION MAKING, 2020, 20 (Suppl 11)