Automated classification of lay health articles using natural language processing: a case study on pregnancy health and postpartum depression

Cited by: 2
Authors
Patra, Braja Gopal [1]
Sun, Zhaoyi [1]
Cheng, Zilin [1]
Kumar, Praneet Kasi Reddy Jagadeesh [1]
Altammami, Abdullah [1]
Liu, Yiyang [1]
Joly, Rochelle [2]
Jedlicka, Caroline [3, 4]
Delgado, Diana [4]
Pathak, Jyotishman [1]
Peng, Yifan [1]
Zhang, Yiye [1]
Affiliations
[1] Weill Cornell Med, Dept Populat Hlth Sci, New York, NY 10065 USA
[2] Weill Cornell Med, Dept Obstet & Gynecol, New York, NY USA
[3] CUNY, Kingsborough Community Coll, New York, NY USA
[4] Weill Cornell Med, Samuel J Wood Lib & CV Starr Biomed Informat Ctr, New York, NY USA
Source
FRONTIERS IN PSYCHIATRY | 2023 / Volume 14
Keywords
online health information; health communication; natural language processing; pregnancy; postpartum depression; INTERNET; STRESS;
DOI
10.3389/fpsyt.2023.1258887
Chinese Library Classification number
R749 [Psychiatry];
Discipline classification code
100205 ;
Abstract
Objective: Evidence suggests that high-quality health education and effective communication within the framework of social support hold significant potential in preventing postpartum depression. Yet, developing trustworthy and engaging health education and communication materials requires extensive expertise and substantial resources. In light of this, we propose an innovative approach that involves leveraging natural language processing (NLP) to classify publicly accessible lay articles based on their relevance and subject matter to pregnancy and mental health.

Materials and methods: We manually reviewed online lay articles from credible and medically validated sources to create a gold standard corpus. This manual review process categorized the articles based on their pertinence to pregnancy and related subtopics. To streamline and expand the classification procedure for relevance and topics, we employed advanced NLP models such as Random Forest, Bidirectional Encoder Representations from Transformers (BERT), and Generative Pre-trained Transformer model (gpt-3.5-turbo).

Results: The gold standard corpus included 392 pregnancy-related articles. Our manual review process categorized the reading materials according to lifestyle factors associated with postpartum depression: diet, exercise, mental health, and health literacy. A BERT-based model performed best (F1 = 0.974) in an end-to-end classification of relevance and topics. In a two-step approach, given articles already classified as pregnancy-related, gpt-3.5-turbo performed best (F1 = 0.972) in classifying the above topics.

Discussion: Utilizing NLP, we can guide patients to high-quality lay reading materials as cost-effective, readily available health education and communication sources. This approach allows us to scale the information delivery specifically to individuals, enhancing the relevance and impact of the materials provided.
Pages: 7