Automated classification of lay health articles using natural language processing: a case study on pregnancy health and postpartum depression

Cited by: 2
Authors
Patra, Braja Gopal [1]
Sun, Zhaoyi [1]
Cheng, Zilin [1]
Kumar, Praneet Kasi Reddy Jagadeesh [1]
Altammami, Abdullah [1]
Liu, Yiyang [1]
Joly, Rochelle [2]
Jedlicka, Caroline [3, 4]
Delgado, Diana [4]
Pathak, Jyotishman [1]
Peng, Yifan [1]
Zhang, Yiye [1]
Affiliations
[1] Weill Cornell Med, Dept Populat Hlth Sci, New York, NY 10065 USA
[2] Weill Cornell Med, Dept Obstet & Gynecol, New York, NY USA
[3] CUNY, Kingsborough Community Coll, New York, NY USA
[4] Weill Cornell Med, Samuel J Wood Lib & CV Starr Biomed Informat Ctr, New York, NY USA
Source
FRONTIERS IN PSYCHIATRY | 2023 / Vol. 14
Keywords
online health information; health communication; natural language processing; pregnancy; postpartum depression; INTERNET; STRESS
DOI
10.3389/fpsyt.2023.1258887
Chinese Library Classification (CLC) number
R749 [Psychiatry]
Discipline classification code
100205
Abstract
Objective: Evidence suggests that high-quality health education and effective communication within the framework of social support hold significant potential in preventing postpartum depression. Yet, developing trustworthy and engaging health education and communication materials requires extensive expertise and substantial resources. In light of this, we propose an innovative approach that involves leveraging natural language processing (NLP) to classify publicly accessible lay articles based on their relevance and subject matter to pregnancy and mental health.
Materials and methods: We manually reviewed online lay articles from credible and medically validated sources to create a gold standard corpus. This manual review process categorized the articles based on their pertinence to pregnancy and related subtopics. To streamline and expand the classification procedure for relevance and topics, we employed advanced NLP models such as Random Forest, Bidirectional Encoder Representations from Transformers (BERT), and Generative Pre-trained Transformer model (gpt-3.5-turbo).
Results: The gold standard corpus included 392 pregnancy-related articles. Our manual review process categorized the reading materials according to lifestyle factors associated with postpartum depression: diet, exercise, mental health, and health literacy. A BERT-based model performed best (F1 = 0.974) in an end-to-end classification of relevance and topics. In a two-step approach, given articles already classified as pregnancy-related, gpt-3.5-turbo performed best (F1 = 0.972) in classifying the above topics.
Discussion: Utilizing NLP, we can guide patients to high-quality lay reading materials as cost-effective, readily available health education and communication sources. This approach allows us to scale the information delivery specifically to individuals, enhancing the relevance and impact of the materials provided.
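As a reading aid for the F1 scores quoted in the abstract, the following is a minimal sketch of how a per-topic F1 and its macro average can be computed for a multi-class topic classifier. It is an illustration only, not the authors' evaluation code: the abstract does not state which averaging scheme was used, so macro averaging is an assumption here, the function names `f1_per_class` and `macro_f1` are hypothetical, and the toy labels are invented (only the four topic names come from the abstract).

```python
from collections import Counter


def f1_per_class(gold, pred):
    """Return {label: F1} for parallel lists of gold and predicted labels."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1      # correct prediction for label g
        else:
            fp[p] += 1      # p was predicted but is wrong
            fn[g] += 1      # the true label g was missed
    scores = {}
    for label in set(gold) | set(pred):
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the label never occurs
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(gold, pred):
    """Unweighted mean of the per-class F1 scores (assumed averaging scheme)."""
    scores = f1_per_class(gold, pred)
    return sum(scores.values()) / len(scores)


# Invented toy data; only the topic names are taken from the abstract.
gold = ["diet", "exercise", "mental health", "health literacy", "diet", "exercise"]
pred = ["diet", "exercise", "mental health", "diet", "diet", "mental health"]

scores = f1_per_class(gold, pred)
for label in sorted(scores):
    print(f"{label}: {scores[label]:.3f}")
print(f"macro F1: {macro_f1(gold, pred):.3f}")
```

A score close to 1, such as the values reported in the abstract, means that both false positives and false negatives are rare for the topics being evaluated.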
Pages: 7
Related papers
50 in total
  • [31] Machine Learning and Natural Language Processing to Improve Classification of Atrial Septal Defects in Electronic Health Records
    Guo, Yuting
    Shi, Haoming
    Book, Wendy M.
    Ivey, Lindsey Carrie
    Rodriguez, Fred H.
    Sameni, Reza
    Raskind-Hood, Cheryl
    Robichaux, Chad
    Downing, Karrie F.
    Sarker, Abeed
    BIRTH DEFECTS RESEARCH, 2025, 117 (03)
  • [32] Natural Language Processing (NLP) in Qualitative Public Health Research: A Proof of Concept Study
    Leeson, William
    Resnick, Adam
    Alexander, Daniel
    Rovers, John
    INTERNATIONAL JOURNAL OF QUALITATIVE METHODS, 2019, 18
  • [33] Distributions of recorded pain in mental health records: a natural language processing based study
    Chaturvedi, Jaya
    Stewart, Robert
    Ashworth, Mark
    Roberts, Angus
    BMJ OPEN, 2024, 14 (04)
  • [34] Improving the accuracy of automated gout flare ascertainment using natural language processing of electronic health records and linked Medicare claims data
    Yoshida, Kazuki
    Cai, Tianrun
    Bessette, Lily G.
    Kim, Erin
    Lee, Su Been
    Zabotka, Luke E.
    Sun, Alec
    Mastrorilli, Julianna M.
    Oduol, Theresa A.
    Liu, Jun
    Solomon, Daniel H.
    Kim, Seoyoung C.
    Desai, Rishi J.
    Liao, Katherine P.
    PHARMACOEPIDEMIOLOGY AND DRUG SAFETY, 2024, 33 (01)
  • [35] Analysis of depression in social media texts through the Patient Health Questionnaire-9 and natural language processing
    Kim, Nam Hyeok
    Kim, Ji Min
    Park, Da Mi
    Ji, Su Ryeon
    Kim, Jong Woo
    DIGITAL HEALTH, 2022, 8
  • [36] Process ontology development using natural language processing: a multiple case study
    Gurbuz, Ozge
    Rabhi, Fethi
    Demirors, Onur
    BUSINESS PROCESS MANAGEMENT JOURNAL, 2019, 25 (06) : 1208 - 1227
  • [37] Understanding Pregnancy and Postpartum Health Using Ecological Momentary Assessment and Mobile Technology: Protocol for the Postpartum Mothers Mobile Study
    Mendez, Dara D.
    Sanders, Sarah A.
    Karimi, Hassan A.
    Gharani, Pedram
    Rathbun, Stephen L.
    Gary-Webb, Tiffany L.
    Wallace, Meredith L.
    Gianakas, John J.
    Burke, Lora E.
    Davis, Esa M.
    JMIR RESEARCH PROTOCOLS, 2019, 8 (06)
  • [38] Natural Language Processing to Improve Prediction of Incident Atrial Fibrillation Using Electronic Health Records
    Ashburner, Jeffrey M.
    Chang, Yuchiao
    Wang, Xin
    Khurshid, Shaan
    Anderson, Christopher D.
    Dahal, Kumar
    Weisenfeld, Dana
    Cai, Tianrun
    Liao, Katherine P.
    Wagholikar, Kavishwar B.
    Murphy, Shawn N.
    Atlas, Steven J.
    Lubitz, Steven A.
    Singer, Daniel E.
    JOURNAL OF THE AMERICAN HEART ASSOCIATION, 2022, 11 (15)
  • [39] Using natural language processing to identify opioid use disorder in electronic health record data
    Singleton, Jade
    Li, Chengxi
    Akpunonu, Peter D.
    Abner, Erin L.
    Kucharska-Newton, Anna M.
    INTERNATIONAL JOURNAL OF MEDICAL INFORMATICS, 2023, 170
  • [40] Validation of Case Finding Algorithms for Hepatocellular Cancer From Administrative Data and Electronic Health Records Using Natural Language Processing
    Sada, Yvonne
    Hou, Jason
    Richardson, Peter
    El-Serag, Hashem
    Davila, Jessica
    MEDICAL CARE, 2016, 54 (02) : E9 - E14