Automated classification of lay health articles using natural language processing: a case study on pregnancy health and postpartum depression

Cited by: 2

Authors:

Patra, Braja Gopal [1]

Sun, Zhaoyi [1]

Cheng, Zilin [1]

Kumar, Praneet Kasi Reddy Jagadeesh [1]

Altammami, Abdullah [1]

Liu, Yiyang [1]

Joly, Rochelle [2]

Jedlicka, Caroline [3,4]

Delgado, Diana [4]

Pathak, Jyotishman [1]

Peng, Yifan [1]

Zhang, Yiye [1]

Affiliations:

[1] Weill Cornell Med, Dept Populat Hlth Sci, New York, NY 10065 USA

[2] Weill Cornell Med, Dept Obstet & Gynecol, New York, NY USA

[3] CUNY, Kingsborough Community Coll, New York, NY USA

[4] Weill Cornell Med, Samuel J Wood Lib & CV Starr Biomed Informat Ctr, New York, NY USA

Source:

FRONTIERS IN PSYCHIATRY | 2023 / Volume 14

Keywords:

online health information; health communication; natural language processing; pregnancy; postpartum depression; INTERNET; STRESS;

DOI:

10.3389/fpsyt.2023.1258887

Chinese Library Classification (CLC) number:

R749 [Psychiatry];

Discipline classification code:

100205;

Abstract:

Objective: Evidence suggests that high-quality health education and effective communication within the framework of social support hold significant potential in preventing postpartum depression. Yet, developing trustworthy and engaging health education and communication materials requires extensive expertise and substantial resources. In light of this, we propose an innovative approach that involves leveraging natural language processing (NLP) to classify publicly accessible lay articles based on their relevance and subject matter to pregnancy and mental health.

Materials and methods: We manually reviewed online lay articles from credible and medically validated sources to create a gold standard corpus. This manual review process categorized the articles based on their pertinence to pregnancy and related subtopics. To streamline and expand the classification procedure for relevance and topics, we employed advanced NLP models such as Random Forest, Bidirectional Encoder Representations from Transformers (BERT), and Generative Pre-trained Transformer model (gpt-3.5-turbo).

Results: The gold standard corpus included 392 pregnancy-related articles. Our manual review process categorized the reading materials according to lifestyle factors associated with postpartum depression: diet, exercise, mental health, and health literacy. A BERT-based model performed best (F1 = 0.974) in an end-to-end classification of relevance and topics. In a two-step approach, given articles already classified as pregnancy-related, gpt-3.5-turbo performed best (F1 = 0.972) in classifying the above topics.

Discussion: Utilizing NLP, we can guide patients to high-quality lay reading materials as cost-effective, readily available health education and communication sources. This approach allows us to scale the information delivery specifically to individuals, enhancing the relevance and impact of the materials provided.
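The results above are reported as F1 scores. As a reading aid, the following is a minimal sketch of how a per-topic F1 and its unweighted (macro) average can be computed for the four topic labels. The abstract does not state which averaging scheme the authors used, so macro averaging is an assumption here, and the function names and toy labels are hypothetical illustrations, not data from the study.

```python
from collections import Counter


def f1_per_label(gold, pred):
    """Return {label: F1} from parallel lists of gold and predicted labels."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1          # correct prediction for label g
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[g] += 1          # g was the true label but was missed
    scores = {}
    for label in set(gold) | set(pred):
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the label never occurs
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(gold, pred):
    """Unweighted mean of the per-label F1 scores (assumed averaging scheme)."""
    scores = f1_per_label(gold, pred)
    return sum(scores.values()) / len(scores)


# Hypothetical toy labels using the four topics named in the abstract.
gold = ["diet", "exercise", "mental health", "diet", "health literacy", "exercise"]
pred = ["diet", "exercise", "mental health", "exercise", "health literacy", "exercise"]

if __name__ == "__main__":
    for label, score in sorted(f1_per_label(gold, pred).items()):
        print(f"{label}: {score:.3f}")
    print(f"macro F1: {macro_f1(gold, pred):.3f}")
```

In this toy case one "diet" article is mislabeled as "exercise", which lowers the F1 of both topics while the other two remain perfect.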

Pages: 7
