Developing a Natural Language Processing tool to identify perinatal self-harm in electronic healthcare records

Cited by: 8
Authors
Ayre, Karyn [1,2]
Bittar, Andre [3]
Kam, Joyce [4]
Verma, Somain [4]
Howard, Louise M. [1,2]
Dutta, Rina [2,3]
Affiliations
[1] Kings Coll London, Inst Psychiat Psychol & Neurosci, Hlth Serv & Populat Res Dept, Sect Womens Mental Hlth, London, England
[2] Bethlem Royal & Maudsley Hosp, South London & Maudsley NHS Fdn Trust, London, England
[3] Kings Coll London, Inst Psychiat Psychol & Neurosci, Acad Dept Psychol Med, London, England
[4] Kings Coll London, GKT Sch Med Educ, London, England
Source
PLOS ONE | 2021 / Vol. 16 / Issue 08
Funding
UK Research and Innovation;
Keywords
PREGNANT-WOMEN; SUICIDE; CLASSIFICATION; PREVALENCE; SERVICES;
DOI
10.1371/journal.pone.0253809
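Chinese Library Classification number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Subject classification code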
07 ; 0710 ; 09 ;
Abstract
Background Self-harm occurring within pregnancy and the postnatal year ("perinatal self-harm") is a clinically important yet under-researched topic. Current research likely under-estimates prevalence due to methodological limitations. Electronic healthcare records (EHRs) provide a source of clinically rich data on perinatal self-harm.
Aims (1) To create a Natural Language Processing (NLP) tool that can, with acceptable precision and recall, identify mentions of acts of perinatal self-harm within EHRs. (2) To use this tool to identify service-users who have self-harmed perinatally, based on their EHRs.
Methods We used the Clinical Record Interactive Search system to extract de-identified EHRs of secondary mental healthcare service-users at South London and Maudsley NHS Foundation Trust. We developed a tool that applied several layers of linguistic processing based on the spaCy NLP library for Python. We evaluated mention-level performance in the following domains: span, status, temporality and polarity. Evaluation was done against a manually coded reference standard. Mention-level performance was reported as precision, recall, F-score and Cohen's kappa for each domain. We also assessed performance at 'service-user' level and explored whether a heuristic rule improved it. We report per-class statistics for service-user performance, as well as likelihood ratios and post-test probabilities.
Results Mention-level performance: micro-averaged F-score, precision and recall for span, polarity and temporality were all >0.8. Kappa for status was 0.68, for temporality 0.62 and for polarity 0.91. Service-user level performance with the heuristic: F-score, precision and recall of the minority class 0.69, macro-averaged F-score 0.81, positive LR 9.4 (4.8-19), post-test probability 69.0% (53-82%). Considering the task difficulty, the tool performs well, although temporality was the attribute with the lowest level of annotator agreement.
Conclusions It is feasible to develop an NLP tool that identifies, with acceptable validity, mentions of perinatal self-harm within EHRs, although with limitations regarding temporality. Using a heuristic rule, it can also function at a service-user-level.
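The evaluation measures named in the abstract (precision, recall, F-score, Cohen's kappa, positive likelihood ratio and post-test probability) can all be derived from a binary confusion matrix. The following is a minimal sketch of those standard formulas, not the authors' evaluation code. The `Confusion` class, the function names and the counts are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Binary confusion-matrix counts (hypothetical, not from the paper)."""
    tp: int  # tool positive, reference positive
    fp: int  # tool positive, reference negative
    fn: int  # tool negative, reference positive
    tn: int  # tool negative, reference negative

    @property
    def n(self) -> int:
        return self.tp + self.fp + self.fn + self.tn


def precision(c: Confusion) -> float:
    return c.tp / (c.tp + c.fp)


def recall(c: Confusion) -> float:
    return c.tp / (c.tp + c.fn)


def f_score(c: Confusion) -> float:
    # Harmonic mean of precision and recall (F1).
    p, r = precision(c), recall(c)
    return 2 * p * r / (p + r)


def cohen_kappa(c: Confusion) -> float:
    # Observed agreement corrected for agreement expected by chance.
    observed = (c.tp + c.tn) / c.n
    expected = ((c.tp + c.fp) * (c.tp + c.fn)
                + (c.fn + c.tn) * (c.fp + c.tn)) / c.n ** 2
    return (observed - expected) / (1 - expected)


def positive_lr(c: Confusion) -> float:
    # Sensitivity divided by the false-positive rate (1 - specificity).
    specificity = c.tn / (c.tn + c.fp)
    return recall(c) / (1 - specificity)


def post_test_probability(pre_test_prob: float, lr: float) -> float:
    # Convert probability to odds, multiply by the LR, convert back.
    post_odds = pre_test_prob / (1 - pre_test_prob) * lr
    return post_odds / (1 + post_odds)


# Hypothetical counts for illustration only.
counts = Confusion(tp=20, fp=10, fn=10, tn=160)
prevalence = (counts.tp + counts.fn) / counts.n

print(f"precision={precision(counts):.3f} recall={recall(counts):.3f} "
      f"F={f_score(counts):.3f} kappa={cohen_kappa(counts):.3f}")
print(f"LR+={positive_lr(counts):.2f} "
      f"post-test={post_test_probability(prevalence, positive_lr(counts)):.3f}")
```

When the pre-test probability is set to the prevalence in the evaluated sample, as in this sketch, the post-test probability after a positive result equals the precision of the positive class.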
Pages: 13
Related papers
44 in total
[1]   Monitoring Suicidal Patients in Primary Care Using Electronic Health Records [J].
Anderson, Heather D. ;
Pace, Wilson D. ;
Brandt, Elias ;
Nielsen, Rodney D. ;
Allen, Richard R. ;
Libby, Anne M. ;
West, David R. ;
Valuck, Robert J. .
JOURNAL OF THE AMERICAN BOARD OF FAMILY MEDICINE, 2015, 28 (01) :65-71
[2]   [Anonymous], Hospital Episode Statistics
[3]   [Anonymous], 2013, SELF HARM QUAL STAND
[4]   The Prevalence and Correlates of Self Harm in the Perinatal Period: A Systematic Review [J].
Ayre, Karyn ;
Gordon, Hannah G. ;
Dutta, Rina ;
Hodsoll, John ;
Howard, Louise M. .
JOURNAL OF CLINICAL PSYCHIATRY, 2020, 81 (01)
[5]   Bethard Steven, 2017, P 11 INT WORKSH SEM, P565, DOI 10.18653/v1/S17-2093
[6]   Text Classification to Inform Suicide Risk Assessment in Electronic Health Records [J].
Bittar, Andre ;
Velupillai, Sumithra ;
Roberts, Angus ;
Dutta, Rina .
MEDINFO 2019: HEALTH AND WELLBEING E-NETWORKS FOR ALL, 2019, 264 :40-44
[7]   Self-Harm Among Adult Victims of Human Trafficking Who Accessed Secondary Mental Health Services in England [J].
Borschmann, Rohan ;
Oram, Sian ;
Kinner, Stuart A. ;
Dutta, Rina ;
Zimmerman, Cathy ;
Howard, Louise M. .
PSYCHIATRIC SERVICES, 2017, 68 (02) :207-210
[8]   Identification of suicidal behavior among psychiatrically hospitalized adolescents using natural language processing and machine learning of electronic health records [J].
Carson, Nicholas J. ;
Mullin, Brian ;
Sanchez, Maria Jose ;
Lu, Frederick ;
Yang, Kelly ;
Menezes, Michelle ;
Le Cook, Benjamin .
PLOS ONE, 2019, 14 (02)
[9]   A COEFFICIENT OF AGREEMENT FOR NOMINAL SCALES [J].
COHEN, J .
EDUCATIONAL AND PSYCHOLOGICAL MEASUREMENT, 1960, 20 (01) :37-46
[10]   Why Cohen's Kappa should be avoided as performance measure in classification [J].
Delgado, Rosario ;
Tibau, Xavier-Andoni .
PLOS ONE, 2019, 14 (09)