Patterns of Metastatic Disease in Patients with Cancer Derived from Natural Language Processing of Structured CT Radiology Reports over a 10-year Period

Cited by: 26
Authors
Do, Richard K. G. [1]
Lupton, Kaelan [5]
Andrieu, Pamela I. Causa [1]
Luthra, Anisha [2]
Taya, Michio [1]
Batch, Karen [5]
Nguyen, Huy [3]
Rahurkar, Prachi [3]
Gazit, Lior [3]
Nicholas, Kevin [3]
Fong, Christopher J. [4]
Gangai, Natalie [1]
Schultz, Nikolaus [4]
Zulkernine, Farhana [5]
Sevilimedu, Varadan [4]
Juluru, Krishna [1]
Simpson, Amber [5]
Hricak, Hedvig [1]
Affiliations
[1] Mem Sloan Kettering Canc Ctr, Dept Radiol, 1275 York Ave, New York, NY 10065 USA
[2] Mem Sloan Kettering Canc Ctr, Human Pathol & Pathogenesis Program, Ctr Mol Oncol, 1275 York Ave, New York, NY 10065 USA
[3] Mem Sloan Kettering Canc Ctr, Dept Strategy & Innovat, 1275 York Ave, New York, NY 10065 USA
[4] Mem Sloan Kettering Canc Ctr, Biostat Serv, Dept Epidemiol & Biostat, 1275 York Ave, New York, NY 10065 USA
[5] Queens Univ, Sch Comp, Kingston, ON, Canada
Funding
National Institutes of Health (USA)
DOI
10.1148/radiol.2021210043
Chinese Library Classification number
R8 [Special Medicine]; R445 [Diagnostic Imaging]
Discipline classification code
1002; 100207; 1009
Abstract
Background: Patterns of metastasis in cancer are increasingly relevant to prognostication and treatment planning but have historically been documented by means of autopsy series.
Purpose: To show the feasibility of using natural language processing (NLP) to gather accurate data from radiology reports for assessing spatial and temporal patterns of metastatic spread in a large patient cohort.
Materials and Methods: In this retrospective longitudinal study, consecutive patients who underwent CT from July 2009 to April 2019 and whose CT reports followed a departmental structured template were included. Three radiologists manually curated a sample of 2219 reports for the presence or absence of metastases across 13 organs; these manually curated reports were used to develop three NLP models with an 80%-20% split for training and test sets. A separate random sample of 448 manually curated reports was used for validation. Model performance was measured by accuracy, precision, and recall for each organ. The best-performing NLP model was used to generate a final database of metastatic disease across all patients. For each cancer type, descriptive statistics were reported by analyzing the frequencies of metastatic disease at the report and patient levels.
Results: In 91 665 patients (mean age ± standard deviation, 61 years ± 15; 46 939 women), 387 359 reports were labeled. The best-performing NLP model achieved accuracies from 90% to 99% across all organs. Metastases were most frequently reported in abdominopelvic (23.6% of all reports) and thoracic (17.6%) nodes, followed by lungs (14.7%), liver (13.7%), and bones (9.9%). Metastatic disease tropism is distinct among common cancers, with the most common first site being bones in prostate and breast cancers and liver among pancreatic and colorectal cancers.
Conclusion: Natural language processing may be applied to cancer patients' CT reports to generate a large database of metastatic phenotypes. Such a database could be combined with genomic studies and used to explore prognostic imaging phenotypes with relevance to treatment planning. © RSNA, 2021
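The abstract states that model performance was measured by accuracy, precision, and recall for each organ. The following is a minimal sketch of how such per-organ metrics might be computed from binary report labels. It is not the authors' implementation: the function name `per_organ_metrics`, the three organ names, and the toy labels are all assumptions made for illustration (the study itself covers 13 organs).

```python
from typing import Dict, List

# Hypothetical subset of organs, for illustration only; the study labels 13 organs.
ORGANS = ["liver", "lungs", "bones"]


def per_organ_metrics(
    y_true: List[Dict[str, int]],
    y_pred: List[Dict[str, int]],
) -> Dict[str, Dict[str, float]]:
    """Compute accuracy, precision, and recall per organ.

    Each list element is one report, mapping organ name to 1
    (metastasis present) or 0 (absent).
    """
    results: Dict[str, Dict[str, float]] = {}
    for organ in ORGANS:
        tp = fp = tn = fn = 0
        for truth, pred in zip(y_true, y_pred):
            t, p = truth[organ], pred[organ]
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 0 and p == 0:
                tn += 1
            else:
                fn += 1
        total = tp + fp + tn + fn
        results[organ] = {
            "accuracy": (tp + tn) / total if total else 0.0,
            # Precision and recall are reported as 0.0 when undefined.
            "precision": tp / (tp + fp) if (tp + fp) else 0.0,
            "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        }
    return results


# Toy manually curated labels (invented for this example).
y_true = [
    {"liver": 1, "lungs": 0, "bones": 0},
    {"liver": 1, "lungs": 1, "bones": 0},
    {"liver": 0, "lungs": 1, "bones": 0},
    {"liver": 0, "lungs": 0, "bones": 1},
]
# Toy model predictions for the same four reports (invented for this example).
y_pred = [
    {"liver": 1, "lungs": 0, "bones": 0},
    {"liver": 0, "lungs": 1, "bones": 0},
    {"liver": 0, "lungs": 1, "bones": 1},
    {"liver": 0, "lungs": 0, "bones": 1},
]

metrics = per_organ_metrics(y_true, y_pred)
for organ, values in metrics.items():
    print(organ, values)
```

Reporting each organ separately, rather than a single pooled score, keeps a rarely involved organ from being masked by the more common sites.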
Pages: 115-122
Number of pages: 8