Automatic classification of mammography reports by BI-RADS breast tissue composition class

Cited by: 44
Authors
Percha, Bethany
Nassif, Houssam [2, 3]
Lipson, Jafi [4]
Burnside, Elizabeth [2, 5]
Rubin, Daniel [1, 4]
Affiliations
[1] Stanford Univ, Richard M Lucas Ctr P285, Biomed Informat Program, Stanford, CA 94305 USA
[2] Univ Wisconsin, Dept Biostat & Med Informat, Madison, WI USA
[3] Univ Wisconsin, Dept Comp Sci, Madison, WI 53706 USA
[4] Stanford Univ, Dept Radiol, Stanford, CA 94305 USA
[5] Univ Wisconsin, Dept Radiol, Madison, WI 53706 USA
Funding
National Institutes of Health (USA);
Keywords
REPLACEMENT THERAPY USE; RISK-FACTOR; FAMILY-HISTORY; DENSITY; CANCER; SYSTEM; WOMEN;
DOI
10.1136/amiajnl-2011-000607
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Because breast tissue composition partially predicts breast cancer risk, classification of mammography reports by breast tissue composition is important from both a scientific and clinical perspective. A method is presented for using the unstructured text of mammography reports to classify them into BI-RADS breast tissue composition categories. An algorithm that uses regular expressions to automatically determine BI-RADS breast tissue composition classes for unstructured mammography reports was developed. The algorithm assigns each report to a single BI-RADS composition class: 'fatty', 'fibroglandular', 'heterogeneously dense', 'dense', or 'unspecified'. We evaluated its performance on mammography reports from two different institutions. The method achieves >99% classification accuracy on a test set of reports from the Marshfield Clinic (Wisconsin) and Stanford University. Since large-scale studies of breast cancer rely heavily on breast tissue composition information, this method could facilitate this research by helping mine large datasets to correlate breast composition with other covariates.
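The abstract does not list the regular expressions themselves, so the following is only a minimal sketch of how a rule-based classifier of this kind might look. The patterns, their ordering, and the function name `classify_composition` are illustrative assumptions, not the authors' published rules; only the five class labels are taken from the abstract.

```python
import re

# Hypothetical patterns (not from the paper). They are checked in order, with
# the more specific phrases first so that "heterogeneously dense" is not
# captured by the plain "dense" rule.
PATTERNS = [
    ("heterogeneously dense", re.compile(r"\bheterogeneously\s+dense\b", re.I)),
    ("dense", re.compile(r"\b(?:extremely|very)\s+dense\b", re.I)),
    ("fibroglandular", re.compile(r"\bscattered\s+fibroglandular\b", re.I)),
    ("fatty", re.compile(
        r"\b(?:almost\s+entirely|predominantly|entirely)\s+fat(?:ty)?\b", re.I)),
]


def classify_composition(report_text: str) -> str:
    """Return one composition label for a report, or 'unspecified'."""
    for label, pattern in PATTERNS:
        if pattern.search(report_text):
            return label
    # No composition statement was recognized in the text.
    return "unspecified"


if __name__ == "__main__":
    print(classify_composition("The breasts are heterogeneously dense."))
    print(classify_composition("No suspicious mass is seen."))
```

Returning the first match keeps the output to a single class per report, as the abstract describes; a production rule set would need many more phrasing variants and validation against labeled reports.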
Pages: 913-916
Page count: 4