Phenotype Discovery and Geographic Disparities of Late-Stage Breast Cancer Diagnosis across US Counties: A Machine Learning Approach

Cited by: 20
Authors
Dong, Weichuan [1, 2, 3, 4]
Bensken, Wyatt P. [1, 3]
Kim, Uriel [1, 2, 3]
Rose, Johnie [1, 2, 3]
Berger, Nathan A. [1, 5]
Koroukian, Siran M. [1, 2, 3]
Affiliations
[1] Case Western Reserve Univ, Case Comprehens Canc Ctr, Sch Med, Cleveland, OH 44106 USA
[2] Case Western Reserve Univ, Ctr Community Hlth Integrat, Sch Med, Cleveland, OH 44106 USA
[3] Case Western Reserve Univ, Dept Populat & Quantitat Hlth Sci, Cleveland, OH 44106 USA
[4] Kent State Univ, Dept Geog, Kent, OH 44242 USA
[5] Case Western Reserve Univ, Ctr Sci Hlth & Soc, Sch Med, Cleveland, OH 44106 USA
Keywords
RURAL-URBAN DISPARITIES; AREA DEPRIVATION; RISK; WOMEN; ASSOCIATION; RESOURCES; ACCESS; RACE;
DOI
10.1158/1055-9965.EPI-21-0838
Chinese Library Classification (CLC) number
R73 [Oncology];
Discipline classification code
100214;
Abstract
Background: Disparities in the stage at diagnosis for breast cancer have been independently associated with various contextual characteristics. Understanding which combinations of these characteristics indicate highest risk, and where they are located, is critical to targeting interventions and improving outcomes for patients with breast cancer.
Methods: The study included women diagnosed with invasive breast cancer between 2009 and 2018 from 680 U.S. counties participating in the Surveillance, Epidemiology, and End Results program. We used a machine learning approach called Classification and Regression Tree (CART) to identify county "phenotypes," combinations of characteristics that predict the percentage of patients with breast cancer presenting with late-stage disease. We then mapped the phenotypes and compared their geographic distributions. These findings were further validated using an alternate machine learning approach called random forest.
Results: We discovered seven phenotypes of late-stage breast cancer. Common to most phenotypes associated with high risk of late-stage diagnosis were high uninsured rate, low mammography use, high area deprivation, rurality, and high poverty. Geographically, these phenotypes were most prevalent in southern and western states, while phenotypes associated with lower percentages of late-stage diagnosis were most prevalent in the northeastern states and select metropolitan areas.
Conclusions: The use of machine learning methods of CART and random forest together with geographic methods offers a promising avenue for future disparities research.
Impact: Local interventions to reduce late-stage breast cancer diagnosis, such as community education and outreach programs, can use machine learning and geographic modeling approaches to tailor strategies for early detection and resource allocation.
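The idea of CART "phenotypes" described in the Methods can be illustrated with a minimal sketch: a regression tree repeatedly splits counties on the characteristic and threshold that most reduce the squared error of the late-stage percentage, and each terminal leaf (a combination of conditions) plays the role of a phenotype. This is not the authors' implementation; the abstract does not give their predictors, software, or tuning. The county values, the feature names `uninsured` and `mammography`, and the `max_depth` and `min_leaf` settings below are all invented for illustration.

```python
from statistics import mean

def sse(values):
    """Sum of squared errors around the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def best_split(rows, target, features, min_leaf):
    """Return (feature, threshold, total_sse) of the best binary split, or None."""
    best = None
    for f in features:
        vals = sorted({r[f] for r in rows})
        for lo, hi in zip(vals, vals[1:]):
            t = (lo + hi) / 2  # candidate threshold midway between observed values
            left = [r[target] for r in rows if r[f] <= t]
            right = [r[target] for r in rows if r[f] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            score = sse(left) + sse(right)
            if best is None or score < best[2]:
                best = (f, t, score)
    return best

def grow(rows, target, features, depth=0, max_depth=2, min_leaf=2, path=()):
    """Recursively split; return leaves as (conditions, mean_target, n_counties)."""
    split = best_split(rows, target, features, min_leaf) if depth < max_depth else None
    if split is None:
        return [(path, mean(r[target] for r in rows), len(rows))]
    f, t, _ = split
    left = [r for r in rows if r[f] <= t]
    right = [r for r in rows if r[f] > t]
    return (grow(left, target, features, depth + 1, max_depth, min_leaf,
                 path + (f"{f} <= {t:g}",))
            + grow(right, target, features, depth + 1, max_depth, min_leaf,
                   path + (f"{f} > {t:g}",)))

# Invented toy data: percentages per hypothetical county (not from the study).
counties = [
    {"uninsured": 5, "mammography": 80, "late_stage": 28},
    {"uninsured": 6, "mammography": 78, "late_stage": 29},
    {"uninsured": 7, "mammography": 60, "late_stage": 33},
    {"uninsured": 8, "mammography": 62, "late_stage": 34},
    {"uninsured": 18, "mammography": 75, "late_stage": 36},
    {"uninsured": 19, "mammography": 77, "late_stage": 37},
    {"uninsured": 20, "mammography": 55, "late_stage": 42},
    {"uninsured": 22, "mammography": 58, "late_stage": 43},
]

leaves = grow(counties, "late_stage", ["uninsured", "mammography"])
for conditions, avg, n in leaves:
    print(" AND ".join(conditions), "-> mean late-stage %:", avg, "counties:", n)
```

Each printed line is one leaf: a conjunction of conditions together with the mean late-stage percentage of the counties that satisfy it. A real analysis would more likely use an established library with pruning and cross-validation, and would check the selected predictors against an ensemble such as a random forest, as the abstract describes.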
Pages: 66-76
Page count: 11