Unsupervised machine learning for identifying important visual features through bag-of-words using histopathology data from chronic kidney disease

Cited by: 21
Authors
Lee, Joonsang [1]
Warner, Elisa [1]
Shaikhouni, Salma [3]
Bitzer, Markus [3]
Kretzler, Matthias [3]
Gipson, Debbie [4]
Pennathur, Subramaniam [3]
Bellovich, Keith [5]
Bhat, Zeenat [6]
Gadegbeku, Crystal [7]
Massengill, Susan [8]
Perumal, Kalyani [9]
Saha, Jharna [2]
Yang, Yingbao [2]
Luo, Jinghui [2]
Zhang, Xin [1]
Mariani, Laura [3]
Hodgin, Jeffrey B. [2]
Rao, Arvind [1, 10, 11, 12]
Affiliations
[1] Univ Michigan, Dept Computat Med & Bioinformat, Ann Arbor, MI 48109 USA
[2] Univ Michigan, Dept Pathol, Ann Arbor, MI 48109 USA
[3] Univ Michigan, Dept Internal Med Nephrol, Ann Arbor, MI 48109 USA
[4] Univ Michigan, Dept Pediat, Pediat Nephrol, Ann Arbor, MI 48109 USA
[5] St Clair Nephrol Res, Dept Internal Med, Nephrol, Detroit, MI USA
[6] Wayne State Univ, Dept Internal Med Nephrol, Detroit, MI USA
[7] Cleveland Clin, Dept Internal Med Nephrol, Cleveland, OH 44106 USA
[8] Levine Childrens Hosp, Dept Pediat, Pediat Nephrol, Charlotte, NC USA
[9] Dept JH Stroger Hosp, Dept Internal Med, Nephrol, Chicago, IL USA
[10] Univ Michigan, Dept Biostat, Ann Arbor, MI 48109 USA
[11] Univ Michigan, Dept Radiat Oncol, Ann Arbor, MI 48109 USA
[12] Univ Michigan, Dept Biomed Engn, Ann Arbor, MI 48109 USA
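Keywords
DIGITAL PATHOLOGY; IMAGE; SEGMENTATION; PROGRESSION; EQUATION; GFR
DOI
10.1038/s41598-022-08974-8
Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Subject classification code
07; 0710; 09
Abstract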
Pathologists use visual classification to assess patient kidney biopsy samples when diagnosing the underlying cause of kidney disease. However, the assessment is qualitative, or semi-quantitative at best, and reproducibility is challenging. To discover previously unknown features that predict patient outcomes and to overcome substantial interobserver variability, we developed an unsupervised bag-of-words model. We applied the model to the C-PROBE cohort of patients with chronic kidney disease (CKD). We obtained 107,471 histopathology images from 161 biopsy cores and identified important morphological features in biopsy tissue that are highly predictive of the presence of CKD both at the time of biopsy and at one year. To evaluate the performance of our model, we estimated the AUC and its 95% confidence interval. We show that this method is reliable and reproducible and can achieve an AUC of 0.93 in predicting glomerular filtration rate at the time of biopsy as well as in predicting a loss of function at one year. Additionally, with this method, we ranked the identified morphological features according to their importance as diagnostic markers for chronic kidney disease. In this study, we have demonstrated the feasibility of using an unsupervised machine learning method, without human input, to predict the level of kidney function in CKD. The results indicate that the visual dictionary, or visual image pattern, obtained from unsupervised machine learning can predict outcomes using machine-derived values that correspond to both known and unknown clinically relevant features.
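The bag-of-words idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which patch descriptors, clustering algorithm, or dictionary size the authors used, so everything below is an assumption made for illustration: plain k-means on random stand-in descriptors, with the hypothetical helper names `assign_words`, `build_visual_dictionary`, and `encode_histogram`. It shows only the general technique, in which cluster centroids act as "visual words" and each sample is summarized by a normalized histogram of word counts.

```python
import numpy as np


def assign_words(patches, centroids):
    """Return, for each patch descriptor, the index of its nearest centroid."""
    # Squared Euclidean distance from every patch to every centroid.
    d2 = ((patches[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def build_visual_dictionary(patches, n_words, n_iter=20, seed=0):
    """Cluster descriptors with plain k-means (an assumed choice, not the
    paper's stated method); the centroids form the visual dictionary."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(patches), size=n_words, replace=False)
    centroids = patches[start].copy()
    for _ in range(n_iter):
        labels = assign_words(patches, centroids)
        for k in range(n_words):
            members = patches[labels == k]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[k] = members.mean(axis=0)
    return centroids


def encode_histogram(patches, centroids):
    """Summarize one sample as a normalized histogram of visual-word counts."""
    labels = assign_words(patches, centroids)
    counts = np.bincount(labels, minlength=len(centroids)).astype(float)
    return counts / counts.sum()


# Random stand-in data: 200 descriptors of length 8 (not real histopathology).
rng = np.random.default_rng(1)
all_patches = rng.normal(size=(200, 8))
centroids = build_visual_dictionary(all_patches, n_words=4)

# Treat the first 50 descriptors as the patches of one hypothetical biopsy core.
histogram = encode_histogram(all_patches[:50], centroids)
print(histogram)
```

In a pipeline of this kind, the per-sample histograms would then serve as input features for a downstream predictor, and its discrimination could be summarized with an AUC and confidence interval as the abstract describes.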
Pages: 13