Deep Learning-Based Hepatocellular Carcinoma Histopathology Image Classification: Accuracy Versus Training Dataset Size

Cited by: 24
Authors
Lin, Yu-Shiang [1]
Huang, Pei-Hsin [2, 3]
Chen, Yung-Yaw [1]
Affiliations
[1] Natl Taiwan Univ, Dept Elect Engn, Taipei 10617, Taiwan
[2] Natl Taiwan Univ, Coll Med, Grad Inst Pathol, Taipei 10617, Taiwan
[3] Natl Taiwan Univ Hosp, Dept Pathol, Taipei 10617, Taiwan
Keywords
Training; Cancer; Histopathology; Liver; Deep learning; Image classification; Testing; Convolutional neural network; hepatocellular carcinoma; histopathology image classification; inverse power law function-based fitting curve regression
DOI
10.1109/ACCESS.2021.3060765
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Globally, liver cancer causes more than 700,000 deaths each year and is the second-leading cause of death from cancer. Hepatocellular carcinoma (HCC) is the most common type of liver cancer in adults and accounts for most deaths in cirrhosis patients. Patients with early-stage liver cancer can be treated by surgical intervention with a good prognosis; thus, early diagnosis, as confirmed by liver pathology examination, is necessary to combat HCC. Conventional manual pathology examination requires considerable time and labor, even with established expertise. It is widely accepted that intelligent classifiers may prove effective in the diagnosis process. In this study, we used a GoogLeNet (Inception-V1)-based binary classifier to classify HCC histopathology images. The classifier achieved 91.37% (+/- 2.49) accuracy, 92.16% (+/- 4.93) sensitivity, and 90.57% (+/- 2.54) specificity in HCC classification. Although the classification accuracy of deep learning is reported to be positively correlated with the amount of training data, it is often uncertain how much training data are required for deep learning to achieve satisfactory performance in clinical diagnosis. Moreover, deep learning methods require annotated data to generate efficient models. However, annotated data are a relatively scarce resource and can be expensive to obtain. Hence, the relationship between classification accuracy and the number of liver histopathology images for training was investigated. An inverse power law function-based estimation model is proposed to evaluate the minimum number of annotated training images required for a desired diagnostic accuracy.
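The inverse power law estimation described in the abstract can be illustrated with a small sketch. The abstract does not state the exact functional form, so the code below assumes a commonly used learning-curve model, acc(n) = a - b * n^(-c), where a is the asymptotic accuracy. The function names `fit_inverse_power_law` and `required_size` are hypothetical, and the data points are synthetic values generated from the assumed model, not results from the paper.

```python
import math


def fit_inverse_power_law(sizes, accs, steps=2000):
    """Fit acc(n) = a - b * n**(-c) (assumed form, not taken from the paper).

    The asymptote a is scanned on a grid between max(accs) and 1.0. For each
    candidate a, log(a - acc) = log(b) - c * log(n) is linear in log(n), so
    b and c follow from ordinary least squares. The candidate with the
    smallest squared error in accuracy space is returned as (a, b, c).
    """
    xs = [math.log(n) for n in sizes]
    x_mean = sum(xs) / len(xs)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    lo = max(accs) + 1e-6  # a must exceed every observed accuracy
    best = None
    for i in range(steps + 1):
        a = lo + (1.0 - lo) * i / steps
        ys = [math.log(a - y) for y in accs]
        y_mean = sum(ys) / len(ys)
        slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sxx
        intercept = y_mean - slope * x_mean
        b, c = math.exp(intercept), -slope
        sse = sum((a - b * n ** (-c) - y) ** 2 for n, y in zip(sizes, accs))
        if best is None or sse < best[0]:
            best = (sse, a, b, c)
    return best[1], best[2], best[3]


def required_size(a, b, c, target):
    """Smallest n with a - b * n**(-c) >= target under the fitted curve."""
    if target >= a:
        raise ValueError("target accuracy is not below the fitted asymptote")
    return math.ceil((b / (a - target)) ** (1.0 / c))


# Synthetic learning curve (illustrative only): a=0.95, b=1.2, c=0.5.
sizes = [100, 200, 400, 800, 1600, 3200]
accs = [0.95 - 1.2 * n ** -0.5 for n in sizes]

a, b, c = fit_inverse_power_law(sizes, accs)
n_needed = required_size(a, b, c, target=0.90)
print(f"a={a:.4f}, b={b:.4f}, c={c:.4f}, images for 90% accuracy: {n_needed}")
```

The grid scan keeps the sketch free of third-party dependencies; a nonlinear least-squares routine such as `scipy.optimize.curve_fit` would be a natural alternative when SciPy is available. Any target at or above the fitted asymptote is rejected, since the assumed model never reaches it.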
Pages: 33144-33157
Number of pages: 14