Wasserstein GAN-Based Small-Sample Augmentation for New-Generation Artificial Intelligence: A Case Study of Cancer-Staging Data in Biology

Cited by: 114
Authors
Liu, Yufei [1, 2, 4]
Zhou, Yuan [2]
Liu, Xin [1]
Dong, Fang [2]
Wang, Chang [1]
Wang, Zihong [3]
Affiliations
[1] Huazhong Univ Sci & Technol, Coll Life Sci & Technol, Wuhan 430074, Hubei, Peoples R China
[2] Tsinghua Univ, Sch Publ Policy & Management, Beijing 100084, Peoples R China
[3] Huazhong Univ Sci & Technol, Sch Mech Sci & Engn, Wuhan 430074, Hubei, Peoples R China
[4] Chinese Acad Engn, Ctr Strateg Studies, Beijing 100088, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Artificial intelligence; Generative adversarial network; Deep neural network; Small sample size; Cancer; GREEN-MANUFACTURING TECHNOLOGIES; OVARIAN-CANCER; WIND ENERGY; KNOWLEDGE; CHINA; DIFFUSION; INNOVATION; NETWORKS; GLYCANS; EUROPE;
DOI
10.1016/j.eng.2018.11.018
Chinese Library Classification (CLC) number
T [Industrial Technology];
Discipline classification code
08;
Abstract
It is essential to utilize deep-learning algorithms based on big data for the implementation of the new generation of artificial intelligence. Effective utilization of deep learning relies considerably on the number of labeled samples, which restricts the application of deep learning in an environment with a small sample size. In this paper, we propose an approach based on a generative adversarial network (GAN) combined with a deep neural network (DNN). First, the original samples were divided into a training set and a test set. The GAN was trained with the training set to generate synthetic sample data, which enlarged the training set. Next, the DNN classifier was trained with the synthetic samples. Finally, the classifier was tested with the test set, and the effectiveness of the approach for multi-classification with a small sample size was validated by the indicators. As an empirical case, the approach was then applied to identify the stages of cancers with a small labeled sample size. The experimental results verified that the proposed approach achieved a greater accuracy than traditional methods. This research was an attempt to transform the classical statistical machine-learning classification method based on original samples into a deep-learning classification method based on data augmentation. The use of this approach will contribute to an expansion of application scenarios for the new generation of artificial intelligence based on deep learning, and to an increase in application effectiveness. This research is also expected to contribute to the comprehensive promotion of new-generation artificial intelligence. © 2019 The Authors. Published by Elsevier Ltd on behalf of Chinese Academy of Engineering and Higher Education Press Limited Company.
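The data flow described in the abstract (split, fit a generator on the training set only, enlarge the training set with synthetic samples, train a classifier, evaluate on the held-out test set) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: a per-class Gaussian sampler stands in for the GAN, a nearest-centroid rule stands in for the DNN classifier, the classifier is assumed to be trained on the original plus synthetic samples, and the toy three-class data and all function names are hypothetical.

```python
import random
import statistics


def split(samples, n_test, rng):
    """Shuffle (features, label) pairs and hold out n_test of them for testing."""
    shuffled = samples[:]
    rng.shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]


def group_by_label(samples):
    """Collect feature vectors per class label."""
    groups = {}
    for features, label in samples:
        groups.setdefault(label, []).append(features)
    return groups


def fit_generator(train):
    """Stand-in for the GAN (assumption): per-class, per-feature mean and std,
    estimated from the training set only."""
    return {
        label: [(statistics.mean(col), statistics.pstdev(col)) for col in zip(*rows)]
        for label, rows in group_by_label(train).items()
    }


def generate(model, n_per_class, rng):
    """Draw labeled synthetic samples from the fitted stand-in generator."""
    return [
        ([rng.gauss(mu, sd) for mu, sd in params], label)
        for label, params in model.items()
        for _ in range(n_per_class)
    ]


def fit_classifier(train):
    """Stand-in for the DNN (assumption): one centroid per class."""
    return {
        label: [statistics.mean(col) for col in zip(*rows)]
        for label, rows in group_by_label(train).items()
    }


def predict(centroids, features):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    return min(
        centroids,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(features, centroids[label])),
    )


def run_pipeline(seed=0, n_synthetic_per_class=50):
    rng = random.Random(seed)
    # Hypothetical toy data: 3 classes ("stages"), 10 samples each, 2 features.
    samples = [
        ([rng.gauss(3 * k, 0.5), rng.gauss(3 * k, 0.5)], k)
        for k in range(3)
        for _ in range(10)
    ]
    train, test = split(samples, n_test=6, rng=rng)
    generator = fit_generator(train)            # never sees the test set
    augmented = train + generate(generator, n_synthetic_per_class, rng)
    classifier = fit_classifier(augmented)      # trained on the enlarged set
    correct = sum(predict(classifier, x) == y for x, y in test)
    return {
        "n_train": len(train),
        "n_augmented": len(augmented),
        "n_test": len(test),
        "accuracy": correct / len(test),
    }


if __name__ == "__main__":
    print(run_pipeline())
```

The point of the sketch is the ordering: the generator is fitted after the split and only on the training portion, so the test set stays untouched by augmentation and the reported accuracy reflects original samples only.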
Pages: 156-163
Page count: 8