Deep curriculum learning optimization

Cited by: 0

Authors:

Ghebrechristos H. [1]

Alaghband G. [1]

Affiliation:

[1] Department of Computer Science, University of Colorado, Denver, 80014, CO

Source:

SN Computer Science | 2020 / Vol. 1 / Issue 5

Keywords:

Convolutional neural network; Curriculum learning optimization; Curriculum strategy; Deep learning; Information theory; Syllabus

DOI:

10.1007/s42979-020-00251-7

Abstract:

We describe a quantitative and practical framework for integrating curriculum learning (CL) into a deep learning training pipeline to improve feature learning in deep feed-forward networks. The framework has several distinguishing characteristics: (1) dynamicity: it proposes a set of batch-level training strategies (syllabi or curricula) that are sensitive to data complexity; (2) adaptivity: it dynamically estimates the effectiveness of a given strategy and compares it objectively with alternative strategies, making the method suitable for both practical and research purposes; and (3) a replace–retrain mechanism that it employs when a strategy is unfit for the task at hand. In addition, the framework can combine CL with several variants of gradient descent (GD) and has been used to generate efficient batch-specific or dataset-specific strategies. Comparative studies of current state-of-the-art vision models, such as FixEfficientNet and BiT-L (ResNet), on several benchmark datasets including CIFAR10 demonstrate the effectiveness of the proposed method. We present results showing training loss reduced by as much as a factor of 5. Additionally, we present a set of practical curriculum strategies to improve the generalization performance of selected networks on various datasets. © 2020, Springer Nature Singapore Pte Ltd.
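The idea of a batch-level syllabus that is sensitive to data complexity can be illustrated with a minimal sketch. The abstract does not specify the complexity measure or the ordering rule, so everything below is an assumption made for illustration: Shannon entropy of quantized pixel intensities stands in for "data complexity", and the names `shannon_entropy` and `curriculum_order` are hypothetical, not taken from the paper.

```python
import math
import random
from collections import Counter


def shannon_entropy(pixels, levels=256):
    """Shannon entropy (bits) of pixel intensities in [0, 1].

    Assumption: entropy of the quantized intensity histogram is used
    here as a stand-in complexity score; the paper may use another measure.
    """
    # Quantize each intensity to one of `levels` bins (1.0 goes to the last bin).
    counts = Counter(min(int(p * levels), levels - 1) for p in pixels)
    total = len(pixels)
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


def curriculum_order(batch, easy_first=True):
    """Return sample indices of `batch` sorted by the complexity score.

    easy_first=True gives a low-to-high entropy syllabus;
    easy_first=False gives the reverse ordering.
    """
    return sorted(
        range(len(batch)),
        key=lambda i: shannon_entropy(batch[i]),
        reverse=not easy_first,
    )


# Toy batch of three flattened 8x8 "images".
random.seed(0)
flat = [0.5] * 64                              # one intensity level
half = [0.0] * 32 + [1.0] * 32                 # two equally likely levels
noisy = [random.random() for _ in range(64)]   # many distinct levels
batch = [noisy, flat, half]

order = curriculum_order(batch)                # indices, simplest sample first
ordered_batch = [batch[i] for i in order]      # batch as it would be fed to training
```

In a training loop, such a reordering would be applied to each mini-batch before the gradient step. Comparing the loss obtained under alternative orderings (for example `easy_first=False`) is one way to approximate the strategy comparison the abstract describes.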
