Unsupervised Post-Tuning of Deep Neural Networks

Cited by: 2
Authors
Cerisara, Christophe [1]
Caillon, Paul [1]
Le Berre, Guillaume [1]
Affiliations
[1] Univ Lorraine, CNRS, LORIA, F-54000 Nancy, France
Source
2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN) | 2021
Keywords
deep learning; unsupervised training; regularization; natural language processing
DOI
10.1109/IJCNN52387.2021.9534198
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We propose a new unsupervised training procedure that is most effective when applied after supervised training and fine-tuning of deep neural network classifiers. Standard regularization techniques combat overfitting by means unrelated to the target classification loss, such as minimizing the L2 norm or adding noise to the data, the model, or the training process. In contrast, the proposed unsupervised training loss reduces overfitting by optimizing the true classifier risk. The approach is evaluated on several tasks of increasing difficulty and under varying conditions: unsupervised training, post-tuning, and anomaly detection. It is tested on both simple neural networks, such as a small multi-layer perceptron, and complex Natural Language Processing models, e.g., pretrained BERT embeddings. Experimental results confirm the theory and show that the proposed approach gives the best results in post-tuning conditions, i.e., when applied after supervised training and fine-tuning.
Pages: 8