Applying SoftTriple Loss for Supervised Language Model Fine Tuning

Cited by: 1
Authors
Sosnowski, Witold [1 ]
Wroblewska, Anna [1 ]
Gawrysiak, Piotr [2 ]
Affiliations
[1] Warsaw Univ Technol, Fac Math & Informat Sci, Warsaw, Poland
[2] Warsaw Univ Technol, Fac Elect & Informat Technol, Warsaw, Poland
Source
PROCEEDINGS OF THE 2022 17TH CONFERENCE ON COMPUTER SCIENCE AND INTELLIGENCE SYSTEMS (FEDCSIS), 2022
Keywords
SIMILARITY
DOI
10.15439/2022F185
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
We introduce TripleEntropy, a new loss function based on cross-entropy and SoftTriple loss, to improve classification performance when fine-tuning general-knowledge pre-trained language models. This loss function improves on the robust RoBERTa baseline model fine-tuned with cross-entropy loss by about 0.02-2.29 percentage points. Thorough tests on popular datasets using our loss function indicate a steady gain. The fewer samples in the training dataset, the higher the gain: about 0.71 percentage points for small-sized datasets, 0.86 for medium-sized, 0.20 for large, and 0.04 for extra-large.
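The abstract describes TripleEntropy only at a high level. Below is a minimal PyTorch sketch of how a standard cross-entropy term on classifier logits might be combined with a SoftTriple-style term (Qian et al., 2019) on the sentence embedding. The class names (SoftTripleLoss, TripleEntropyLoss), the hyperparameters (num_centers, la, gamma, margin, ce_weight), and the simple weighted sum are illustrative assumptions, not the authors' implementation; the center-merging regularizer of the original SoftTriple paper is also omitted.

```python
# Hedged sketch of a TripleEntropy-style objective: cross-entropy on logits
# plus a SoftTriple term on L2-normalized embeddings. Not the authors' code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SoftTripleLoss(nn.Module):
    """SoftTriple loss with K learnable centers per class (regularizer omitted)."""

    def __init__(self, dim, num_classes, num_centers=4,
                 la=20.0, gamma=0.1, margin=0.01):
        super().__init__()
        self.la, self.gamma, self.margin = la, gamma, margin
        self.num_classes, self.num_centers = num_classes, num_centers
        # One set of learnable centers per class, stored column-wise.
        self.centers = nn.Parameter(torch.randn(dim, num_classes * num_centers))

    def forward(self, embeddings, labels):
        emb = F.normalize(embeddings, dim=1)                    # (B, D)
        centers = F.normalize(self.centers, dim=0)              # (D, C*K)
        sim = emb @ centers                                     # (B, C*K)
        sim = sim.view(-1, self.num_classes, self.num_centers)  # (B, C, K)
        # Soft assignment of each sample over the centers of each class.
        weights = F.softmax(sim / self.gamma, dim=2)
        class_sim = (weights * sim).sum(dim=2)                  # (B, C)
        # Subtract the margin only from the ground-truth class similarity.
        delta = torch.zeros_like(class_sim)
        delta.scatter_(1, labels.unsqueeze(1), self.margin)
        return F.cross_entropy(self.la * (class_sim - delta), labels)


class TripleEntropyLoss(nn.Module):
    """Hypothetical combination: cross-entropy on logits + SoftTriple on embeddings."""

    def __init__(self, dim, num_classes, ce_weight=0.5, **soft_triple_kwargs):
        super().__init__()
        self.ce_weight = ce_weight
        self.soft_triple = SoftTripleLoss(dim, num_classes, **soft_triple_kwargs)

    def forward(self, logits, embeddings, labels):
        ce = F.cross_entropy(logits, labels)
        st = self.soft_triple(embeddings, labels)
        # Assumed weighting; the paper may balance the two terms differently.
        return self.ce_weight * ce + (1.0 - self.ce_weight) * st
```

In use, the logits would come from the classification head and the embeddings from the pooled representation of the fine-tuned model (e.g., the [CLS] token of RoBERTa); the weighting between the two terms would need to be tuned per dataset.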
Pages: 141-147
Number of pages: 7