Applying SoftTriple Loss for Supervised Language Model Fine Tuning

Cited by: 1
Authors
Sosnowski, Witold [1 ]
Wroblewska, Anna [1 ]
Gawrysiak, Piotr [2 ]
Affiliations
[1] Warsaw Univ Technol, Fac Math & Informat Sci, Warsaw, Poland
[2] Warsaw Univ Technol, Fac Elect & Informat Technol, Warsaw, Poland
Source
PROCEEDINGS OF THE 2022 17TH CONFERENCE ON COMPUTER SCIENCE AND INTELLIGENCE SYSTEMS (FEDCSIS), 2022
Keywords
SIMILARITY
DOI
10.15439/2022F185
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
We introduce TripleEntropy, a new loss function based on cross-entropy and SoftTriple loss, to improve classification performance when fine-tuning general-knowledge pre-trained language models. This loss function improves on the robust RoBERTa baseline model fine-tuned with cross-entropy loss by about 0.02-2.29 percentage points. Thorough tests on popular datasets using our loss function indicate a steady gain. The fewer samples in the training dataset, the higher the gain: about 0.71 percentage points for small-sized datasets, 0.86 for medium-sized, 0.20 for large, and 0.04 for extra-large.
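The abstract describes TripleEntropy only at a high level. Below is a minimal PyTorch sketch of how a standard cross-entropy term on classifier logits might be combined with a SoftTriple-style term (Qian et al., 2019) on the sentence embedding. The class names (SoftTripleLoss, TripleEntropyLoss), the hyperparameters (num_centers, la, gamma, margin, ce_weight), and the simple weighted sum are illustrative assumptions, not the authors' implementation; the center-merging regularizer of the original SoftTriple paper is also omitted.

```python
# Hedged sketch of a TripleEntropy-style objective: cross-entropy on logits
# plus a SoftTriple term on L2-normalized embeddings. Not the authors' code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SoftTripleLoss(nn.Module):
    """SoftTriple loss with K learnable centers per class (regularizer omitted)."""

    def __init__(self, dim, num_classes, num_centers=4,
                 la=20.0, gamma=0.1, margin=0.01):
        super().__init__()
        self.la, self.gamma, self.margin = la, gamma, margin
        self.num_classes, self.num_centers = num_classes, num_centers
        # One set of learnable centers per class, stored column-wise.
        self.centers = nn.Parameter(torch.randn(dim, num_classes * num_centers))

    def forward(self, embeddings, labels):
        emb = F.normalize(embeddings, dim=1)                    # (B, D)
        centers = F.normalize(self.centers, dim=0)              # (D, C*K)
        sim = emb @ centers                                     # (B, C*K)
        sim = sim.view(-1, self.num_classes, self.num_centers)  # (B, C, K)
        # Soft assignment of each sample over the centers of each class.
        weights = F.softmax(sim / self.gamma, dim=2)
        class_sim = (weights * sim).sum(dim=2)                  # (B, C)
        # Subtract the margin only from the ground-truth class similarity.
        delta = torch.zeros_like(class_sim)
        delta.scatter_(1, labels.unsqueeze(1), self.margin)
        return F.cross_entropy(self.la * (class_sim - delta), labels)


class TripleEntropyLoss(nn.Module):
    """Hypothetical combination: cross-entropy on logits + SoftTriple on embeddings."""

    def __init__(self, dim, num_classes, ce_weight=0.5, **soft_triple_kwargs):
        super().__init__()
        self.ce_weight = ce_weight
        self.soft_triple = SoftTripleLoss(dim, num_classes, **soft_triple_kwargs)

    def forward(self, logits, embeddings, labels):
        ce = F.cross_entropy(logits, labels)
        st = self.soft_triple(embeddings, labels)
        # Assumed weighting; the paper may balance the two terms differently.
        return self.ce_weight * ce + (1.0 - self.ce_weight) * st
```

In use, the logits would come from the classification head and the embeddings from the pooled representation of the fine-tuned model (e.g., the [CLS] token of RoBERTa); the weighting between the two terms would need to be tuned per dataset.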
Pages: 141-147
Number of pages: 7