Federated Learning for Exploiting Annotators' Disagreements in Natural Language Processing

Cited by: 0

Authors:

Rodriguez-Barroso, Nuria ^{[1]}

Camara, Eugenio Martinez ^{[2]}

Collados, Jose Camacho ^{[3,4]}

Luzon, M. Victoria ^{[4]}

Herrera, Francisco

Affiliations:

[1] Univ Granada, Andalusian Res Inst Data Sci & Computat Intelligen, Dept Comp Sci & Artificial Intelligence, Granada, Spain

[2] Univ Granada, Andalusian Res Inst Data Sci & Computat Intelligen, Dept Software Engn, Granada, Spain

[3] Univ Jaen, Dept Comp Sci, Jaen, Spain

[4] Cardiff Univ, Cardiff, Wales

Source:

TRANSACTIONS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS | 2024 / Vol. 12

Keywords:

Compendex;

DOI:

10.1162/tacl_a_00664

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The annotation of ambiguous or subjective NLP tasks is usually addressed by various annotators. In most datasets, these annotations are aggregated into a single ground truth. However, this omits divergent opinions of annotators, hence missing individual perspectives. We propose FLEAD (Federated Learning for Exploiting Annotators' Disagreements), a methodology built upon federated learning to independently learn from the opinions of all the annotators, thereby leveraging all their underlying information without relying on a single ground truth. We conduct an extensive experimental study and analysis in diverse text classification tasks to show the contribution of our approach with respect to mainstream approaches based on majority voting and other recent methodologies that also learn from annotator disagreements.

Pages: 630-648

Number of pages: 19
