Debiasing Pre-Trained Language Models via Efficient Fine-Tuning

Citations: 0
Authors
Gira, Michael [1 ]
Zhang, Ruisu [1 ]
Lee, Kangwook [1 ]
Affiliations
[1] Univ Wisconsin Madison, Madison, WI 53706 USA
Source
PROCEEDINGS OF THE SECOND WORKSHOP ON LANGUAGE TECHNOLOGY FOR EQUALITY, DIVERSITY AND INCLUSION (LTEDI 2022) | 2022
Keywords
DOI
Not available
CLC number
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
An explosion in the popularity of transformer-based language models (such as GPT-3, BERT, RoBERTa, and ALBERT) has opened the doors to new machine learning applications involving language modeling, text generation, and more. However, recent scrutiny reveals that these language models contain inherent biases towards certain demographics reflected in their training data. While prior research has tried to mitigate this problem, existing approaches either fail to remove the bias completely, degrade performance ("catastrophic forgetting"), or are costly to execute. This work examines how to reduce gender bias in a GPT-2 language model by fine-tuning less than 1% of its parameters. Through quantitative benchmarks, we show that this is a viable way to reduce prejudice in pre-trained language models while remaining cost-effective at scale.
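The core idea of the abstract, freezing the pre-trained model and updating only a tiny subset of parameters, can be sketched as a name-based selection over a model's parameter inventory. The parameter names and sizes below are illustrative (a single hypothetical GPT-2-like block), and the choice of bias and layer-norm parameters is one common parameter-efficient recipe, not necessarily the exact subset the paper uses.

```python
# Hypothetical parameter inventory for one block of a GPT-2-like model.
# Names mimic the GPT-2 convention; sizes are illustrative, not exact.
PARAMS = {
    "wte.weight": 50257 * 768,            # token embedding matrix
    "h.0.ln_1.weight": 768,               # layer-norm scale
    "h.0.ln_1.bias": 768,                 # layer-norm shift
    "h.0.attn.c_attn.weight": 768 * 2304, # fused Q/K/V projection
    "h.0.attn.c_attn.bias": 2304,
    "h.0.mlp.c_fc.weight": 768 * 3072,    # feed-forward up-projection
    "h.0.mlp.c_fc.bias": 3072,
    "ln_f.weight": 768,                   # final layer norm
    "ln_f.bias": 768,
}

def select_trainable(params, patterns=("bias", "ln_")):
    """Keep only parameters whose names match the patterns; freeze the rest."""
    return {n: s for n, s in params.items() if any(p in n for p in patterns)}

trainable = select_trainable(PARAMS)
fraction = sum(trainable.values()) / sum(PARAMS.values())
print(f"trainable fraction: {fraction:.4%}")
```

Even on this toy inventory, the selected bias and layer-norm parameters amount to well under 1% of the total, which is the regime the paper operates in; in a real framework the same selection would set `requires_grad` on the chosen tensors only.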
Pages: 59-69
Page count: 11