Bias Against 93 Stigmatized Groups in Masked Language Models and Downstream Sentiment Classification Tasks

Cited by: 23
Authors
Mei, Katelyn X. [1 ]
Fereidooni, Sonia [1 ]
Caliskan, Aylin [1 ]
Affiliation
[1] Univ Washington, Seattle, WA 98195 USA
Source
PROCEEDINGS OF THE 6TH ACM CONFERENCE ON FAIRNESS, ACCOUNTABILITY, AND TRANSPARENCY, FACCT 2023 | 2023
Keywords
AI ethics; AI bias; stigma in language models; language models; representation learning; sentiment classification; prompting; UNITED-STATES;
DOI
10.1145/3593013.3594109
Chinese Library Classification (CLC) number
TP18 [Theory of Artificial Intelligence];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The rapid deployment of artificial intelligence (AI) models demands a thorough investigation of biases and risks inherent in these models to understand their impact on individuals and society. A growing body of work has shown that social biases are encoded in language models and their downstream tasks. This study extends the focus of bias evaluation in extant work by examining bias against social stigmas on a large scale. It focuses on 93 stigmatized groups in the United States, including a wide range of conditions related to disease, disability, drug use, mental illness, religion, sexuality, socioeconomic status, and other relevant factors. We investigate bias against these groups in English pre-trained Masked Language Models (MLMs) and their downstream sentiment classification tasks. To evaluate the presence of bias against 93 stigmatized conditions, we identify 29 non-stigmatized conditions to conduct a comparative analysis. Building upon a psychology scale of social rejection, the Social Distance Scale, we prompt six MLMs that are trained with different datasets: RoBERTa-base, RoBERTa-large, XLNet-large, BERTweet-base, BERTweet-large, and DistilBERT. We use human annotations to analyze the predicted words from these models, with which we measure the extent of bias against stigmatized groups. When prompts include stigmatized conditions, the probability of MLMs predicting negative words is, on average, 20 percent higher than when prompts have non-stigmatized conditions. Bias against stigmatized groups is also reflected in four downstream sentiment classifiers of these models. When sentences include stigmatized conditions related to diseases, disability, education, and mental illness, they are more likely to be classified as negative. For example, the sentence "They are people who have less than a high school education." is classified as negative consistently across all models.
We also observe a strong correlation between bias in MLMs and their downstream sentiment classifiers (Pearson's r = 0.79). The evidence indicates that MLMs and their downstream sentiment classification tasks exhibit biases against socially stigmatized groups.
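The pipeline the abstract describes — filling Social Distance Scale templates with a condition, scoring each model's masked-word predictions, and correlating MLM bias with downstream classifier bias — can be sketched as below. This is an illustrative reconstruction, not the authors' released code: the template wording, the `build_prompts` and `pearson_r` names, and all numeric scores are invented for demonstration.

```python
import math


def build_prompts(condition: str) -> list[str]:
    """Fill Social-Distance-Scale-style templates with a condition.

    The template wording here is a hypothetical stand-in for the scale
    items used in the paper; "<mask>" is the RoBERTa-family mask token
    that an MLM would be asked to fill in.
    """
    return [
        f"People who have {condition} are <mask>.",
        f"Being close to someone who has {condition} is <mask>.",
    ]


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient, computed from its definition."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Toy per-condition scores (invented numbers, NOT the paper's data):
# fraction of negative masked-word predictions from an MLM vs. fraction
# of sentences a downstream sentiment classifier labels as negative.
mlm_negative_rate = [0.62, 0.41, 0.73, 0.35, 0.58]
clf_negative_rate = [0.55, 0.44, 0.70, 0.38, 0.52]

print(build_prompts("a mental illness")[0])
print(round(pearson_r(mlm_negative_rate, clf_negative_rate), 2))
```

With real data, the first list would come from human annotations of each MLM's top masked-word predictions and the second from the matching sentiment classifier, giving a correlation comparable in spirit to the paper's reported r = 0.79.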
Pages: 1699-1710
Page count: 12