Persistent Anti-Muslim Bias in Large Language Models

被引：135

作者：

Abid, Abubakar ^{[1
]}

Farooqi, Maheen ^{[2
]}

Zou, James ^{[1
]}

机构：

[1] Stanford Univ, Stanford, CA 94305 USA

[2] McMaster Univ, Hamilton, ON, Canada

来源：

AIES '21: PROCEEDINGS OF THE 2021 AAAI/ACM CONFERENCE ON AI, ETHICS, AND SOCIETY | 2021年

关键词：

machine learning; language models; bias; stereotypes; ethics;

D O I：

10.1145/3461702.3462624

中图分类号：

TP18 [人工智能理论];

学科分类号：

081104 ; 0812 ; 0835 ; 1405 ;

摘要：

It has been observed that large-scale language models capture undesirable societal biases, e.g. relating to race and gender; yet religious bias has been relatively unexplored. We demonstrate that GPT-3, a state-of-the-art contextual language model, captures persistent Muslim-violence bias. We probe GPT-3 in various ways, including prompt completion, analogical reasoning, and story generation, to understand this anti-Muslim bias, demonstrating that it appears consistently and creatively in different uses of the model and that it is severe even compared to biases about other religious groups. For instance, "Muslim" is analogized to "terrorist" in 23% of test cases, while "Jewish" is mapped to its most common stereotype, "money," in 5% of test cases. We quantify the positive distraction needed to overcome this bias with adversarial text prompts, and find that use of the most positive 6 adjectives reduces violent completions for "Muslims" from 66% to 20%, but which is still higher than for other religious groups.

引用

页码：298 / 306

页数：9

共 50 条

[21] FairGauge: A Modularized Evaluation of Bias in Masked Language Models
Doughman, Jad
Shehata, Shady
Karray, Fakhri
PROCEEDINGS OF THE 2023 IEEE/ACM INTERNATIONAL CONFERENCE ON ADVANCES IN SOCIAL NETWORKS ANALYSIS AND MINING, ASONAM 2023, 2023, : 131 - 135
[22] Directionality and representativeness are differentiable components of stereotypes in large language models
Nicolas, Gandalf
Caliskan, Aylin
PNAS NEXUS, 2024, 3 (11):
[23] Symbols and grounding in large language models
Pavlick, Ellie
PHILOSOPHICAL TRANSACTIONS OF THE ROYAL SOCIETY A-MATHEMATICAL PHYSICAL AND ENGINEERING SCIENCES, 2023, 381 (2251):
[24] Flying Into the Future With Large Language Models
Kanjilal, Sanjat
CLINICAL INFECTIOUS DISEASES, 2024, 78 (04) : 867 - 869
[25] Large language models and political science
Linegar, Mitchell
Kocielnik, Rafal
Alvarez, R. Michael
FRONTIERS IN POLITICAL SCIENCE, 2023, 5
[26] Political Bias in Large Language Models: A Comparative Analysis of ChatGPT-4, Perplexity, Google Gemini, and Claude
Choudhary, Tavishi
IEEE ACCESS, 2025, 13 : 11341 - 11379
[27] Quo Vadis ChatGPT? From large language models to Large Knowledge Models
Venkatasubramanian, Venkat
Chakraborty, Arijit
COMPUTERS & CHEMICAL ENGINEERING, 2025, 192
[28] Accelerating Contextualization in AI Large Language Models Using Vector Databases
Bin Tareaf, Raad
AbuJarour, Mohammed
Engelman, Tom
Liermann, Philipp
Klotz, Jesse
38TH INTERNATIONAL CONFERENCE ON INFORMATION NETWORKING, ICOIN 2024, 2024, : 316 - 321
[29] Modeling Global and local Codon Bias with Deep Language Models
Fujimoto, M. Stanley
Bodily, Paul M.
Lyman, Cole A.
Jacobsen, J. Andrew
Snell, Quinn
Clement, Mark J.
2017 IEEE 17TH INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOENGINEERING (BIBE), 2017, : 151 - 156
[30] Systematic testing of three Language Models reveals low language accuracy, absence of response stability, and a yes- response bias
Dentella, Vittoria
Guenther, Fritz
Leivada, Evelina
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2023, 120 (51)

← 1 2 3 4 5 →