Persistent Anti-Muslim Bias in Large Language Models

Cited by: 140
Authors
Abid, Abubakar [1]
Farooqi, Maheen [2]
Zou, James [1]
Affiliations
[1] Stanford Univ, Stanford, CA 94305 USA
[2] McMaster Univ, Hamilton, ON, Canada
Source
AIES '21: PROCEEDINGS OF THE 2021 AAAI/ACM CONFERENCE ON AI, ETHICS, AND SOCIETY | 2021
Keywords
machine learning; language models; bias; stereotypes; ethics
DOI
10.1145/3461702.3462624
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
It has been observed that large-scale language models capture undesirable societal biases, e.g., relating to race and gender; yet religious bias has been relatively unexplored. We demonstrate that GPT-3, a state-of-the-art contextual language model, captures persistent Muslim-violence bias. We probe GPT-3 in various ways, including prompt completion, analogical reasoning, and story generation, to understand this anti-Muslim bias, demonstrating that it appears consistently and creatively in different uses of the model and that it is severe even compared to biases about other religious groups. For instance, "Muslim" is analogized to "terrorist" in 23% of test cases, while "Jewish" is mapped to its most common stereotype, "money," in 5% of test cases. We quantify the positive distraction needed to overcome this bias with adversarial text prompts, and find that use of the most positive 6 adjectives reduces violent completions for "Muslims" from 66% to 20%, which is still higher than for other religious groups.
Pages: 298 - 306
Page count: 9
Related papers
50 in total
  • [1] Gender bias and stereotypes in Large Language Models
    Kotek, Hadas
    Dockum, Rikker
    Sun, David Q.
    PROCEEDINGS OF THE ACM COLLECTIVE INTELLIGENCE CONFERENCE, CI 2023, 2023 : 12 - 24
  • [2] The Sources of Anti-Muslim Prejudice in Poland
    Golebiowska, Ewa
    EAST EUROPEAN POLITICS AND SOCIETIES, 2018, 32 (04) : 796 - 817
  • [3] Anti-Muslim Bias in Foreign Policy Attitudes: Experimental Evidence from Thirteen European Countries
    Findor, Andrej
    Hlatky, Roman
    Hruska, Matej
    Kironska, Kristina
    BRITISH JOURNAL OF POLITICAL SCIENCE, 2025, 55
  • [4] The Paradox of the Moderate Muslim Discourse: Subtyping Promotes Support for Anti-muslim Policies
    Hakim, Nader H.
    Zhao, Xian
    Bharj, Natasha
    FRONTIERS IN PSYCHOLOGY, 2020, 11
  • [5] Quantifying Bias in Agentic Large Language Models: A Benchmarking Approach
    Fernando, Riya
    Norton, Isabel
    Dogra, Pranay
    Sarnaik, Rohit
    Wazir, Hasan
    Ren, Zitang
    Gunda, Niveta Sree
    Mukhopadhyay, Anushka
    Lutz, Michael
    2024 5TH INFORMATION COMMUNICATION TECHNOLOGIES CONFERENCE, ICTC 2024, 2024 : 349 - 353
  • [6] Assessing political bias in large language models
    Rettenberger, Luca
    Reischl, Markus
    Schutera, Mark
    JOURNAL OF COMPUTATIONAL SOCIAL SCIENCE, 2025, 8 (02)
  • [7] Hate, amplified? Social media news consumption and support for anti-Muslim policies
    Lajevardi, Nazita
    Oskooii, Kassra A. R.
    Walker, Hannah
    JOURNAL OF PUBLIC POLICY, 2022, 42 (04) : 656 - 683
  • [8] News media, movies, and anti-Muslim prejudice: investigating the role of social contact
    Ahmed, Saifuddin
    ASIAN JOURNAL OF COMMUNICATION, 2017, 27 (05) : 536 - 553
  • [9] A survey on multilingual large language models: corpora, alignment, and bias
    Xu, Yuemei
    Hu, Ling
    Zhao, Jiayi
    Qiu, Zihan
    Xu, Kexin
    Ye, Yuqi
    Gu, Hanwen
    FRONTIERS OF COMPUTER SCIENCE, 2025, 19 (11)
  • [10] Anti-Muslim attitudes among the police? The relationship between contact frequency and contact quality with prejudice and stereotyping towards Muslims
    Kemme, Stefanie
    Essien, Iniobong
    Stelter, Marleen
    MONATSSCHRIFT FUR KRIMINOLOGIE UND STRAFRECHTSREFORM, 2020, 103 (02) : 129 - 149