Gender Bias and Under-Representation in Natural Language Processing Across Human Languages

Cited by: 18
Authors
Chen, Yan [1]
Mahoney, Christopher [1]
Grasso, Isabella [1]
Wali, Esma [1]
Matthews, Abigail [2]
Middleton, Thomas [1]
Njie, Mariama [3]
Matthews, Jeanna [1]
Affiliations
[1] Clarkson Univ, Potsdam, NY 13676 USA
[2] Univ Wisconsin Madison, Madison, WI USA
[3] Iona Coll, New York, NY USA
Source
AIES '21: PROCEEDINGS OF THE 2021 AAAI/ACM CONFERENCE ON AI, ETHICS, AND SOCIETY | 2021
Keywords
bias; gender bias; natural language processing
DOI
10.1145/3461702.3462530
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Natural Language Processing (NLP) systems are at the heart of many critical automated decision-making systems making crucial recommendations about our future world. However, these systems reflect a wide range of biases, from gender bias to a bias in which voices they represent. In this paper, a team including speakers of 9 languages (Chinese, Spanish, English, Arabic, German, French, Farsi, Urdu, and Wolof) reports and analyzes measurements of gender bias in the Wikipedia corpora for these 9 languages. In the process, we also document how our work exposes crucial gaps in the NLP pipeline for many languages. Despite substantial investments in multilingual support, the modern NLP pipeline still systematically and dramatically under-represents the majority of human voices in the NLP-guided decisions that are shaping our collective future. We develop extensions to profession-level and corpus-level gender bias metric calculations originally designed for English and apply them to 8 other languages, including languages such as Spanish, Arabic, German, French, and Urdu that have grammatically gendered nouns, including distinct feminine, masculine, and neuter profession words. We compare these gender bias measurements across the Wikipedia corpora in different languages as well as across some corpora of more traditional literature.
Pages: 24-34
Number of pages: 11
Related papers
50 in total
  • [21] Intent-driven network representation based on natural language processing
    Ji Z.
    Yang C.
    Li F.
    Ouyang Y.
    Liu X.
    Xi Tong Gong Cheng Yu Dian Zi Ji Shu/Systems Engineering and Electronics, 2024, 46 (01): 318-325
  • [22] Evaluation of bias and gender/racial concordance based on sentiment analysis of narrative evaluations of clinical clerkships using natural language processing
    Bhanvadia, Sonali
    Saseendrakumar, Bharanidharan Radha
    Guo, Joy
    Spadafore, Maxwell
    Daniel, Michelle
    Lander, Lina
    Baxter, Sally L.
    BMC MEDICAL EDUCATION, 2024, 24 (01)
  • [23] Preface: Special issue on Natural Language Processing applications for low-resource languages
    Pakray, Partha
    Gelbukh, Alexander
    Bandyopadhyay, Sivaji
    NATURAL LANGUAGE PROCESSING, 2025, 31 (02): 181-182
  • [24] Indic SentiReview: Natural Language Processing based Sentiment Analysis on major Indian Languages
    Hadiya, Nidhi
    Nanavati, Nirali
    PROCEEDINGS OF THE 2019 3RD INTERNATIONAL CONFERENCE ON COMPUTING METHODOLOGIES AND COMMUNICATION (ICCMC 2019), 2019: 322-327
  • [25] A New Method for Graph-Based Representation of Text in Natural Language Processing
    Probierz, Barbara
    Hrabia, Anita
    Kozak, Jan
    ELECTRONICS, 2023, 12 (13)
  • [26] Natural language processing: using artificial intelligence to understand human language in orthopedics
    James A. Pruneski
    Ayoosh Pareek
    Benedict U. Nwachukwu
    R. Kyle Martin
    Bryan T. Kelly
    Jón Karlsson
    Andrew D. Pearle
    Ata M. Kiapour
    Riley J. Williams
    Knee Surgery, Sports Traumatology, Arthroscopy, 2023, 31: 1203-1211
  • [27] Natural language processing: using artificial intelligence to understand human language in orthopedics
    Pruneski, James A.
    Pareek, Ayoosh
    Nwachukwu, Benedict U.
    Martin, R. Kyle
    Kelly, Bryan T.
    Karlsson, Jon
    Pearle, Andrew D.
    Kiapour, Ata M.
    Williams, Riley J.
    KNEE SURGERY SPORTS TRAUMATOLOGY ARTHROSCOPY, 2023, 31 (04): 1203-1211
  • [28] Natural Language Processing in Spine Surgery: A Systematic Review of Applications, Bias, and Reporting Transparency
    Huang, Bonnie B.
    Huang, Jonathan
    Swong, Kevin N.
    WORLD NEUROSURGERY, 2022, 167: 156-+
  • [29] Towards Symbiosis in Knowledge Representation and Natural Language Processing for Structuring Clinical Practice Guidelines
    Weng, Chunhua
    Payne, Philip R. O.
    Velez, Mark
    Johnson, Stephen B.
    Bakken, Suzanne
    NURSING INFORMATICS 2014: EAST MEETS WEST ESMART+, 2014, 201: 461-469
  • [30] Unit Under Test Identification Using Natural Language Processing Techniques
    Madeja, Matej
    Poruban, Jaroslav
    OPEN COMPUTER SCIENCE, 2021, 11 (01): 22-32