What's in a Name? Probabilistic Inference of Religious Community from South Asian Names

Cited by: 14
Authors
Susewind, Raphael [1, 2]
Affiliations
[1] Univ Bielefeld, D-33615 Bielefeld, Germany
[2] Univ Oxford, Oxford, England
Keywords
names; religion; South Asia; linguistics; big data; ethnicity
DOI
10.1177/1525822X14564275
Chinese Library Classification (CLC) number
Q98 [Anthropology]
Subject classification code
030303
Abstract
Fine-grained data on religious communities are often considered sensitive in South Asia and consequently remain inaccessible. Yet without such data, statistical research on communal relations and group-based inequality remains superficial, hampering the development of appropriate policy measures to prevent further social exclusion on the basis of religion. The open-source algorithm introduced in this article provides a workaround by probabilistically exploiting the communal connotations of names; it transforms name lists, which are readily available, into a new source of demographic data. The algorithm proves highly accurate in identifying Muslim population shares in Uttar Pradesh, India's most populous state, but could be employed more widely across South Asia. It potentially enables more detailed analyses in economics, development studies, and political science as well as better sampling procedures in sociology and anthropology. This article describes the algorithm, evaluates its accuracy, reflects on ethical implications, and introduces a sample data set; the software itself is available in an online supplement to this article.
Pages: 319-332
Page count: 14