Natural Sciences Meet Social Sciences: Census Data Analytics for Detecting Home Language Shifts

Cited by: 11
Authors
Choy, Christian M. [1]
Co, M. Kiefer [1]
Fogel, Matthew J. [1]
Garrioch, Clarke D. [1]
Leung, Carson K. [1]
Martchenko, Ekaterina [1]
Affiliations
[1] Univ Manitoba, Dept Comp Sci, Winnipeg, MB, Canada
Source
PROCEEDINGS OF THE 2021 15TH INTERNATIONAL CONFERENCE ON UBIQUITOUS INFORMATION MANAGEMENT AND COMMUNICATION (IMCOM 2021) | 2021
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
information management; data science; data analytics; data mining; census data; language cohorts; allophones; mother tongue; language persistence; BIG DATA;
DOI
10.1109/IMCOM51814.2021.9377412
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
As we live in a global environment, it is not unusual for more than one language or dialect to be used in a country. Examples include Canada in the Americas, Singapore in Asia, and Switzerland in Europe. With globalization, many people immigrate to or live in a country other than their birthplace. As a result, different people in the same country may have different home languages (i.e., first languages). For instance, as a nation with a highly linguistically diverse population, Canada provides a unique opportunity to study the factors that cause certain languages (or language families) to be lost over subsequent generations among allophones (i.e., people whose mother tongue is neither English nor French). In this paper, we focus on census data analytics. Specifically, we analyze census microdata with machine learning and data mining techniques (such as decision tree induction, random forest, and categorical naive Bayes) to study the influence of various social and economic factors on the probability that allophones adopt an official language as their language spoken at home. This study is a showcase where natural sciences and engineering (NSE) meet social sciences, in which NSE solutions (e.g., census data analytics) are applicable to the study of social-science-related phenomena (e.g., successful detection of shifts in home languages).
Pages: 8