Explicit Content Detection in Music Lyrics Using Machine Learning

Cited by: 13

Authors:

Chin, Hyojin [1]

Kim, Jayong [1]

Kim, Yoonjong [1]

Shin, Jinseop [1]

Yi, Mun Y. [1]

Affiliation:

[1] Korea Adv Inst Sci & Technol, Grad Sch Knowledge Serv Engn, Daejeon, South Korea

Source:

2018 IEEE INTERNATIONAL CONFERENCE ON BIG DATA AND SMART COMPUTING (BIGCOMP) | 2018

Keywords:

Machine Learning; NLP; Explicit Contents; Music; Lyrics; Abusive Language; Adolescent Safety; Parent Advisory Label; HEAVY-METAL

DOI:

10.1109/BigComp.2018.00085

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods]

Subject classification code:

081202

Abstract:

Music has serious effects on children's development, and music lyrics have become more violent and sexual over the years. However, systems for filtering explicit content in music often do not work properly, and doing the filtering properly takes a lot of time and effort. In this study, we propose several machine learning models that automatically detect explicit content in Korean lyrics and compare their performance. The proposed Bagging with selective vocabulary model outperformed not only the other competing models we designed, but also the filtering method based on a man-made profanity dictionary, which is widely used in the industry to detect explicit content. The proposed automated lyrics screening approach makes a practical contribution to the music industry, helping it save significant time and effort in censoring content that is harmful to young people. The proposed approach is generalizable to other language settings as long as the same kinds of data used in the study are available.

Citation

Pages: 517-521

Page count: 5
