Combining Text and Images for Film Age Appropriateness Classification

Cited by: 0

Authors:

Ha, Le An [1]

Mohamed, Emad [1]

Affiliations:

[1] Univ Wolverhampton, Res Inst Informat & Language Proc, Wulfruna St, Wolverhampton WV1 1LY, England

Source:

AI IN COMPUTATIONAL LINGUISTICS | 2021 / Vol. 189

Keywords:

age appropriateness classification; deep learning; bi-modal classification; image classification; text classification;

DOI:

10.1016/j.procs.2021.05.087

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

We combine textual information from a corpus of film scripts with images of important scenes from IMDB that correspond to these films to create a bimodal dataset (the dataset and scripts can be obtained from http://tinyurl.com/se9tlmr) for film age appropriateness classification, with the objective of improving the prediction of age appropriateness for parents and children. We use state-of-the-art deep learning image feature extractors, including DenseNet, ResNet, Inception, and NASNet. We tested several machine learning algorithms and found XGBoost to yield the best results. Previously reported classification accuracies, using only textual features, were 79.1% and 65.3% for the American MPAA and British BBFC classifications respectively. Using images alone, we achieve 64.8% and 56.7% classification accuracy. The most consistent combination of textual and image features achieves 81.1% and 66.8%, both statistically significant improvements over the use of text only. (C) 2021 The Authors. Published by Elsevier B.V.
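The abstract does not say how the textual and image features are combined. One common option is early fusion, where each film's two feature vectors are concatenated before classification. The sketch below illustrates that idea only, and it rests on several assumptions: the feature values and ratings are made-up toy data, the helper names (`fuse`, `fit_centroids`, `predict`) are hypothetical, and a simple nearest-centroid classifier stands in for XGBoost so that the example runs with the Python standard library alone.

```python
from statistics import mean


def fuse(text_vec, image_vec):
    """Early fusion (assumed): concatenate one film's text and image features."""
    return list(text_vec) + list(image_vec)


def fit_centroids(rows, labels):
    """Average the fused vectors of each class (a stand-in for XGBoost)."""
    by_label = {}
    for row, label in zip(rows, labels):
        by_label.setdefault(label, []).append(row)
    return {
        label: [mean(column) for column in zip(*group)]
        for label, group in by_label.items()
    }


def predict(centroids, row):
    """Return the label whose centroid has the smallest squared distance to row."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))

    return min(centroids, key=lambda label: squared_distance(centroids[label]))


# Toy, made-up features: two text dimensions and one image dimension per film.
text_feats = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.7]]
image_feats = [[0.7], [0.9], [0.2], [0.1]]
ratings = ["R", "R", "PG", "PG"]  # illustrative labels only

fused = [fuse(t, i) for t, i in zip(text_feats, image_feats)]
model = fit_centroids(fused, ratings)
prediction = predict(model, fuse([0.85, 0.15], [0.8]))
print(prediction)
```

In a fuller pipeline the image vectors would come from a pretrained network such as DenseNet or ResNet, and the fused vectors would be passed to a gradient-boosted classifier; neither step is reproduced here.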

Citation

Pages: 242-249

Page count: 8

