Hierarchical Multi-Label Attribute Classification With Graph Convolutional Networks on Anime Illustration

Cited by: 0

Authors:

Lan, Ziwen [1]

Maeda, Keisuke [2]

Ogawa, Takahiro [2]

Haseyama, Miki [2]

Affiliations:

[1] Hokkaido Univ, Grad Sch Informat Sci & Technol, Sapporo, Japan

[2] Hokkaido Univ, Fac Informat Sci & Technol, Sapporo, Japan

Source:

IEEE Access | 2023, Vol. 11

Funding:

Japan Society for the Promotion of Science

Keywords:

Task analysis; Semantics; Image classification; Correlation; Convolutional neural networks; Visualization; Context modeling; Hierarchical classification; anime illustration; attribute classification; graph convolutional networks; image captioning

DOI:

10.1109/ACCESS.2023.3265728

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

In this study, we present a hierarchical multi-modal multi-label attribute classification model for anime illustrations using graph convolutional networks (GCNs). The focus of this study is multi-label attribute classification, as creators of anime illustrations frequently and deliberately emphasize subtle features of characters and objects. To analyze the connections between attributes, we develop a multi-modal GCN-based model that can use semantic features of anime illustrations. To create features representing the semantic information of anime illustrations, we construct a novel captioning framework by combining real-world images with their animated style transformations. In addition, because the attributes of anime illustrations are hierarchical, we introduce a loss function that considers the hierarchy of attributes to improve classification accuracy. The proposed method has two main contributions: 1) By introducing a GCN with semantic features into the multi-label attribute classification task of anime illustrations, we capture more comprehensive relationships between attributes. 2) By following certain rules to build a hierarchical structure of attributes that appear frequently in anime illustrations, we further capture subordinate relationships between attributes. In addition, we demonstrate the effectiveness of the proposed method by experiments.
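The abstract mentions a loss function that takes the attribute hierarchy into account but does not give its form. As a rough illustration only, the sketch below shows one common way such a loss can be built: a binary cross-entropy term plus a penalty whenever a child attribute is predicted with higher probability than its parent. The penalty form, the `weight` parameter, the function names, and the example attributes are all assumptions made for this sketch, not the authors' formulation.

```python
import math

def bce(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over all attributes."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(probs)

def hierarchy_penalty(probs, parent_of):
    """Sum of max(0, p_child - p_parent) over all child -> parent pairs.

    Assumed constraint: a child attribute should not be more probable
    than the parent attribute it belongs to.
    """
    return sum(max(0.0, probs[c] - probs[p]) for c, p in parent_of.items())

def hierarchical_loss(probs, targets, parent_of, weight=1.0):
    """Cross-entropy plus a weighted hierarchy-violation penalty."""
    return bce(probs, targets) + weight * hierarchy_penalty(probs, parent_of)

# Hypothetical attributes: 0 = "hair", 1 = "long hair", 2 = "twintails".
# Attributes 1 and 2 are children of attribute 0.
parent_of = {1: 0, 2: 0}
targets = [1, 1, 0]

consistent = [0.9, 0.8, 0.1]    # parent is more probable than both children
inconsistent = [0.3, 0.8, 0.1]  # "long hair" is more probable than "hair"

loss_consistent = hierarchical_loss(consistent, targets, parent_of)
loss_inconsistent = hierarchical_loss(inconsistent, targets, parent_of)
```

In this toy setup the inconsistent prediction incurs both a larger cross-entropy term and a non-zero hierarchy penalty, so its loss is higher than that of the consistent prediction.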


Pages: 35447-35456

Page count: 10
