Coarse-to-Fine Image Aesthetics Assessment With Dynamic Attribute Selection

Cited by: 6
Authors
Huang, Yipo [1]
Li, Leida [1]
Chen, Pengfei [1]
Wu, Jinjian [1]
Yang, Yuzhe [2]
Li, Yaqian [2]
Shi, Guangming [1]
Affiliations
[1] Xidian Univ, Sch Artificial Intelligence, Xian 710071, Peoples R China
[2] OPPO Res Inst, Intelligent Percept & Interact Res Dept, Shanghai 200032, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Predictive models; Image color analysis; Task analysis; Visualization; Feature extraction; Databases; Merging; Aesthetic attributes; image aesthetics assessment; interaction; model explainability; QUALITY;
DOI
10.1109/TMM.2024.3389452
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Image aesthetics assessment (IAA) is an interesting but challenging task, owing to the ineffable nature of the human sense of beauty. The study of IAA has evolved from simple binary classification to more complex score regression and distribution prediction. People can perform aesthetic binary classification effortlessly, i.e., judge whether an image is aesthetically pleasing or not. However, further judgment on the fine-level scalar aesthetic score is complex and is typically determined by the aesthetic attributes present in the image, such as content, lighting and color. Motivated by these facts, this paper presents a Coarse-to-fine image Aesthetics assessment model guided by Dynamic Attribute Selection, dubbed CADAS. The underlying idea is to simulate the process of human aesthetic perception by performing coarse-to-fine aesthetic reasoning. Specifically, a hierarchical AttributeNet is first pre-trained by imitating the staged mechanism of human aesthetic experience, producing the candidate aesthetic attributes. Then, an AestheticNet is introduced to perform the coarse-level binary classification, based on which a confidence-based attribute selection strategy is designed to dynamically pick out the dominant aesthetic attributes from the candidate ones. Finally, a self-attention-based FusionNet is designed to explore the interaction between the dominant aesthetic attributes and the aesthetic features, producing the fine-level aesthetic prediction. Extensive experiments demonstrate that the proposed model outperforms state-of-the-art methods. Furthermore, CADAS can also output the dominant aesthetic attributes in images, facilitating model explainability.
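The coarse-to-fine flow described in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration, not the authors' implementation: the abstract does not specify the selection rule, so `select_dominant_attributes` uses a hypothetical margin test around 0.5; `self_attention` is a plain scaled dot-product attention without learned projections; and `fuse` simply mean-pools the attended tokens. All function names, the `margin` parameter, and the feature vectors are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def select_dominant_attributes(attr_scores, p_high, margin=0.2):
    """Hypothetical selection rule (an assumption, not the paper's strategy).

    attr_scores: attribute name -> predicted score in [0, 1].
    p_high: coarse-level probability that the image is aesthetically pleasing.
    Keeps attributes that deviate from 0.5 by at least `margin` in the
    direction of the coarse decision; falls back to the single attribute
    that agrees most with the coarse decision if none pass.
    """
    sign = 1.0 if p_high >= 0.5 else -1.0
    deviation = {name: sign * (s - 0.5) for name, s in attr_scores.items()}
    chosen = [name for name, d in deviation.items() if d >= margin]
    if not chosen:
        chosen = [max(deviation, key=deviation.get)]
    return chosen


def self_attention(tokens):
    """Scaled dot-product self-attention with identity projections (assumed)."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[i] for w, v in zip(weights, tokens)) for i in range(d)])
    return out


def fuse(aesthetic_feature, attr_features, chosen):
    """Attend over the aesthetic feature and the selected attribute features,
    then mean-pool the attended tokens into one fused vector."""
    tokens = [aesthetic_feature] + [attr_features[name] for name in chosen]
    attended = self_attention(tokens)
    d = len(aesthetic_feature)
    return [sum(tok[i] for tok in attended) / len(attended) for i in range(d)]
```

In this sketch the fused vector would be passed to a regression head to obtain the fine-level score, and the list returned by `select_dominant_attributes` plays the role of the explainable output mentioned in the abstract.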
Pages: 9316-9329
Page count: 14