HOW SOUND AFFECTS VISUAL ATTENTION IN OMNIDIRECTIONAL VIDEOS

Cited by: 5
Authors
Li, Jie [1]
Zhai, Guangtao [1]
Zhu, Yucheng [1]
Zhou, Jun [1]
Zhang, Xiao-Ping [2]
Affiliations
[1] Shanghai Jiao Tong Univ, Shanghai, Peoples R China
[2] Ryerson Univ, Toronto, ON, Canada
Source
2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP | 2022
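Funding
China Postdoctoral Science Foundation; National Key Research and Development Program of China; National Natural Science Foundation of China;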
Keywords
Audio-visual saliency; dataset; eye movement; omnidirectional videos; SALIENCY;
DOI
10.1109/ICIP46576.2022.9897737
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we propose a new audio-visual attention dataset that records eye movements for omnidirectional videos viewed with and without sound. We classify the videos into three types according to the number of salient objects and sound sources, and analyze the impact of sound on the distribution of visual attention and on the inter-observer consistency of viewing areas in each type of video. From quantitative and qualitative analysis, we find that when sound is present, visual attention is drawn to and concentrated on the sound source, especially when there are several visually salient objects and only one sound source. Sound also enhances the consistency of observation areas among viewers to some extent. Further study is still needed to investigate the impact of sound on visual attention and to develop a prospective audio-visual saliency model.
Pages: 3066-3070
Number of pages: 5