Are acoustics enough? Semantic effects on auditory salience in natural scenes

Cited by: 1
Authors
Kothinti, Sandeep Reddy [1]
Elhilali, Mounya [1]
Affiliations
[1] Johns Hopkins Univ, Ctr Language & Speech Proc, Dept Elect & Comp Engn, Baltimore, MD 21218 USA
Source
FRONTIERS IN PSYCHOLOGY | 2023 / Vol. 14
Keywords
auditory salience; auditory attention; audio event detection; bottom-up attention; auditory perception; VISUAL-ATTENTION; BEHAVIORAL-EXPERIMENTS; DISTRACTION; CAPTURE; MECHANISMS; ALLOCATION; OBJECTS; ONSETS; SHIFTS; BRAIN
DOI
10.3389/fpsyg.2023.1276237
Chinese Library Classification number
B84 [Psychology]
Discipline classification code
04; 0402
Abstract
Auditory salience is a fundamental property of a sound that allows it to grab a listener's attention regardless of their attentional state or behavioral goals. While previous research has shed light on acoustic factors influencing auditory salience, the semantic dimensions of this phenomenon have remained relatively unexplored, owing both to the complexity of measuring salience in audition and to the limited focus on complex natural scenes. In this study, we examine the relationship between acoustic, contextual, and semantic attributes and their impact on the auditory salience of natural audio scenes using a dichotic listening paradigm. The experiments present acoustic scenes in forward and backward directions; the latter diminishes semantic effects, providing a counterpoint to the effects observed in forward scenes. The behavioral data collected from a crowd-sourced platform reveal a striking convergence in temporal salience maps for certain sound events, while marked disparities emerge in others. Our main hypothesis posits that differences in the perceptual salience of events are predominantly driven by semantic and contextual cues, which is particularly evident in cases displaying substantial disparities between forward and backward presentations. Conversely, events exhibiting a high degree of alignment can largely be attributed to low-level acoustic attributes. To evaluate this hypothesis, we employ analytical techniques that combine rich low-level mappings from acoustic profiles with high-level embeddings extracted from a deep neural network. This integrated approach captures both acoustic and semantic attributes of acoustic scenes along with their temporal trajectories. The results demonstrate that perceptual salience is a careful interplay between low-level and high-level attributes that shapes which moments stand out in a natural soundscape. Furthermore, our findings underscore the important role of longer-term context as a critical component of auditory salience, enabling us to discern and adapt to temporal regularities within an acoustic scene. The experimental and model-based validation of semantic factors of salience paves the way for a complete understanding of auditory salience. Ultimately, the empirical and computational analyses have implications for developing large-scale models for auditory salience and audio analytics.
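The forward/backward comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' analysis: the function name `event_agreement`, the use of Pearson correlation, and the assumption that the backward salience map can be aligned by a simple time flip (ignoring listener response delay) are all hypothetical choices made for illustration. Under those assumptions, a high score for an event suggests a largely acoustic effect, while a low score suggests semantic or contextual influence.

```python
import numpy as np

def event_agreement(forward_map, backward_map, events):
    """Per-event Pearson correlation between forward and backward salience.

    Hypothetical helper, not taken from the paper.
    forward_map, backward_map: 1-D salience traces of equal length.
    backward_map is in playback order of the reversed scene, so it is
    flipped to place it on the forward scene's time axis (an assumption
    that ignores any response delay).
    events: list of (start, end) sample indices on the forward time axis.
    """
    forward = np.asarray(forward_map, dtype=float)
    aligned = np.asarray(backward_map, dtype=float)[::-1]
    if forward.shape != aligned.shape:
        raise ValueError("salience maps must have the same length")
    scores = []
    for start, end in events:
        f, b = forward[start:end], aligned[start:end]
        if f.std() == 0 or b.std() == 0:
            # Correlation is undefined for a flat trace.
            scores.append(float("nan"))
        else:
            scores.append(float(np.corrcoef(f, b)[0, 1]))
    return scores
```

With toy traces, an event whose salience profile survives time reversal scores near 1, and an event whose profile changes scores much lower; any threshold separating the two groups would have to be chosen empirically.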
Pages: 18