Data reduction of audio by exploiting musical repetition

Cited by: 0

Authors:

Stuart Cunningham

Vic Grout

Affiliation:

[1] Glyndŵr University, Creative and Applied Research for the Digital Society (CARDS)

Source:

Multimedia Tools and Applications | 2014 / Vol. 72

Keywords:

Audio; Music; Compression; Repetition; Perceptual coding

DOI:

Not available

Abstract:

This paper presents and evaluates a method of audio compression specifically designed to exploit the natural repetition that occurs within musical audio. Our system is named Audio Compression Exploiting Repetition (ACER). ACER is a perceptual technique, but rather than exploiting masking it applies the principles of Lempel-Ziv and run-length encoding to audio, operating on audio sequences in place of the numeric or character strings those methods normally handle. The ACER procedure uses a pseudo-exhaustive search process and spectral difference grading. Because ACER exploits musical structure, the amount of data reduction achieved varies from piece to piece. The system is described first, followed by results on a corpus of material. The analysis shows that moderate data reduction is achieved when the system operates within parameters designed to maintain high perceptual audio quality, and that greater data reduction is obtained at lower levels of perceptual quality. Objective quality evaluations reveal a degradation in fidelity that depends on the compression parameters.
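
The substitution idea in the abstract can be illustrated with a minimal sketch. This is not the ACER implementation, whose details are not given in this record. The fixed block size, the naive DFT, the mean absolute spectral difference, the first-match search over earlier blocks, and all function names are assumptions made purely for illustration.

```python
import cmath
import math


def magnitude_spectrum(block):
    """Naive DFT magnitude spectrum (bins 0..n/2) of a list of samples."""
    n = len(block)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(block)))
        for k in range(n // 2 + 1)
    ]


def spectral_difference(a, b):
    """Mean absolute difference between two equal-length magnitude spectra."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def encode(samples, block_size, threshold):
    """Replace blocks that closely match an earlier block with a back-reference.

    Hypothetical scheme: each token is ("raw", block) or ("ref", index of an
    earlier raw token). A larger threshold allows more substitutions and
    therefore more data reduction, at the cost of fidelity.
    """
    tokens = []
    stored = []  # (token index, spectrum) for every full-size raw block
    for start in range(0, len(samples), block_size):
        block = samples[start:start + block_size]
        full = len(block) == block_size
        match = None
        if full:
            spec = magnitude_spectrum(block)
            # Search every earlier raw block and take the first acceptable one.
            match = next(
                (i for i, s in stored
                 if spectral_difference(spec, s) <= threshold),
                None,
            )
        if match is None:
            if full:
                stored.append((len(tokens), spec))
            tokens.append(("raw", block))
        else:
            tokens.append(("ref", match))
    return tokens


def decode(tokens):
    """Rebuild the sample list by expanding back-references."""
    out = []
    for kind, payload in tokens:
        out.extend(payload if kind == "raw" else tokens[payload][1])
    return out
```

In this sketch an exact repeat of an earlier block is recovered without loss, while a block that is merely similar within `threshold` is replaced by the earlier block, so the reconstruction is lossy. Because only magnitudes are compared, phase differences are ignored, which is a simplification of this illustration and not a claim about the published method.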

Citation

Pages: 2299-2320

Page count: 21

33 references in total
