Accuracy comparisons of fingerprint based song recognition approaches using very high granularity

Authors:

Salvatore Serrano

Marco Scarpa

Affiliation:

[1] University of Messina, Department of Engineering

Source:

Multimedia Tools and Applications | 2023 / Vol. 82

Keywords:

Song recognition; Audio fingerprint; Power spectral density; Hamming distance; Binary fingerprints;

Abstract:

Music and song recognition is of wide interest to researchers and companies because of its intrinsic challenges and the economic profits it can yield. Although the basic algorithms for song recognition are simple in principle, it is quite difficult to obtain an efficient and robust approach for identifying short pieces of audio on the fly. In this paper, we compare the results obtained with a new algorithm we recently proposed against several baseline approaches, in terms of accuracy, when very short pieces of audio are processed. Experimental results, obtained using both a subset of the MTG-Jamendo dataset and a proprietary audio corpus containing 7000 songs, show that our approach outperforms the others, in particular for audio excerpts shorter than 3 s.

Pages: 31591-31606

Number of pages: 15
