TIMBRAL MODELING FOR MUSIC ARTIST RECOGNITION USING I-VECTORS

Cited by: 0

Authors:

Eghbal-zadeh, Hamid ^{[1]}

Schedl, Markus ^{[1]}

Widmer, Gerhard ^{[1]}

Affiliations:

[1] Johannes Kepler Univ Linz, Dept Computat Percept, A-4040 Linz, Austria

Source:

2015 23RD EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO) | 2015

Funding:

Austrian Science Fund;

Keywords:

music artist recognition; timbral modeling; song-level features; i-vectors; MFCC

DOI:

Not available

Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronics and communication technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

Music artist (i.e., singer) recognition is a challenging task in Music Information Retrieval (MIR). The presence of different musical instruments and the diversity of music genres and singing techniques make it difficult to retrieve artist-relevant information from a song. Many authors have tried to address this problem by using complex features or hybrid systems. In this paper, we propose new song-level timbre-related features that are built from frame-level MFCCs via so-called i-vectors. We report artist recognition results with multiple classifiers, such as K-nearest neighbor, Discriminant Analysis, and Naive Bayes, using these new features. Our approach yields considerable improvements and outperforms existing methods. We achieve an accuracy of 84.31% using MFCC features on a 20-class artist recognition task.


Pages: 1286-1290

Page count: 5

