13 entries in total
[1] End-to-End Integration of Speech Recognition, Speech Enhancement, and Self-Supervised Learning Representation [J]. INTERSPEECH 2022, 2022: 3819-3823
[2] A post-processing system to yield reduced word error rates: Recognizer output voting error reduction (ROVER) [J]. 1997 IEEE Workshop on Automatic Speech Recognition and Understanding, Proceedings, 1997: 347-354
[3] HuBERT: How Much Can a Bad Teacher Benefit ASR Pre-Training? [J]. 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2021), 2021: 6533-6537
[5] Luo Y, 2020, INT CONF ACOUST SPEE, P46, DOI 10.1109/ICASSP40776.2020.9054266
[6] Target-Speaker Voice Activity Detection: a Novel Approach for Multi-Speaker Diarization in a Dinner Party Scenario [J]. INTERSPEECH 2020, 2020: 274-278
[7] Raj D, 2023, arXiv, DOI arXiv:2212.05271
[8] DOVER-Lap: A Method for Combining Overlap-Aware Diarization Outputs [J]. 2021 IEEE Spoken Language Technology Workshop (SLT), 2021: 881-888
[9] Tian JG, 2022, arXiv, DOI arXiv:2202.04814
[10] Cross-Channel Attention-Based Target Speaker Voice Activity Detection: Experimental Results for the M2MeT Challenge [J]. 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022: 9171-9175