Hybrid-task learning for robust automatic speech recognition. (November 2020)

Record Type:: Journal Article
Title:: Hybrid-task learning for robust automatic speech recognition. (November 2020)
Main Title:: Hybrid-task learning for robust automatic speech recognition
Authors:: Pironkov, Gueorgui
Wood, Sean UN
Dupont, Stéphane
Abstract:: In order to properly train an automatic speech recognition system, speech with its annotated transcriptions is most often required. The amount of real annotated data recorded in noisy and reverberant conditions is extremely limited, especially compared to the amount of data that can be simulated by adding noise to clean annotated speech. Thus, using both real and simulated data is important in order to improve robust speech recognition, as this increases the amount and diversity of training data (thanks to the simulated data) while also benefiting from a reduced mismatch between training and operation of the system (thanks to the real data). Another promising method applied to speech recognition in noisy and reverberant conditions is multi-task learning. The idea is to train one acoustic model to simultaneously solve at least two tasks that are different but related, with speech recognition being the main task. A successful auxiliary task consists of generating clean speech features using a regression loss (as a denoising auto-encoder). This auxiliary task, however, uses clean speech as targets, which implies that real data cannot be used. To tackle this problem, a Hybrid-Task Learning system is proposed. This system switches frequently between multi-task and single-task learning depending on whether the input is simulated or real data, respectively. Having a hybrid architecture allows us to benefit from both real and simulated data while using a denoising …
Is Part Of:: Computer speech & language. Volume 64(2020)
Journal:: Computer speech & language
Issue:: Volume 64(2020)
Issue Display:: Volume 64, Issue 2020 (2020)
Year:: 2020
Volume:: 64
Issue:: 2020
Issue Sort Value:: 2020-0064-2020-0000
Page Start:
Page End:
Publication Date:: 2020-11
Subjects:: Multi-task learning -- Robust speech recognition -- Hybrid-task learning -- Denoising auto-Encoder -- Real & simulated data training -- Noise & reverberation -- CHiME4
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
Journal URLs:: http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
DOI:: 10.1016/j.csl.2020.101103
Languages:: English
ISSNs:: 0885-2308
Deposit Type:: Legaldeposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
Ingest File:: 13431.xml
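
The hybrid-task idea summarised in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `aux_weight` value, and the toy inputs are all hypothetical. The sketch assumes the speech recognition loss is a cross-entropy over class posteriors and the auxiliary denoising loss is a mean squared error against clean features, added only when a clean target exists (that is, for simulated data).

```python
import math

def cross_entropy(probs, target_index):
    """Main-task loss: negative log-likelihood of the target class."""
    return -math.log(probs[target_index])

def mean_squared_error(predicted, target):
    """Auxiliary regression loss between denoised and clean features."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)

def hybrid_task_loss(asr_probs, asr_target, denoised=None,
                     clean_target=None, aux_weight=0.1):
    """Hypothetical hybrid loss.

    Simulated data has a clean target, so both tasks are trained
    (multi-task). Real data has none, so only the main task is trained
    (single-task). aux_weight is an assumed value, not from the paper.
    """
    loss = cross_entropy(asr_probs, asr_target)
    if clean_target is not None:
        loss += aux_weight * mean_squared_error(denoised, clean_target)
    return loss

# Real recording: no clean reference, single-task loss only.
real_loss = hybrid_task_loss([0.7, 0.2, 0.1], 0)

# Simulated recording: clean features are known, so the denoising term is added.
sim_loss = hybrid_task_loss([0.7, 0.2, 0.1], 0,
                            denoised=[0.5, 1.0], clean_target=[0.0, 1.0])
```

In a training loop, each batch would be routed through the same function, with the presence or absence of `clean_target` deciding whether the auxiliary term contributes to the update.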