End-to-end neural systems for automatic children speech recognition: An empirical study. (March 2022)

Record Type:: Journal Article
Title:: End-to-end neural systems for automatic children speech recognition: An empirical study. (March 2022)
Main Title:: End-to-end neural systems for automatic children speech recognition: An empirical study
Authors:: Gurunath Shivakumar, Prashanth
Narayanan, Shrikanth
Abstract:: A key desideratum for inclusive and accessible speech recognition technology is ensuring its robust performance on children's speech. Notably, this includes the rapidly advancing neural network based end-to-end speech recognition systems. Children speech recognition is more challenging due to the larger intra- and inter-speaker variability in terms of acoustic and linguistic characteristics compared to adult speech. Furthermore, the lack of adequate and appropriate children speech resources adds to the challenge of designing robust end-to-end neural architectures. This study provides a critical assessment of automatic children speech recognition through an empirical study of contemporary state-of-the-art end-to-end speech recognition systems. Insights are provided on training data requirements, adaptation on children data, the effect of children's age, utterance lengths, different architectures and loss functions for end-to-end systems, and the role of language models in speech recognition performance.
Highlights:: Empirical study of end-to-end deep learning based children speech ASR
Training/Adaptation comparisons for various adult and children speech dataset sizes
Acoustic Model Comparison: DNN-HMM hybrid vs. End-to-End ASR
DNN Architecture Comparison: ResNets vs. Time-depth separable CNN vs. Transformers
Loss Function Comparison: Connectionist Temporal Classification vs. Seq-to-Seq
Language Model Comparison: 4gram vs. 6gram vs. Gated-CNN LM; word vs. …
Is Part Of:: Computer speech & language. Volume 72(2022)
Journal:: Computer speech & language
Issue:: Volume 72(2022)
Issue Display:: Volume 72, Issue 2022 (2022)
Year:: 2022
Volume:: 72
Issue:: 2022
Issue Sort Value:: 2022-0072-2022-0000
Page Start:
Page End:
Publication Date:: 2022-03
Subjects:: Children speech recognition -- End-to-end speech recognition -- Residual network -- Time depth separable convolutional network -- Transformer
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
Journal URLs:: http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
DOI:: 10.1016/j.csl.2021.101289
Languages:: English
ISSNs:: 0885-2308
Deposit Type:: Legal deposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
Ingest File:: 20051.xml