Acoustic data augmentation for Mandarin-English code-switching speech recognition. (April 2020)

Record Type:: Journal Article
Title:: Acoustic data augmentation for Mandarin-English code-switching speech recognition. (April 2020)
Main Title:: Acoustic data augmentation for Mandarin-English code-switching speech recognition
Authors:: Long, Yanhua
Li, Yijie
Zhang, Qiaozheng
Wei, Shuang
Ye, Hong
Yang, Jichen
Abstract:: Highlights: Three acoustic data augmentation methods were studied for code-switching ASR. A CS detection system was proposed to extract real code-switched speech segments. Semi-supervised and active learning were investigated to produce the CS transcriptions. The proposed approach significantly outperforms the conventional techniques. The effectiveness of synthesized code-switched utterances was very limited. Abstract: Code-switching (CS) is a multilingual phenomenon where a speaker uses different languages within an utterance or between alternating utterances. Developing large-scale datasets for training code-switching acoustic and language models is challenging and extremely expensive. In this paper, we focus on acoustic data augmentation for the Mandarin-English CS speech recognition task. The effectiveness of conventional acoustic data augmentation approaches is examined. More importantly, we propose a CS acoustic event detection system based on a deep neural network to extract real code-switching speech segments automatically. Then, semi-supervised and active learning techniques are investigated to generate transcriptions of these segments. Finally, a code-switching speech synthesis system is introduced to further enhance the acoustic modeling. Experimental results on the OC16-CE80 data, a Mandarin-English mixlingual speech corpus, demonstrate the effectiveness of the proposed methods.
Is Part Of:: Applied acoustics. Volume 161(2020)
Journal:: Applied acoustics
Issue:: Volume 161(2020)
Issue Display:: Volume 161, Issue 2020 (2020)
Year:: 2020
Volume:: 161
Issue:: 2020
Issue Sort Value:: 2020-0161-2020-0000
Page Start:
Page End:
Publication Date:: 2020-04
Subjects:: Data augmentation -- Code-switching -- Acoustic event detection -- Speech recognition
Acoustical engineering -- Periodicals
Periodicals
620.2
Journal URLs:: http://www.sciencedirect.com/science/journal/0003682X
http://www.elsevier.com/journals
http://www.elsevier.com/homepage/elecserv.htt
DOI:: 10.1016/j.apacoust.2019.107175
Languages:: English
ISSNs:: 0003-682X
Deposit Type:: Legal deposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - 1571.400000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
Ingest File:: 12856.xml