Acoustic data augmentation for Mandarin-English code-switching speech recognition. (April 2020)
- Record Type:
- Journal Article
- Title:
- Acoustic data augmentation for Mandarin-English code-switching speech recognition. (April 2020)
- Main Title:
- Acoustic data augmentation for Mandarin-English code-switching speech recognition
- Authors:
- Long, Yanhua
Li, Yijie
Zhang, Qiaozheng
Wei, Shuang
Ye, Hong
Yang, Jichen
- Abstract:
- Highlights: Three acoustic data augmentation methods were studied for code-switching ASR. A CS detection system was proposed to extract real code-switched speech segments. Semi-supervised and active learning were investigated to produce the CS transcriptions. The proposed approach significantly outperforms the conventional techniques. The effectiveness of synthesized code-switched utterances was very limited.
Abstract: Code-switching (CS) is a multilingual phenomenon where a speaker uses different languages in an utterance or between alternating utterances. Developing large-scale datasets for training code-switching acoustic and language models is challenging and extremely expensive. In this paper, we focus on acoustic data augmentation for the Mandarin-English CS speech recognition task. The effectiveness of conventional acoustic data augmentation approaches is examined. More importantly, we propose a CS acoustic event detection system based on a deep neural network to extract real code-switching speech segments automatically. Then, semi-supervised and active learning techniques are investigated to generate transcriptions of these segments. Finally, a code-switching speech synthesis system is introduced to further enhance the acoustic modeling. Experimental results on the OC16-CE80 data, a Mandarin-English mixlingual speech corpus, demonstrate the effectiveness of the proposed methods.
- Is Part Of:
- Applied acoustics. Volume 161(2020)
- Journal:
- Applied acoustics
- Issue:
- Volume 161(2020)
- Issue Display:
- Volume 161, Issue 2020 (2020)
- Year:
- 2020
- Volume:
- 161
- Issue:
- 2020
- Issue Sort Value:
- 2020-0161-2020-0000
- Page Start:
- Page End:
- Publication Date:
- 2020-04
- Subjects:
- Data augmentation -- Code-switching -- Acoustic event detection -- Speech recognition
Acoustical engineering -- Periodicals
Periodicals
620.2
- Journal URLs:
- http://www.sciencedirect.com/science/journal/0003682X
http://www.elsevier.com/journals
http://www.elsevier.com/homepage/elecserv.htt
- DOI:
- 10.1016/j.apacoust.2019.107175
- Languages:
- English
- ISSNs:
- 0003-682X
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 1571.400000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 12856.xml