Arabic speech recognition by end-to-end, modular systems and human. (January 2022)
- Record Type:
- Journal Article
- Title:
- Arabic speech recognition by end-to-end, modular systems and human. (January 2022)
- Main Title:
- Arabic speech recognition by end-to-end, modular systems and human
- Authors:
- Hussein, Amir
Watanabe, Shinji
Ali, Ahmed
- Abstract:
- Recent advances in automatic speech recognition (ASR) have achieved accuracy levels comparable to those of human transcribers, which has led researchers to debate whether the machine has reached human performance. Previous work focused on the English language and on modular hidden Markov model-deep neural network (HMM–DNN) systems. In this paper, we perform a comprehensive benchmarking of end-to-end transformer ASR, modular HMM–DNN ASR, and human speech recognition (HSR) on the Arabic language and its dialects. For HSR, we evaluate linguist performance and lay native-speaker performance on a new dataset collected as part of this study. For ASR, the end-to-end system achieved word error rates (WER) of 12.5%, 27.5%, and 33.8%, a new performance milestone for the MGB2, MGB3, and MGB5 challenges, respectively. Our results suggest that human performance in the Arabic language is still considerably better than the machine's, with an absolute WER gap of 3.5% on average.
Highlights: Comprehensive benchmarking of end-to-end, modular, and human speech recognition on the Arabic language. New milestone in Arabic speech recognition with E2E transformer-based ASR. The machine arguably outperforms the native speaker. Machine mistakes showed high similarity with the expert linguist transcription. Expert linguist performance in the Arabic language is still considerably better than the machine's.
- Is Part Of:
- Computer speech & language. Volume 71 (2022)
- Journal:
- Computer speech & language
- Issue:
- Volume 71 (2022)
- Issue Display:
- Volume 71, Issue 2022 (2022)
- Year:
- 2022
- Volume:
- 71
- Issue:
- 2022
- Issue Sort Value:
- 2022-0071-2022-0000
- Page Start:
- Page End:
- Publication Date:
- 2022-01
- Subjects:
- Dialectal Arabic -- End-to-end speech recognition -- Human speech recognition -- Modern Standard Arabic -- Transformer
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
- Journal URLs:
- http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.csl.2021.101272
- Languages:
- English
- ISSNs:
- 0885-2308
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 19368.xml