A classification benchmark for Arabic alphabet phonemes with diacritics in deep neural networks. (January 2022)
- Record Type:
- Journal Article
- Title:
- A classification benchmark for Arabic alphabet phonemes with diacritics in deep neural networks. (January 2022)
- Main Title:
- A classification benchmark for Arabic alphabet phonemes with diacritics in deep neural networks
- Authors:
- Almekhlafi, Eiad
AL-Makhlafi, Moeen
Zhang, Erlei
Wang, Jun
Peng, Jinye
- Abstract:
- Although the Arabic language is the fourth most popular language in the world, it has not received sufficient attention in artificial intelligence research, especially in automatic speech recognition (ASR). The key feature of the Arabic language is that its words are pronounced exactly as they are written. Above all, when the diacritics are taken into account, there are no words with similar pronunciation and writing. This motivates building an Arabic ASR system by recognizing its alphabet phonetics. Therefore, Arabic alphabet phoneme classification must be studied, which is what this paper aims to do. In this paper, we create a new dataset, called the Arabic alphabet phonetics dataset (AAPD). AAPD was collected by taking sound recordings of 1420 persons. We build several Arabic alphabet phoneme classification systems using three feature extraction techniques and four deep neural networks. Based on AAPD, we designed numerous experiments to compare the performance of feature extraction and classification methods, which can be used as a benchmark. Experimental results showed that the Mel-frequency Cepstral Coefficient (MFCC) is the most effective feature extraction technique because it yields the highest accuracy; in particular, training time is lowest when the number of Mel bands is 20. Additionally, the model that achieved the highest accuracy with the least computational load is the proposed VGG-based model, which achieved an accuracy of 95.68%.
Highlights: Create a novel phoneme dataset of the Arabic alphabet with diacritics, called AAPD. Evaluate the performance of three popular feature extraction techniques. Provide a classification benchmark for the AAPD dataset using deep neural networks.
- Is Part Of:
- Computer speech & language. Volume 71(2022)
- Journal:
- Computer speech & language
- Issue:
- Volume 71(2022)
- Issue Display:
- Volume 71, Issue 2022 (2022)
- Year:
- 2022
- Volume:
- 71
- Issue:
- 2022
- Issue Sort Value:
- 2022-0071-2022-0000
- Page Start:
- Page End:
- Publication Date:
- 2022-01
- Subjects:
- Arabic alphabet phonemes -- Deep neural networks -- Auto speech recognition -- Mel-frequency Cepstral Coefficient -- Mel-spectrogram
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
- Journal URLs:
- http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.csl.2021.101274
- Languages:
- English
- ISSNs:
- 0885-2308
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 19058.xml