Psycho-acoustics inspired automatic speech recognition. (July 2021)
- Record Type:
- Journal Article
- Title:
- Psycho-acoustics inspired automatic speech recognition. (July 2021)
- Main Title:
- Psycho-acoustics inspired automatic speech recognition
- Authors:
- Coro, Gianpaolo
Massoli, Fabio Valerio
Origlia, Antonio
Cutugno, Francesco
- Abstract:
- Understanding the human spoken language recognition process remains a distant scientific goal. Nowadays, commercial automatic speech recognisers (ASRs) achieve high performance at recognising clean speech, but their approaches are poorly related to human speech recognition. They commonly process the phonetic structure of speech while neglecting supra-segmental and syllabic tracts integral to human speech recognition. As a result, these ASRs achieve low performance on spontaneous speech and require enormous costs to build up phonetic and pronunciation models and capture the large variability of human speech. This paper presents a novel ASR that addresses these issues and questions conventional ASR approaches. It uses alternative acoustic models and an exhaustive decoding algorithm to process speech at a syllabic temporal scale (100–250 ms) through a multi-temporal approach inspired by psycho-acoustic studies. Performance comparison on the recognition of spoken Italian numbers (from 0 to 1 million) demonstrates that our approach is cost-effective, outperforms standard phonetic models, and reaches state-of-the-art performance.
Highlights: We propose a novel Automatic Speech Recognizer inspired by psycho-acoustic studies. It embeds multiple-time-scale speech processing inspired by human speech processing. Its optimal architecture uses an LSTM-based acoustic model and 1 h of training audio. It has comparable performance to conventional ASRs on the recognition of numbers up to 1 million.
- Is Part Of:
- Computers & electrical engineering. Volume 93(2021)
- Journal:
- Computers & electrical engineering
- Issue:
- Volume 93(2021)
- Issue Display:
- Volume 93, Issue 2021 (2021)
- Year:
- 2021
- Volume:
- 93
- Issue:
- 2021
- Issue Sort Value:
- 2021-0093-2021-0000
- Page Start:
- Page End:
- Publication Date:
- 2021-07
- Subjects:
- Automatic speech recognition -- Deep learning -- Long short term memory -- Convolutional neural networks -- Factorial hidden Markov models -- Hidden Markov models -- Speech -- Psycho-acoustics -- Syllables
Computer engineering -- Periodicals
Electrical engineering -- Periodicals
Electrical engineering -- Data processing -- Periodicals
Ordinateurs -- Conception et construction -- Périodiques
Électrotechnique -- Périodiques
Électrotechnique -- Informatique -- Périodiques
Computer engineering
Electrical engineering
Electrical engineering -- Data processing
Periodicals
Electronic journals
621.302854
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00457906/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.compeleceng.2021.107238
- Languages:
- English
- ISSNs:
- 0045-7906
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.680000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 18863.xml