A Spanish multispeaker database of esophageal speech. (March 2021)
- Record Type:
- Journal Article
- Title:
- A Spanish multispeaker database of esophageal speech. (March 2021)
- Main Title:
- A Spanish multispeaker database of esophageal speech
- Authors:
- Serrano García, Luis
Raman, Sneha
Hernáez Rioja, Inma
Navas Cordón, Eva
Sanchez, Jon
Saratxaga, Ibon
- Abstract:
- Highlights: A novel database of Spanish esophageal voices designed for developing new software for speakers with this pathology. Includes recordings of 100 sentences from 30 esophageal speakers. The size and variety of the database make it unique. Includes the phonetic labels of the recordings. Automatic phonetic labelling procedure is described. The main acoustic characteristics of the voices are described and compared with a reduced set of healthy voices. The database has successfully been used for reducing Automatic Speech Recognition errors for esophageal speakers.
Abstract: A laryngectomee is a person whose larynx has been removed by surgery, usually due to laryngeal cancer. After surgery, most laryngectomees are able to speak again, using techniques that are learned with the help of a speech therapist. This is termed alaryngeal speech, and esophageal speech (ES) is one of several alaryngeal speech production modes. A considerable amount of research has been dedicated to the study of alaryngeal speech, with a wide range of aims such as helping speech therapists with evaluation and diagnosis, and improving its quality and intelligibility using digital signal processing techniques. We present a database of Spanish ES voices, named AhoSLABI, which is designed to allow the development of new support technologies for this speech impairment. The database primarily consists of recordings of 31 laryngectomees (27 males and 4 females) pronouncing phonetically balanced sentences. Additionally, it includes parallel recordings of the sentences by 9 healthy speakers (6 males and 3 females) to facilitate speech processing tasks that require small parallel corpora, such as voice conversion or synthetic speech adaptation. Apart from the sentences, the database includes sustained vowels and a small set of isolated words, which can be valuable for research on ES analysis, diagnosis and evaluation. The paper describes the main contents of the database, the recording protocols and procedure, as well as the labeling process. The main acoustic characteristics of the voices, such as speaking rate and the durations of the recordings, phones and silences, are compared with those of a reduced set of healthy voices. In addition, we describe an experiment using the database to improve the performance of an ASR system for ES speakers.
This new resource will be made available to the scientific community with the hope that it will be used to improve the quality of life of laryngectomees.
- Is Part Of:
- Computer speech & language. Volume 66(2021)
- Journal:
- Computer speech & language
- Issue:
- Volume 66(2021)
- Issue Display:
- Volume 66, Issue 2021 (2021)
- Year:
- 2021
- Volume:
- 66
- Issue:
- 2021
- Issue Sort Value:
- 2021-0066-2021-0000
- Page Start:
- Page End:
- Publication Date:
- 2021-03
- Subjects:
- Esophageal speech -- Voice conversion -- Speech databases -- Speech intelligibility -- Speech analysis
00-01 -- 99-00
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
- Journal URLs:
- http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.csl.2020.101168
- Languages:
- English
- ISSNs:
- 0885-2308
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 15413.xml