A training-based speech regeneration approach with cascading mapping models. (August 2017)
- Record Type:
- Journal Article
- Title:
- A training-based speech regeneration approach with cascading mapping models. (August 2017)
- Main Title:
- A training-based speech regeneration approach with cascading mapping models
- Authors:
- Sharifzadeh, Hamid R.
HajiRassouliha, Amir
McLoughlin, Ian V.
Ardekani, Iman T.
Allen, Jacqueline E.
Sarrafzadeh, Abdolhossein
- Abstract:
- Highlights: A two-step cascading reconstruction algorithm for converting whispered speech to normal speech based on a training-based approach is proposed. To lessen the effect of inconsistent spectral features and time alignment between normal and whispered speech, an intermediary layer called "artificial whisper" (or whisperised speech) is introduced. The results of subjective and objective evaluations are presented to compare the efficiency of the proposed algorithm with other computational methods and Electrolarynx.
Abstract: Computational speech reconstruction algorithms have the ultimate aim of returning natural sounding speech to aphonic and dysphonic patients as well as those who can only whisper. In particular, individuals who have lost glottis function due to disease or surgery, retain the power of vocal tract modulation to some degree but they are unable to speak anything more than hoarse whispers without prosthetic aid. While whispering can be seen as a natural and secondary aspect of speech communications for most people, it becomes the primary mechanism of communications for those who have impaired voice production mechanisms, such as laryngectomees. In this paper, by considering the current limitations of speech reconstruction methods, a novel algorithm for converting whispers to normal speech is proposed and the efficiency of the algorithm is explored. The algorithm relies upon cascading mapping models and makes use of artificially generated whispers (called whisperised speech) to regenerate natural phonated speech from whispers. Using a training-based approach, the mapping models exploit whisperised speech to overcome frame to frame time alignment problems that are inherent in the speech reconstruction process. This algorithm effectively regenerates missing information in the conventional frameworks of phonated speech reconstruction, and is able to outperform the current state-of-the-art regeneration methods using both subjective and objective criteria.
- Is Part Of:
- Computers & electrical engineering. Volume 62(2017)
- Journal:
- Computers & electrical engineering
- Issue:
- Volume 62(2017)
- Issue Display:
- Volume 62, Issue 2017 (2017)
- Year:
- 2017
- Volume:
- 62
- Issue:
- 2017
- Issue Sort Value:
- 2017-0062-2017-0000
- Page Start:
- 601
- Page End:
- 611
- Publication Date:
- 2017-08
- Subjects:
- Speech reconstruction -- Whispers -- Electrolarynx -- Laryngectomy -- Time alignment
Computer engineering -- Periodicals
Electrical engineering -- Periodicals
Electrical engineering -- Data processing -- Periodicals
Ordinateurs -- Conception et construction -- Périodiques
Électrotechnique -- Périodiques
Électrotechnique -- Informatique -- Périodiques
Computer engineering
Electrical engineering
Electrical engineering -- Data processing
Periodicals
Electronic journals
621.302854
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00457906/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.compeleceng.2017.06.007
- Languages:
- English
- ISSNs:
- 0045-7906
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.680000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 4714.xml