Fast i-vector denoising using MAP estimation and a noise distributions database for robust speaker recognition. (September 2017)
- Record Type:
- Journal Article
- Title:
- Fast i-vector denoising using MAP estimation and a noise distributions database for robust speaker recognition. (September 2017)
- Main Title:
- Fast i-vector denoising using MAP estimation and a noise distributions database for robust speaker recognition
- Authors:
- Ben Kheder, Waad
Matrouf, Driss
Bousquet, Pierre-Michel
Bonastre, Jean-François
Ajili, Moez
- Abstract:
- Highlights: We use a normal distribution model for both clean and noisy i-vectors. We use an additive model of the noise in the i-vector space. We use a MAP estimator to clean-up noisy i-vectors based on both clean i-vectors and noise distributions in the i-vector space.
Abstract: Once the i-vector paradigm has been introduced in the field of speaker recognition, many techniques have been proposed to deal with additive noise within this framework. Due to the complexity of its effect in the i-vector space, a lot of effort has been put into dealing with noise in other domains (speech enhancement, feature compensation, robust i-vector extraction and robust scoring). As far as we know, there was no serious attempt to handle the noise problem directly in the i-vector space without relying on data distributions computed on a prior domain. The aim of this paper is twofold. First, it proposes a full-covariance Gaussian modeling of the clean i-vectors and noise distribution in the i-vector space and introduces a technique to estimate a clean i-vector given the noisy version and the noise density function using the MAP approach. Based on NIST data, we show that it is possible to improve by up to 60% the baseline system performance. Second, in order to make this algorithm usable in a real application and reduce the computational time needed by i-MAP, we propose an extension that requires building a noise distribution database in the i-vector space in an off-line step and using it later in the test phase. We show that it is possible to achieve comparable results using this approach (up to 57% of relative EER improvement) with a sufficiently large noise distribution database.
- Is Part Of:
- Computer speech & language. Volume 45(2017)
- Journal:
- Computer speech & language
- Issue:
- Volume 45(2017)
- Issue Display:
- Volume 45, Issue 2017 (2017)
- Year:
- 2017
- Volume:
- 45
- Issue:
- 2017
- Issue Sort Value:
- 2017-0045-2017-0000
- Page Start:
- 104
- Page End:
- 122
- Publication Date:
- 2017-09
- Subjects:
- i-vectors -- MAP adaptation -- Speaker recognition -- Additive noise
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
- Journal URLs:
- http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.csl.2016.12.007
- Languages:
- English
- ISSNs:
- 0885-2308
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 2060.xml
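
Illustration (not part of the catalogue record): the abstract describes an additive noise model in the i-vector space with Gaussian models for the clean i-vectors and the noise, and a MAP estimate of the clean i-vector. The following is a minimal sketch of that idea, assuming y = x + n with independent x ~ N(mu_x, sigma_x) and n ~ N(mu_n, sigma_n). Under those assumptions the MAP estimate has the standard closed form x_hat = (sigma_x^-1 + sigma_n^-1)^-1 (sigma_n^-1 (y - mu_n) + sigma_x^-1 mu_x). The function name `map_denoise` and all parameter names are hypothetical and are not taken from the paper; how the paper estimates these parameters and looks them up in the noise distributions database is not reproduced here.

```python
import numpy as np


def map_denoise(y, mu_x, sigma_x, mu_n, sigma_n):
    """MAP estimate of a clean vector x from a noisy vector y = x + n.

    Assumes (illustrative assumption, not the paper's code):
      x ~ N(mu_x, sigma_x)  clean i-vector prior, full covariance
      n ~ N(mu_n, sigma_n)  noise in the i-vector space, independent of x
    """
    prec_x = np.linalg.inv(sigma_x)  # precision of the clean prior
    prec_n = np.linalg.inv(sigma_n)  # precision of the noise model
    # Solve (prec_x + prec_n) x_hat = prec_n (y - mu_n) + prec_x mu_x
    lhs = prec_x + prec_n
    rhs = prec_n @ (y - mu_n) + prec_x @ mu_x
    return np.linalg.solve(lhs, rhs)


if __name__ == "__main__":
    dim = 2  # toy dimension; real i-vectors typically have hundreds of dimensions
    y = np.array([2.0, 4.0])
    x_hat = map_denoise(y, np.zeros(dim), np.eye(dim), np.zeros(dim), np.eye(dim))
    print(x_hat)
```

With equal identity covariances and zero means, the estimate lands halfway between the prior mean and the noisy observation; as the noise covariance shrinks, the estimate approaches y - mu_n.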