Speaker clustering quality estimation with logistic regression. (January 2021)
- Record Type:
- Journal Article
- Title:
- Speaker clustering quality estimation with logistic regression. (January 2021)
- Main Title:
- Speaker clustering quality estimation with logistic regression
- Authors:
- Cohen, Yishai
Lapidot, Itshak
- Abstract:
- This paper focuses on estimating the quality of a clustering process. The task is to cluster short speech segments that belong to different speakers. A variety of statistical parameters are estimated from the output of the clustering process. These parameters are used to train a logistic regression model that serves as a clustering quality estimation system. In this paper, mean-shift clustering with either a cosine distance or a probabilistic linear discriminant analysis (PLDA) score as the similarity measure, as well as stochastic vector quantization (VQ) with cosine distance, is applied to cluster the short speaker segments, which are represented by i-vectors. The quality of the clustering is measured using the average cluster purity (ACP), the average speaker purity (ASP), and K, which is the geometric mean of ASP and ACP. We show that these measures can be estimated fairly well by applying logistic regression. Moreover, clustering quality may be well estimated even if the logistic regression was trained using parameters derived from a different clustering algorithm. This is very important, as it allows the use of a single quality estimation system without the need for retraining when the clustering method is changed. Additionally, we show how the clustering quality estimator can serve as an estimator of the number of clusters. For VQ-based clustering, the number of clusters has to be predefined. We perform the clustering with different numbers of clusters. The best number of clusters is estimated as the one whose clustering achieves the highest estimated K value. We show that this approach estimates the best number of clusters accurately. …
- Is Part Of:
- Computer speech & language. Volume 65(2021)
- Journal:
- Computer speech & language
- Issue:
- Volume 65(2021)
- Issue Display:
- Volume 65, Issue 2021 (2021)
- Year:
- 2021
- Volume:
- 65
- Issue:
- 2021
- Issue Sort Value:
- 2021-0065-2021-0000
- Page Start:
- Page End:
- Publication Date:
- 2021-01
- Subjects:
- Clustering quality estimation -- Speaker clustering -- Mean-shift -- Stochastic vector quantization (VQ) -- Probabilistic linear discriminant analysis (PLDA) -- Cosine distance -- Logistic regression
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
- Journal URLs:
- http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.csl.2020.101139
- Languages:
- English
- ISSNs:
- 0885-2308
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 16859.xml
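
The abstract names three quality measures: ACP, ASP, and K, the geometric mean of the two. The sketch below shows one way they could be computed from reference speaker labels. It assumes the commonly used purity definitions, where n_ij is the number of segments of speaker j in cluster i, ACP = (1/N) Σ_i Σ_j n_ij² / n_i. and ASP = (1/N) Σ_j Σ_i n_ij² / n_.j. This record does not confirm that the paper uses exactly these formulas, and the function name `clustering_purity` is hypothetical.

```python
from collections import Counter
from math import sqrt


def clustering_purity(cluster_ids, speaker_ids):
    """Return (ACP, ASP, K) for parallel sequences of cluster and speaker labels.

    Assumption: the standard purity definitions are used; the catalogue
    record only states that K is the geometric mean of ASP and ACP.
    """
    if len(cluster_ids) != len(speaker_ids) or len(cluster_ids) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")

    n_total = len(cluster_ids)
    pair_counts = Counter(zip(cluster_ids, speaker_ids))  # n_ij
    cluster_sizes = Counter(cluster_ids)                  # n_i. (segments per cluster)
    speaker_sizes = Counter(speaker_ids)                  # n_.j (segments per speaker)

    # Average cluster purity: how dominated each cluster is by a single speaker.
    acp = sum(n * n / cluster_sizes[c] for (c, _), n in pair_counts.items()) / n_total
    # Average speaker purity: how concentrated each speaker is in a single cluster.
    asp = sum(n * n / speaker_sizes[s] for (_, s), n in pair_counts.items()) / n_total

    # K is the geometric mean of ACP and ASP, as stated in the abstract.
    return acp, asp, sqrt(acp * asp)


if __name__ == "__main__":
    # Two speakers merged into one cluster: cluster purity drops, speaker purity does not.
    print(clustering_purity([0, 0, 0, 0], ["a", "a", "b", "b"]))
```

Under these definitions, merging two equally sized speakers into one cluster gives ACP = 0.5 and ASP = 1, while a perfect clustering gives 1 for all three measures. The paper's estimator predicts such values without reference labels; this sketch covers only the reference computation.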