Impact of Word Error Rate on theme identification task of highly imperfect human–human conversations. (July 2016)

Record Type:: Journal Article
Title:: Impact of Word Error Rate on theme identification task of highly imperfect human–human conversations. (July 2016)
Main Title:: Impact of Word Error Rate on theme identification task of highly imperfect human–human conversations
Authors:: Morchid, Mohamed
Dufour, Richard
Linarès, Georges
Abstract:: Highlights: Review of the impact of dialogue representations and classification methods. We discuss the impact of discriminative words in terms of transcription accuracy. Original study evaluating the impact of the WER in the LDA topic space. Abstract: A review is proposed of the impact of word representations and classification methods on the task of theme identification in telephone conversation services with highly imperfect automatic transcriptions. We first compare two word-based representations: the classical Term Frequency-Inverse Document Frequency with Gini purity criteria (TF-IDF-Gini) method and the latent Dirichlet allocation (LDA) approach. We then introduce a classification method that takes advantage of the LDA topic space representation, highlighted as the best word representation. To do so, two assumptions about topic representation led us to choose a Gaussian Process (GP) based method. Its performance is compared with a classical Support Vector Machine (SVM) classification method. Experiments showed that the GP approach is a better solution for dealing with the multiple-theme complexity of a dialogue, whatever the conditions studied (manual or automatic transcriptions) (Morchid et al., 2014). In order to better understand the results obtained using different word representation methods and classification approaches, we then discuss the impact of discriminative and non-discriminative words extracted by both word representation methods in terms …
Is Part Of:: Computer speech & language. Volume 38(2016)
Journal:: Computer speech & language
Issue:: Volume 38(2016)
Issue Display:: Volume 38, Issue 2016 (2016)
Year:: 2016
Volume:: 38
Issue:: 2016
Issue Sort Value:: 2016-0038-2016-0000
Page Start:: 68
Page End:: 85
Publication Date:: 2016-07
Subjects:: Speech analytics -- Human–human dialogue -- Latent Dirichlet allocation -- Topic representation -- Principal component analysis -- Classification performance study
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
Journal URLs:: http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
DOI:: 10.1016/j.csl.2015.12.001
Languages:: English
ISSNs:: 0885-2308
Deposit Type:: Legal deposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
Ingest File:: 2379.xml