Room-localized spoken command recognition in multi-room, multi-microphone environments. (November 2017)

Record Type:: Journal Article
Title:: Room-localized spoken command recognition in multi-room, multi-microphone environments. (November 2017)
Main Title:: Room-localized spoken command recognition in multi-room, multi-microphone environments
Authors:: Rodomagoulakis, Isidoros
Katsamanis, Athanasios
Potamianos, Gerasimos
Giannoulis, Panagiotis
Tsiami, Antigoni
Maragos, Petros
Abstract:: Highlights: Always-listening recognition pipeline for multi-room smart spaces. Room-localized operation based on multi-room speech activity detection. Channel selection and decision fusion approaches in all pipeline components. Robust acoustic modeling based on far-field data simulation and per-channel adaptation. Systematic pipeline evaluation and optimization on both simulated and real corpora. Abstract: The paper focuses on the design of a practical system pipeline for always-listening, far-field spoken command recognition in everyday smart indoor environments that consist of multiple rooms equipped with sparsely distributed microphone arrays. Such environments, for example homes and multi-room offices, present challenging acoustic scenes to state-of-the-art speech recognizers, especially under always-listening operation, due to low signal-to-noise ratios, frequent overlaps of target speech, acoustic events, and background noise, as well as inter-room interference and reverberation. In addition, recognition of target commands often needs to be accompanied by their spatial localization, at least at the room level, to account for users in different rooms, providing command disambiguation and room-localized feedback. To address the above requirements, the use of parallel recognition pipelines is proposed, one per room of interest. The approach is enabled by a room-dependent speech activity detection module that employs appropriate multichannel features to determine speech …
Is Part Of:: Computer speech & language. Volume 46 (2017)
Journal:: Computer speech & language
Issue:: Volume 46 (2017)
Issue Display:: Volume 46, Issue 2017 (2017)
Year:: 2017
Volume:: 46
Issue:: 2017
Issue Sort Value:: 2017-0046-2017-0000
Page Start:: 419
Page End:: 443
Publication Date:: 2017-11
Subjects:: Smart homes -- Distant speech recognition -- Speech activity detection -- Keyword spotting -- Multichannel processing -- Decision fusion -- Beamforming -- Channel selection
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
Journal URLs:: http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
DOI:: 10.1016/j.csl.2017.02.004
Languages:: English
ISSNs:: 0885-2308
Deposit Type:: Legal deposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
Ingest File:: 2908.xml