Combining evidences from magnitude and phase information using VTEO for person recognition using humming. (November 2018)

Record Type:: Journal Article
Title:: Combining evidences from magnitude and phase information using VTEO for person recognition using humming. (November 2018)
Main Title:: Combining evidences from magnitude and phase information using VTEO for person recognition using humming
Authors:: Patil, Hemant A.
Madhavi, Maulik C.
Abstract:: Highlights: A novel feature-based approach for person recognition using humming is presented. The newly proposed VTEO is used to compute subband energy in the time domain. The phase information is embedded implicitly in the subband time-domain signal. Various evaluation factors such as noise robustness, feature vector dimension, optimal DI, static vs. dynamic features, etc., were examined. Abstract: Most state-of-the-art speaker recognition systems use natural speech signals (i.e., real speech, spontaneous speech or contextual speech) from the subjects. In this paper, recognition of a person is attempted from his or her hum with the help of machines. This kind of application can be useful for designing a person-dependent Query-by-Humming (QBH) system and hence can play an important role in music information retrieval (MIR) systems. In addition, it can also be useful for other interesting speech technology applications such as human-computer interaction, speech prosody analysis of disordered speech, and speaker forensics. This paper develops a new feature extraction technique to exploit perceptually meaningful (due to mel frequency warping, which imitates the human perception process for hearing) phase spectrum information along with magnitude spectrum information from the hum signal. In particular, the structure of the state-of-the-art feature set, namely, Mel Frequency Cepstral Coefficients (MFCCs), is modified to capture the phase spectrum information. In addition, a new energy measure, namely, Variable …
Is Part Of:: Computer speech & language. Volume 52(2018)
Journal:: Computer speech & language
Issue:: Volume 52(2018)
Issue Display:: Volume 52, Issue 2018 (2018)
Year:: 2018
Volume:: 52
Issue:: 2018
Issue Sort Value:: 2018-0052-2018-0000
Page Start:: 225
Page End:: 256
Publication Date:: 2018-11
Subjects:: Music information retrieval (MIR) -- Person recognition -- Humming -- Mel filterbank -- Variable length Teager Energy Operator (VTEO) -- Polynomial classifier
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
Journal URLs:: http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
DOI:: 10.1016/j.csl.2017.06.009
Languages:: English
ISSNs:: 0885-2308
Deposit Type:: Legal deposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
Ingest File:: 17055.xml