Aggregating discriminative embedding by triple-domain feature joint learning with bidirectional sampling for speaker verification. (May 2023)

Record Type:: Journal Article
Title:: Aggregating discriminative embedding by triple-domain feature joint learning with bidirectional sampling for speaker verification. (May 2023)
Main Title:: Aggregating discriminative embedding by triple-domain feature joint learning with bidirectional sampling for speaker verification
Authors:: Zi, Yunfei
Xiong, Shengwu
Abstract:: Highlights: Improving speaker recognition performance. Proposing triple-domain feature joint learning to aggregate discriminative embeddings for speaker verification. Proposing a bidirectional sampling multi-scale network to capture effective information at different stages and scales. Designing a Fisher feature fusion method to further enhance speaker-individual information and reduce speaker-commonality information. Designing the TribiNet model to achieve robustness, high performance, and accuracy in speaker recognition. Abstract: Each axis of speech, including time-domain, frequency-domain, and spectral-domain data, represents a different physical meaning and carries different dimensional information. The time domain focuses on the physical signal versus time, the frequency domain focuses on the amount of signal in a given frequency band, and the spectral domain focuses on the global power of the speech. Using only the spectrogram to represent all the information in speech for speaker recognition therefore loses much of the detail carried by the other dimensions. To tackle this limitation, we propose triple-domain feature joint learning to enhance discriminative embeddings from more dimensions for text-independent speaker verification. To further aggregate discriminative embeddings, each domain uses a novel bidirectional sampling multi-scale feature aggregation network based on Fisher feature fusion to project spectrum features to more discriminative embeddings, termed …
Is Part Of:: Biomedical signal processing and control. Volume 83(2023)
Journal:: Biomedical signal processing and control
Issue:: Volume 83(2023)
Issue Display:: Volume 83, Issue 2023 (2023)
Year:: 2023
Volume:: 83
Issue:: 2023
Issue Sort Value:: 2023-0083-2023-0000
Page Start:
Page End:
Publication Date:: 2023-05
Subjects:: Discriminative embedding -- Triple-domain feature -- Joint learning -- Bidirectional sampling -- Speaker verification
Signal processing -- Periodicals
Biomedical engineering -- Periodicals
Signal Processing, Computer-Assisted -- Periodicals
Image Processing, Computer-Assisted -- Periodicals
Biomedical Engineering -- Periodicals
610.28
Journal URLs:: http://www.sciencedirect.com/science/journal/17468094
http://www.elsevier.com/journals
http://www.sciencedirect.com/science?_ob=PublicationURL&_tockey=%23TOC%2329675%232006%23999989998%23626449%23FLA%23&_cdi=29675&_pubType=J&_auth=y&_acct=C000045259&_version=1&_urlVersion=0&_userid=836873&md5=664b5cf9a57fc91971a17faf20c32ec1
DOI:: 10.1016/j.bspc.2023.104703
Languages:: English
ISSNs:: 1746-8094
Deposit Type:: Legaldeposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - 2087.880400
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
Ingest File:: 26178.xml