Multi-style learning with denoising autoencoders for acoustic modeling in the internet of things (IoT). (November 2017)
- Record Type:
- Journal Article
- Title:
- Multi-style learning with denoising autoencoders for acoustic modeling in the internet of things (IoT). (November 2017)
- Main Title:
- Multi-style learning with denoising autoencoders for acoustic modeling in the internet of things (IoT)
- Authors:
- Lin, Payton
Lyu, Dau-Cheng
Chen, Fei
Wang, Syu-Siang
Tsao, Yu
- Abstract:
- Highlights: Multi-style learning (multi-style training + deep learning) relies on multiple layers of nonlinear data transformations to synthesize additional examples of representative variations. Rather than artificially creating data samples, a deep denoising autoencoder (DAE) is used to extract and organize the most discriminative information in a training set. Multi-style learning combines the DAE-synthesized data with the original data to make class boundaries less sensitive to corruptions by forcing the back-end model to emphasize the most discriminative patterns. The proposed data-mixed deep neural networks (DNNs) consistently improved performance without requiring any preprocessing on the test sets.
Abstract: We propose a multi-style learning (multi-style training + deep learning) procedure that relies on deep denoising autoencoders (DAEs) to extract and organize the most discriminative information in a training database. Traditionally, multi-style training procedures require either collecting or artificially creating data samples (e.g., by noise injection or data combination) and training a deep neural network (DNN) with all of these different conditions. To expand the applicability of deep learning, the present study instead adopts a DAE to augment the original training set. First, a DAE is used to synthesize data that captures useful structure in the input distribution. Next, this synthetic data is mixed into the original training set to exploit the powerful capabilities of DNN classifiers to learn the complex decision boundaries in heterogeneous conditions. By assigning a DAE to synthesize additional examples of representative variations, multi-style learning makes class boundaries less sensitive to corruptions by forcing back-end DNNs to emphasize the most discriminative patterns. Moreover, this deep learning technique mitigates the cost and time of data collection and is easy to incorporate into the internet of things (IoT). Results showed that these data-mixed DNNs provided consistent performance improvements without requiring any preprocessing on the test sets.
- Is Part Of:
- Computer speech & language. Volume 46(2017)
- Journal:
- Computer speech & language
- Issue:
- Volume 46(2017)
- Issue Display:
- Volume 46, Issue 2017 (2017)
- Year:
- 2017
- Volume:
- 46
- Issue:
- 2017
- Issue Sort Value:
- 2017-0046-2017-0000
- Page Start:
- 481
- Page End:
- 495
- Publication Date:
- 2017-11
- Subjects:
- Deep learning -- Deep neural networks -- Multi-style training -- Deep denoising autoencoders -- Mixed training -- Representation learning -- Data combination -- Data synthesis -- Noise injection theory -- Feature compensation -- Automatic speech recognition -- Internet of things (IoT)
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
- Journal URLs:
- http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.csl.2017.02.001
- Languages:
- English
- ISSNs:
- 0885-2308
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 4753.xml
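
The two-step procedure described in the abstract (a DAE synthesizes data, which is then mixed into the original training set before training the back-end DNN) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the one-hidden-layer network, the noisy-to-clean training pairs, the hyperparameters, and the function names `train_dae` and `mix_training_set` are all assumptions made for illustration, and the back-end DNN classifier is omitted.

```python
import numpy as np

def train_dae(clean, noisy, hidden=16, epochs=200, lr=0.01, seed=0):
    """Fit a small denoising autoencoder that maps noisy features to clean ones.

    Hypothetical stand-in for the paper's deep DAE: one tanh hidden layer,
    linear output, full-batch gradient descent on mean squared error.
    """
    rng = np.random.default_rng(seed)
    n, d = clean.shape
    W1 = rng.normal(0.0, 0.1, (d, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.1, (hidden, d))
    b2 = np.zeros(d)
    for _ in range(epochs):
        h = np.tanh(noisy @ W1 + b1)        # encoder
        out = h @ W2 + b2                   # decoder (linear)
        err = (out - clean) / n             # gradient of 0.5/n * sum of squared errors
        dh = (err @ W2.T) * (1.0 - h ** 2)  # backpropagate through tanh
        W2 -= lr * (h.T @ err)
        b2 -= lr * err.sum(axis=0)
        W1 -= lr * (noisy.T @ dh)
        b1 -= lr * dh.sum(axis=0)

    def synthesize(x):
        """Pass features through the trained DAE to obtain synthetic features."""
        return np.tanh(x @ W1 + b1) @ W2 + b2

    return synthesize

def mix_training_set(features, labels, synthesize):
    """Append DAE-synthesized copies of the features to the original set.

    Each synthetic example keeps the label of the example it came from
    (an assumption of this sketch).
    """
    synthetic = synthesize(features)
    mixed_x = np.concatenate([features, synthetic], axis=0)
    mixed_y = np.concatenate([labels, labels], axis=0)
    return mixed_x, mixed_y
```

The mixed arrays returned by `mix_training_set` would then be used to train a classifier in place of the original training set; under this sketch the test data is passed to the classifier unchanged, in line with the abstract's statement that no preprocessing of the test sets is required.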