Recurrent neural network language model adaptation with curriculum learning. (September 2015)
- Record Type:
- Journal Article
- Title:
- Recurrent neural network language model adaptation with curriculum learning. (September 2015)
- Main Title:
- Recurrent neural network language model adaptation with curriculum learning
- Authors:
- Shi, Yangyang
Larson, Martha
Jonker, Catholijn M.
- Abstract:
- Highlights: We propose three different types of curriculum learning for RNNLM adaptation. Sub-domain-adapted models outperform general models under the oracle situation. Two heuristic methods are proposed to combine the sub-domain-adapted models. The curriculum learning methods can be used as implicit interpolations. Existing RNNLMs can be updated using curriculum learning without retraining.
Abstract: This paper addresses the issue of language model adaptation for Recurrent Neural Network Language Models (RNNLMs), which have recently emerged as a state-of-the-art method for language modeling in the area of speech recognition. Curriculum learning is an established machine learning approach that achieves better models by applying a curriculum, i.e., a well-planned ordering of the training data, during the learning process. Our contribution is to demonstrate the importance of curriculum learning methods for adapting RNNLMs and to provide key insights on how they should be applied. RNNLMs model language in a continuous space and can theoretically exploit word-dependency information over arbitrarily long distances. These characteristics give RNNLMs the ability to learn patterns robustly with relatively little training data, implying that they are well suited for adaptation. In this paper, we focus on two related challenges facing language models: within-domain adaptation and limited-data within-domain adaptation. We propose three types of curricula that start with general data, i.e., characterizing the domain as a whole, and move towards specific data, i.e., characterizing the sub-domain targeted for adaptation. Effectively, these curricula result in a model that can be considered to represent an implicit interpolation between general data and sub-domain-specific data. We carry out an extensive set of experiments that investigates how adapting RNNLMs using curriculum learning can improve their performance.
Our first set of experiments addresses the within-domain adaptation challenge, i.e., creating models that are adapted to specific sub-domains that are part of a larger, heterogeneous domain of speech data. Under this challenge, all training data is available to the system at the time when the language model is trained. First, we demonstrate that curriculum learning can be used to create effective sub-domain-adapted RNNLMs. Second, we show that a combination of sub-domain-adapted RNNLMs can be used if the sub-domain of the target data is unknown at test time. Third, we explore the potential of applying combinations of sub-domain-adapted RNNLMs to data for which sub-domain information is unknown at training time and must be inferred.
Our second set of experiments addresses limited-data within-domain adaptation, i.e., adapting an existing model trained on a large set of data using a smaller amount of data from the target sub-domain. Under this challenge, data from the target sub-domain is not available at the time when the language model is trained, but rather becomes available little by little over time. We demonstrate that the implicit interpolation carried out by applying curriculum learning methods to RNNLMs outperforms conventional interpolation and has the potential to make more of less adaptation data.
- Is Part Of:
- Computer speech & language. Volume 33 (2015)
- Journal:
- Computer speech & language
- Issue:
- Volume 33 (2015)
- Issue Display:
- Volume 33, Issue 2015 (2015)
- Year:
- 2015
- Volume:
- 33
- Issue:
- 2015
- Issue Sort Value:
- 2015-0033-2015-0000
- Page Start:
- 136
- Page End:
- 154
- Publication Date:
- 2015-09
- Subjects:
- Recurrent neural networks -- Language models -- Curriculum learning -- Latent Dirichlet allocation -- Topics -- Socio-situational setting
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
- Journal URLs:
- http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.csl.2014.11.004
- Languages:
- English
- ISSNs:
- 0885-2308
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 6237.xml