State gradients for analyzing memory in LSTM language models. (May 2020)
- Record Type:
- Journal Article
- Title:
- State gradients for analyzing memory in LSTM language models. (May 2020)
- Main Title:
- State gradients for analyzing memory in LSTM language models
- Authors:
- Verwimp, Lyan
Van hamme, Hugo
Wambacq, Patrick
- Abstract:
- Highlights: Decreasing state gradient singular values (SVs) correspond to decreasing importance of the input.
The average memory of LSTM LMs decays quickly but not exponentially over time.
Average memory depends on the type of data; selectiveness depends on the size of the dataset.
Infrequent words are remembered better and longer by LSTM LMs.
LSTM LMs remember on average (syntactic differences between) nouns best.
Abstract: Gradients can be used to train neural networks, but they can also be used to interpret them. We investigate how well the inputs of RNNs are remembered by their state by calculating 'state gradients' and applying SVD to the gradient matrix to reveal which directions in embedding space are remembered and to what extent. Our method can be applied to any RNN and reveals which properties in the embedding space influence the state space, without the need to know and label the properties beforehand. In this paper, we propose a normalization method that alleviates the influence of variance in embedding space on the state gradients, and we show the effectiveness of our method on a synthetic dataset. Additionally, the influence of several training settings on the RNN memory is investigated, and a more fine-grained analysis based on POS and word types shows that LSTM language models learn to model linguistic intuitions. Our code and datasets are publicly available.
- Is Part Of:
- Computer speech & language. Volume 61(2020)
- Journal:
- Computer speech & language
- Issue:
- Volume 61(2020)
- Issue Display:
- Volume 61, Issue 2020 (2020)
- Year:
- 2020
- Volume:
- 61
- Issue:
- 2020
- Issue Sort Value:
- 2020-0061-2020-0000
- Page Start:
- Page End:
- Publication Date:
- 2020-05
- Subjects:
- Language modeling -- Neural networks -- Memory -- Gradients -- Interpreting
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
- Journal URLs:
- http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.csl.2019.101034
- Languages:
- English
- ISSNs:
- 0885-2308
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 12564.xml