Natural language generation from Universal Dependencies using data augmentation and pre-trained language models. (13th January 2023)
- Record Type:
- Journal Article
- Title:
- Natural language generation from Universal Dependencies using data augmentation and pre-trained language models. (13th January 2023)
- Main Title:
- Natural language generation from Universal Dependencies using data augmentation and pre-trained language models
- Authors:
- Nguyen, Dang Tuan
Tran, Trung
- Abstract:
- In recent years, natural language generation (NLG) has focused on data-to-text tasks with a variety of structured inputs. The generated text should contain the given information, be grammatically correct, and meet other criteria. In this research, we propose an approach that combines strong pre-trained language models with input data augmentation. The data studied in this work are Universal Dependencies (UD), a framework developed for consistent annotation of grammar (parts of speech, morphological features and syntactic dependencies) for cross-lingual learning. We study English UD structures, which are modified into two groups. In the first group, the modification phase removes the order information of each word and lemmatises the tokens. In the second group, the modification phase removes the function words and surface-oriented morphological details. With both groups of modified structures, we apply the same approach to explore how the pre-trained sequence-to-sequence models text-to-text transfer transformer (T5) and BART perform on the training data. We augment the training data by creating several permutations of each input structure. The results show that our approach can generate good-quality English text and point to the study of strategies for representing UD inputs as a promising direction.
- Is Part Of:
- International journal of intelligent information and database systems. Volume 16:Number 1(2023)
- Journal:
- International journal of intelligent information and database systems
- Issue:
- Volume 16:Number 1(2023)
- Issue Display:
- Volume 16, Issue 1 (2023)
- Year:
- 2023
- Volume:
- 16
- Issue:
- 1
- Issue Sort Value:
- 2023-0016-0001-0000
- Page Start:
- 89
- Page End:
- 105
- Publication Date:
- 2023-01-13
- Subjects:
- data-to-text generation -- Universal Dependencies -- pre-trained language models -- data augmentation -- deep learning -- sequence-to-sequence models -- fine-tune
Database management -- Computer programs -- Periodicals
Information retrieval -- Computer programs -- Periodicals
Information storage and retrieval systems -- Computer programs -- Periodicals
Artificial intelligence -- Periodicals
Expert systems (Computer science) -- Periodicals
Intelligent agents (Computer software) -- Periodicals
006.33
- Journal URLs:
- http://www.inderscience.com/jhome.php?jcode=ijiids
http://www.inderscience.com/
- Languages:
- English
- ISSNs:
- 1751-5858
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 24526.xml