Transformers-based information extraction with limited data for domain-specific business documents. (January 2021)
- Record Type:
- Journal Article
- Title:
- Transformers-based information extraction with limited data for domain-specific business documents. (January 2021)
- Main Title:
- Transformers-based information extraction with limited data for domain-specific business documents
- Authors:
- Nguyen, Minh-Tien
Le, Dung Tien
Le, Linh
- Abstract:
- Information extraction plays an important role in data transformation for business cases. However, building extraction systems for actual cases faces two challenges: (i) the availability of labeled data is usually limited and (ii) highly detailed classification is required. This paper introduces a model for addressing these two challenges. Unlike prior studies, which usually require a large number of training samples, our extraction model is trained with a small amount of data to extract a large number of information types. To do that, the model exploits the contextual word representations of pre-trained language models, which are trained on a huge amount of general-domain data. To adapt to our downstream task, the model employs transfer learning by stacking Convolutional Neural Networks to learn hidden representations for classification. To confirm the efficiency of our method, we apply the model to two actual cases of document processing for bidding and sale documents of two Japanese companies. Experimental results on real testing sets show that, with a small amount of training data, our model achieves high accuracy that is accepted by our clients.
- Is Part Of:
- Engineering applications of artificial intelligence. Volume 97(2021)
- Journal:
- Engineering applications of artificial intelligence
- Issue:
- Volume 97(2021)
- Issue Display:
- Volume 97, Issue 2021 (2021)
- Year:
- 2021
- Volume:
- 97
- Issue:
- 2021
- Issue Sort Value:
- 2021-0097-2021-0000
- Page Start:
- Page End:
- Publication Date:
- 2021-01
- Subjects:
- Information extraction -- Transfer learning -- Transformers
Engineering -- Data processing -- Periodicals
Artificial intelligence -- Periodicals
Expert systems (Computer science) -- Periodicals
Ingénierie -- Informatique -- Périodiques
Intelligence artificielle -- Périodiques
Systèmes experts (Informatique) -- Périodiques
Artificial intelligence
Engineering -- Data processing
Expert systems (Computer science)
Periodicals
620.00285
- Journal URLs:
- http://www.sciencedirect.com/science/journal/09521976
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.engappai.2020.104100
- Languages:
- English
- ISSNs:
- 0952-1976
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3755.704500
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 14985.xml