SILT: Efficient transformer training for inter-lingual inference. (15th August 2022)
- Record Type:
- Journal Article
- Title:
- SILT: Efficient transformer training for inter-lingual inference. (15th August 2022)
- Main Title:
- SILT: Efficient transformer training for inter-lingual inference
- Authors:
- Huertas-Tato, Javier
Martín, Alejandro
Camacho, David
- Abstract:
- The ability of transformers to perform precision tasks such as question answering, Natural Language Inference (NLI) or summarization has enabled them to rank among the best paradigms for addressing Natural Language Processing (NLP) tasks. NLI is one of the best scenarios in which to test these architectures, due to the knowledge required to understand complex sentences and to establish relationships between a hypothesis and a premise. Nevertheless, these models suffer from an incapacity to generalize to other domains and from difficulties in facing multilingual and inter-lingual scenarios. The leading pathway in the literature to address these issues involves designing and training extremely large architectures, but this causes unpredictable behaviors and establishes barriers that impede broad access and fine-tuning. In this paper, we propose a new architecture called Siamese Inter-Lingual Transformer (SILT), which efficiently aligns multilingual embeddings for Natural Language Inference and allows unmatched language pairs to be processed. SILT leverages siamese pre-trained multilingual transformers with frozen weights, in which the two input sentences attend to each other before being combined through a matrix alignment method. The experimental results in this paper show that SILT drastically reduces the number of trainable parameters while enabling inter-lingual NLI and achieving state-of-the-art performance on common benchmarks.
- Highlights:
- Efficient alignment of multilingual embeddings for Natural Language Inference. Siamese pre-trained multilingual transformers with frozen weights and mutual attention. A curated Spanish version of the SICK dataset, called SICK-es, is provided. Drastic reduction of trainable parameters and the ability to perform inter-lingual tasks.
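The mutual-attention step the abstract describes (the two input sentences attend to each other before being combined through a matrix alignment) can be sketched roughly as follows. This is an illustrative NumPy sketch, not the authors' SILT implementation: the token counts, embedding size, and plain dot-product attention are all assumptions made here for demonstration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def mutual_attention(P, H):
    """Each sentence attends to the other via a shared alignment matrix.

    P: premise token embeddings, shape (m, d)
    H: hypothesis token embeddings, shape (n, d)
    (In SILT these would come from frozen multilingual encoders;
    here they are random placeholders.)
    """
    S = P @ H.T                           # (m, n) alignment scores
    P_aligned = softmax(S, axis=1) @ H    # premise attends to hypothesis, (m, d)
    H_aligned = softmax(S.T, axis=1) @ P  # hypothesis attends to premise, (n, d)
    return P_aligned, H_aligned

rng = np.random.default_rng(0)
P = rng.standard_normal((7, 16))  # 7 premise tokens, 16-dim embeddings (illustrative)
H = rng.standard_normal((5, 16))  # 5 hypothesis tokens
Pa, Ha = mutual_attention(P, H)
print(Pa.shape, Ha.shape)  # (7, 16) (5, 16)
```

Because the encoders' weights stay frozen, only the small alignment/attention head on top would be trained, which is how the abstract's drastic reduction in trainable parameters is obtained.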
- Is Part Of:
- Expert systems with applications. Volume 200(2022)
- Journal:
- Expert systems with applications
- Issue:
- Volume 200(2022)
- Issue Display:
- Volume 200, Issue 2022 (2022)
- Year:
- 2022
- Volume:
- 200
- Issue:
- 2022
- Issue Sort Value:
- 2022-0200-2022-0000
- Page Start:
- Page End:
- Publication Date:
- 2022-08-15
- Subjects:
- Natural language inference -- Embeddings -- Sentence alignment -- Transformers -- Deep learning
Expert systems (Computer science) -- Periodicals
Systèmes experts (Informatique) -- Périodiques
Electronic journals
006.33
- Journal URLs:
- http://www.sciencedirect.com/science/journal/09574174
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.eswa.2022.116923
- Languages:
- English
- ISSNs:
- 0957-4174
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3842.004220
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 21406.xml