A deep learning relation extraction approach to support a biomedical semi-automatic curation task: The case of the gluten bibliome. (1st June 2022)
- Record Type:
- Journal Article
- Title:
- A deep learning relation extraction approach to support a biomedical semi-automatic curation task: The case of the gluten bibliome. (1st June 2022)
- Main Title:
- A deep learning relation extraction approach to support a biomedical semi-automatic curation task: The case of the gluten bibliome
- Authors:
- Pérez-Pérez, Martín
Ferreira, Tânia
Igrejas, Gilberto
Fdez-Riverola, Florentino
- Abstract:
- Highlights: A semi-automatic workflow to curate health-related interactions in the bibliome. Different text mining and ontology-based methods were combined and applied. A deep learning model to assist manual curators by learning from their decisions. Different state-of-the-art machine learning models were compared. Results showed that the presented workflow is a valuable approach to assist curators.
Abstract: Discovering relevant biomedical interactions in the literature is crucial for enhancing biology research. This curation process has an essential role in studying the different reported processes and interactions that affect biological processes (e.g., genome, metabolome, and transcriptome). In this sense, the objective of this work is twofold: to reduce the manual effort required to curate and review the existing biochemical interactions reported in the gluten-related bibliome, and to propose a novel vector space integrated into a deep learning model that assists manual curators in a real curation task by learning from their previous decisions. With this objective, the present work proposes a novel vector space that combines (i) high-level lexical and syntactic inference features such as WordNets and health-related domain ontologies, (ii) unsupervised semantic resources such as word embeddings, (iii) semantic and syntactic sentence knowledge, (iv) abbreviation resolution support, (v) several state-of-the-art named-entity recognition methods, and, finally, (vi) different feature construction and optimization techniques to support a semi-automatic curation workflow. The application of the proposed workflow to a classified set of 2,451 relevant gluten-related documents produced a total of 8,349 relevant and 471,813 irrelevant relations distributed across thirteen health-related domain categories. Experimental results showed that the proposed workflow is a valuable approach for a semi-automatic relation extraction task. It was able to obtain satisfactory results in the early stages of a real-world curation task and saved manual annotation effort by learning from the decisions made by manual curators in iterative annotation rounds. The average F-score for the proposed relation categories was 0.731, with the lowest F-score at 0.47 and the highest at 0.929.
The different resources used in this work, as well as the manually curated corpus, are publicly available on our GitHub repository.
- Is Part Of:
- Expert systems with applications. Volume 195(2022)
- Journal:
- Expert systems with applications
- Issue:
- Volume 195(2022)
- Issue Display:
- Volume 195, Issue 2022 (2022)
- Year:
- 2022
- Volume:
- 195
- Issue:
- 2022
- Issue Sort Value:
- 2022-0195-2022-0000
- Page Start:
- Page End:
- Publication Date:
- 2022-06-01
- Subjects:
- Text mining -- Relation extraction -- Deep learning -- Ontology-based methods -- Literature curation -- Gluten
Expert systems (Computer science) -- Periodicals
Systèmes experts (Informatique) -- Périodiques
Electronic journals
006.33
- Journal URLs:
- http://www.sciencedirect.com/science/journal/09574174
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.eswa.2022.116616
- Languages:
- English
- ISSNs:
- 0957-4174
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3842.004220
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 21000.xml