Construction of biological networks from unstructured information based on a semi-automated curation workflow. (16th June 2015)
- Record Type:
- Journal Article
- Title:
- Construction of biological networks from unstructured information based on a semi-automated curation workflow. (16th June 2015)
- Main Title:
- Construction of biological networks from unstructured information based on a semi-automated curation workflow
- Authors:
- Szostak, Justyna
Ansari, Sam
Madan, Sumit
Fluck, Juliane
Talikka, Marja
Iskandar, Anita
De Leon, Hector
Hofmann-Apitius, Martin
Peitsch, Manuel C.
Hoeng, Julia
- Abstract:
- Capture and representation of scientific knowledge in a structured format are essential to improve the understanding of biological mechanisms involved in complex diseases. Biological knowledge and knowledge about standardized terminologies are difficult to capture from literature in a usable form. A semi-automated knowledge extraction workflow is presented that was developed to allow users to extract causal and correlative relationships from scientific literature and to transcribe them into the computable and human readable Biological Expression Language (BEL). The workflow combines state-of-the-art linguistic tools for recognition of various entities and extraction of knowledge from literature sources. Unlike most other approaches, the workflow outputs the results to a curation interface for manual curation and converts them into BEL documents that can be compiled to form biological networks. We developed a new semi-automated knowledge extraction workflow that was designed to capture and organize scientific knowledge and reduce the required curation skills and effort for this task. The workflow was used to build a network that represents the cellular and molecular mechanisms implicated in atherosclerotic plaque destabilization in an apolipoprotein-E-deficient (ApoE−/−) mouse model. The network was generated using knowledge extracted from the primary literature. The resultant atherosclerotic plaque destabilization network contains 304 nodes and 743 edges supported by 33 PubMed referenced articles. A comparison between the semi-automated and conventional curation processes showed similar results, but significantly reduced curation effort for the semi-automated process. Creating structured knowledge from unstructured text is an important step for the mechanistic interpretation and reusability of knowledge. Our new semi-automated knowledge extraction workflow reduced the curation skills and effort required to capture and organize scientific knowledge. The atherosclerotic plaque destabilization network that was generated is a causal network model for vascular disease demonstrating the usefulness of the workflow for knowledge extraction and construction of mechanistically meaningful biological networks.
- Is Part Of:
- Database. Volume 2015(2015)
- Journal:
- Database
- Issue:
- Volume 2015(2015)
- Issue Display:
- Volume 2015, Issue 2015 (2015)
- Year:
- 2015
- Volume:
- 2015
- Issue:
- 2015
- Issue Sort Value:
- 2015-2015-2015-0000
- Page Start:
- Page End:
- Publication Date:
- 2015-06-16
- Subjects:
- Biology -- Databases -- Periodicals
Bioinformatics -- Periodicals
570.285
- Journal URLs:
- http://database.oxfordjournals.org/
http://ukcatalogue.oup.com/
- DOI:
- 10.1093/database/bav057
- Languages:
- English
- ISSNs:
- 1758-0463
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 12689.xml