Extracting causal relations from the literature with word vector mapping. (December 2019)
- Record Type:
- Journal Article
- Title:
- Extracting causal relations from the literature with word vector mapping. (December 2019)
- Main Title:
- Extracting causal relations from the literature with word vector mapping
- Authors:
- An, Ning
Xiao, Yongbo
Yuan, Jing
Yang, Jiaoyun
Alterovitz, Gil
- Abstract:
- Causal graphs play an essential role in the determination of causalities and have been applied in many domains including biology and medicine. Traditional causal graph construction methods are usually data-driven and may not deliver the desired accuracy of a graph. Considering the vast number of publications with causality knowledge, extracting causal relations from the literature to help establish causal graphs becomes possible. Current supervised-learning-based causality extraction methods require sufficient labeled data to train a model, and rule-based causality extraction methods are limited by the predefined patterns. This paper proposes a causality extraction framework by integrating rule-based methods and unsupervised learning models to overcome these limitations. The proposed method consists of three modules: data preprocessing, syntactic pattern matching, and causality determination. In data preprocessing, abstracts are crawled based on attribute names before sentences are extracted and simplified. In syntactic pattern matching, these simplified sentences are parsed to obtain the part-of-speech tags, and triples are obtained based on these tags by matching the two designed syntactic patterns. In causality determination, four verb seed sets are initialized, and word vectors are constructed for the verbs in both the seed sets and the triples by applying an unsupervised machine learning model. Causal relations are identified by comparing the similarity between the verbs in each triple and those in each seed set to overcome the limitation of the seed sets. Causality extraction results on the attributes from the risk factors for Alzheimer's disease show that our method outperforms Bui's method and Alashri's method in terms of precision, recall, specificity, accuracy and F-score, with increases in the F-score of 8.29% and 5.37%, respectively.
Highlights: A rule-based causal extraction method is designed by using word vector mapping. Four types of verbs are employed as seed sets to form causal extraction rules. Two datasets consisting of 3000 sentences are constructed by manual labeling. Extensive experiments are conducted to validate the causal extraction performance. Causal extraction from literature is used to help construct causal graphs.
- Is Part Of:
- Computers in biology and medicine. Volume 115(2019)
- Journal:
- Computers in biology and medicine
- Issue:
- Volume 115(2019)
- Issue Display:
- Volume 115, Issue 2019 (2019)
- Year:
- 2019
- Volume:
- 115
- Issue:
- 2019
- Issue Sort Value:
- 2019-0115-2019-0000
- Page Start:
- Page End:
- Publication Date:
- 2019-12
- Subjects:
- Causality -- Literature analysis -- Word vector -- Causal graph -- Causal extraction
Medicine -- Data processing -- Periodicals
Biology -- Data processing -- Periodicals
610.285
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00104825/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.compbiomed.2019.103524
- Languages:
- English
- ISSNs:
- 0010-4825
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.880000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 12531.xml
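
Illustrative note on the abstract: the causality determination step compares the word vector of the verb in each extracted triple with the vectors of verbs in the seed sets. The following is only a minimal sketch of that idea, not the authors' implementation. The seed-set names, the toy three-dimensional vectors, the 0.8 threshold, and the function names are all hypothetical; this record does not list the paper's four seed sets or its embedding model, and real vectors would come from a trained unsupervised model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Toy vectors (assumption): stand-ins for embeddings learned from a corpus.
VECTORS = {
    "cause": [0.9, 0.1, 0.0],
    "induce": [0.8, 0.2, 0.1],
    "prevent": [0.1, 0.9, 0.0],
    "inhibit": [0.2, 0.8, 0.1],
    "trigger": [0.85, 0.15, 0.05],
    "describe": [0.0, 0.1, 0.9],
}

# Hypothetical seed sets: only two are shown, with invented names.
SEED_SETS = {
    "positive_causal": ["cause", "induce"],
    "negative_causal": ["prevent", "inhibit"],
}


def best_seed_set(verb, seed_sets=SEED_SETS, vectors=VECTORS):
    """Return (seed set name, similarity) for the closest seed set."""
    best_name, best_score = None, -1.0
    for name, seeds in seed_sets.items():
        # A seed set scores as well as its most similar seed verb.
        score = max(cosine(vectors[verb], vectors[s]) for s in seeds)
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score


def classify_triple(triple, threshold=0.8):
    """Label a (subject, verb, object) triple, or return None if not causal."""
    _, verb, _ = triple
    if verb not in VECTORS:
        return None  # no vector available for this verb
    name, score = best_seed_set(verb)
    return name if score >= threshold else None
```

In this sketch a verb such as "trigger", which appears in no seed set, can still be matched through its similarity to "cause", which is the sense in which vector comparison extends a fixed seed list.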