Bridge inspection named entity recognition via BERT and lexicon augmented machine reading comprehension neural model. (October 2021)
- Record Type:
- Journal Article
- Title:
- Bridge inspection named entity recognition via BERT and lexicon augmented machine reading comprehension neural model. (October 2021)
- Main Title:
- Bridge inspection named entity recognition via BERT and lexicon augmented machine reading comprehension neural model
- Authors:
- Li, Ren
Mo, Tianjin
Yang, Jianxi
Li, Dong
Jiang, Shixin
Wang, Di
- Abstract:
- As an important data source in the field of bridge management, bridge inspection reports contain large-scale fine-grained data, including information on bridge members and structural defects. However, due to insufficient research on automatic information extraction in this field, valuable bridge inspection information has not been fully utilized. Particularly, for Chinese bridge inspection entities, which involve domain-specific vocabularies and have obvious nesting characteristics, most of the existing named entity recognition (NER) solutions are not suitable. To address this problem, this paper proposes a novel lexicon augmented machine reading comprehension-based NER neural model for identifying flat and nested entities from Chinese bridge inspection text. The proposed model uses the bridge inspection text and predefined question queries as input to enhance the ability of contextual feature representation and to integrate prior knowledge. Based on the character-level features encoded by the pre-trained BERT model, bigram embeddings and weighted lexicon features are further combined into a context representation. Then, the bidirectional long short-term memory neural network is used to extract sequence features before predicting the spans of named entities. The proposed model is verified by the Chinese bridge inspection named entity corpus. The experimental results show that the proposed model outperforms other mainstream NER models on the bridge inspection corpus. The proposed model not only provides a basis for automatic bridge inspection information extraction but also supports the downstream tasks such as knowledge graph construction and question answering systems.
- Is Part Of:
- Advanced engineering informatics. Volume 50(2021)
- Journal:
- Advanced engineering informatics
- Issue:
- Volume 50(2021)
- Issue Display:
- Volume 50, Issue 2021 (2021)
- Year:
- 2021
- Volume:
- 50
- Issue:
- 2021
- Issue Sort Value:
- 2021-0050-2021-0000
- Page Start:
- Page End:
- Publication Date:
- 2021-10
- Subjects:
- Bridge inspection -- Named entity recognition -- Machine reading comprehension -- BERT
Computer-aided engineering -- Periodicals
Engineering -- Data processing -- Periodicals
620.00285
- Journal URLs:
- http://www.sciencedirect.com/science/journal/14740346
http://books.google.com/books?id=KhFVAAAAMAAJ
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.aei.2021.101416
- Languages:
- English
- ISSNs:
- 1474-0346
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 0696.851100
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 19711.xml