A small samples training framework for deep Learning-based automatic information extraction: Case study of construction accident news reports analysis. (January 2021)
- Record Type:
- Journal Article
- Title:
- A small samples training framework for deep Learning-based automatic information extraction: Case study of construction accident news reports analysis. (January 2021)
- Main Title:
- A small samples training framework for deep Learning-based automatic information extraction: Case study of construction accident news reports analysis
- Authors:
- Feng, Dan
Chen, Hainan
- Abstract:
- Knowledge management is crucial for construction safety management. Widely collected and well-organized safety-related documents are recognized to be significant in raising the workers' security awareness and then to prevent hazards and accidents. To improve document processing efficiency, automatic information extraction plays an important role. However, currently, automatic information extraction modeling requires large scale training datasets. It is a big challenge for the engineering industry, especially for the fields which heavily rely on the experts' knowledge. Limited data sources, and high time and labor costs make it not practical to establish a large-scale dataset. This work proposed a natural language data augmentation-based small samples training framework for automatic information extraction modeling. With the designed cross combination-based text data augmentation algorithm, the deep neural network can be employed to build up automatic information extraction models without large-scale raw data and manual annotations. Characters semantic coding is employed to avoid word segmentation and make sure that the framework can be utilized in different writing language systems. The BiLSTM-CRF model is adopted as the detection core to conduct character classification. Through a case study of two independent accident news report datasets analysis, the proposed framework has been validated. A reliable and robust automatic information extraction model can be established, even though with small samples training.
- Is Part Of:
- Advanced engineering informatics. Volume 47(2021)
- Journal:
- Advanced engineering informatics
- Issue:
- Volume 47(2021)
- Issue Display:
- Volume 47, Issue 2021 (2021)
- Year:
- 2021
- Volume:
- 47
- Issue:
- 2021
- Issue Sort Value:
- 2021-0047-2021-0000
- Page Start:
- Page End:
- Publication Date:
- 2021-01
- Subjects:
- Automatic information extraction -- Small sample training -- Cross combination-based text augmentation -- Construction accident news reports
Computer-aided engineering -- Periodicals
Engineering -- Data processing -- Periodicals
620.00285
- Journal URLs:
- http://www.sciencedirect.com/science/journal/14740346
http://books.google.com/books?id=KhFVAAAAMAAJ
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.aei.2021.101256
- Languages:
- English
- ISSNs:
- 1474-0346
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 0696.851100
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 15850.xml