An end-to-end joint model for evidence information extraction from court record document. Issue 6 (November 2020)
- Record Type:
- Journal Article
- Title:
- An end-to-end joint model for evidence information extraction from court record document. Issue 6 (November 2020)
- Main Title:
- An end-to-end joint model for evidence information extraction from court record document
- Authors:
- Ji, Donghong
Tao, Peng
Fei, Hao
Ren, Yafeng
- Abstract:
- Highlights: We explore a new information extraction task, which aims to extract evidence information from court record documents in the legal domain. We formalize the evidence information extraction task as a combination of paragraph classification and sequence labeling, and propose an end-to-end joint model for the task. Our model achieves the current best performance, outperforming previous methods and neural baseline systems by a large margin. The proposed model can be applied to better analyze and understand legal texts, saving experts and professionals in the legal domain a great deal of manual work.
Abstract: Information extraction is one of the important tasks in the field of Natural Language Processing (NLP). Most existing methods focus on general texts, and little attention is paid to information extraction in specialized domains such as legal texts. This paper explores the task of information extraction in the legal field, which aims to extract evidence information from court record documents (CRDs). In the general domain, entities and relations are mostly words and phrases, so they do not span multiple sentences. In contrast, evidence information in CRDs may span multiple sentences, and existing models cannot handle this situation. To address this issue, we first add a classification task in addition to the extraction task. We then formulate the two tasks as a multi-task learning problem and present a novel end-to-end model to jointly address the two tasks. The joint model adopts a shared encoder followed by separate decoders for the two tasks. The experimental results on the dataset show the effectiveness of the proposed model, which obtains a 72.36% F1 score, outperforming previous methods and strong baselines by a large margin.
- Is Part Of:
- Information processing & management. Volume 57:Issue 6(2020:Nov.)
- Journal:
- Information processing & management
- Issue:
- Volume 57:Issue 6(2020:Nov.)
- Issue Display:
- Volume 57, Issue 6 (2020)
- Year:
- 2020
- Volume:
- 57
- Issue:
- 6
- Issue Sort Value:
- 2020-0057-0006-0000
- Page Start:
- Page End:
- Publication Date:
- 2020-11
- Subjects:
- Natural language processing -- Information extraction -- Court record document -- Neural networks -- Joint model
Information storage and retrieval systems -- Periodicals
Information science -- Periodicals
Systèmes d'information -- Périodiques
Sciences de l'information -- Périodiques
Information science
Information storage and retrieval systems
Periodicals
658.4038
- Journal URLs:
- http://www.sciencedirect.com/science/journal/03064573
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.ipm.2020.102305
- Languages:
- English
- ISSNs:
- 0306-4573
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 4493.893000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 14754.xml
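
Illustrative note on the abstract: the architecture it describes (a shared encoder followed by separate decoders for paragraph classification and sequence labeling, trained as a multi-task problem) can be sketched roughly as below. This is a toy sketch under stated assumptions, not the authors' implementation: the abstract does not specify the encoder, the decoders, the tag set, or the loss weighting, so the embedding-lookup encoder, the two linear heads, the three-tag scheme, the `alpha` weight, and every name in the code (`JointModel`, `joint_loss`, and so on) are hypothetical stand-ins chosen for brevity.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, vec):
    """Apply a dense layer: one output per weight row."""
    return [sum(w * x for w, x in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


class JointModel:
    """Toy joint model: one shared encoder, two task-specific heads.

    Assumption: a real system would use a trained neural encoder; the
    embedding lookup here only stands in for that shared component.
    """

    def __init__(self, vocab, hidden=8, n_para_labels=2, n_tags=3, seed=0):
        rng = random.Random(seed)

        def init(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                    for _ in range(rows)]

        self.vocab = {w: i for i, w in enumerate(vocab)}
        self.emb = init(len(self.vocab) + 1, hidden)  # last row: unknown token
        # Head 1: paragraph classification (hypothetical labels:
        # 0 = no evidence, 1 = contains evidence).
        self.w_cls, self.b_cls = init(n_para_labels, hidden), [0.0] * n_para_labels
        # Head 2: sequence labeling (hypothetical tags: 0 = O, 1 = B, 2 = I).
        self.w_tag, self.b_tag = init(n_tags, hidden), [0.0] * n_tags

    def encode(self, tokens):
        """Shared encoder: one hidden vector per token, used by both heads."""
        unk = len(self.vocab)
        return [[math.tanh(v) for v in self.emb[self.vocab.get(t, unk)]]
                for t in tokens]

    def forward(self, tokens):
        states = self.encode(tokens)
        # The paragraph head reads a mean-pooled summary of the shared states.
        pooled = [sum(col) / len(states) for col in zip(*states)]
        para_probs = softmax(linear(self.w_cls, self.b_cls, pooled))
        # The tagging head reads each shared state separately.
        tag_probs = [softmax(linear(self.w_tag, self.b_tag, h)) for h in states]
        return para_probs, tag_probs

    def joint_loss(self, tokens, para_label, tags, alpha=1.0):
        """Multi-task objective: classification loss plus weighted tagging loss."""
        para_probs, tag_probs = self.forward(tokens)
        cls_loss = -math.log(para_probs[para_label])
        tag_loss = -sum(math.log(p[t]) for p, t in zip(tag_probs, tags)) / len(tags)
        return cls_loss + alpha * tag_loss


# Usage on a made-up paragraph (the tokens and labels are invented for illustration).
tokens = ["the", "witness", "statement", "was", "submitted"]
model = JointModel(vocab=tokens)
para_probs, tag_probs = model.forward(tokens)
loss = model.joint_loss(tokens, para_label=1, tags=[0, 1, 2, 0, 0])
```

Because both heads read the same encoder states, minimizing the summed loss would update the shared parameters with signal from both tasks, which is the general idea behind the multi-task formulation the abstract mentions. The sketch has no training loop and makes no claim about the reported results.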