Active learning reduces annotation time for clinical concept extraction. (October 2017)
- Record Type:
- Journal Article
- Title:
- Active learning reduces annotation time for clinical concept extraction. (October 2017)
- Main Title:
- Active learning reduces annotation time for clinical concept extraction
- Authors:
- Kholghi, Mahnoosh
Sitbon, Laurianne
Zuccon, Guido
Nguyen, Anthony
- Abstract:
- Highlights: Active learning significantly reduces the manual annotation time of clinical concepts. The number of target concepts that requires manual annotation provides a good estimation of the annotation time. AL-assisted pre-annotations further reduce annotation time compared to de novo annotation. A smart seed set leads to high quality pre-annotations compared to a random seed set.
Abstract: Objective: To investigate: (1) the annotation time savings by various active learning query strategies compared to supervised learning and a random sampling baseline, and (2) the benefits of active learning-assisted pre-annotations in accelerating the manual annotation process compared to de novo annotation. Materials and methods: There are 73 and 120 discharge summary reports provided by Beth Israel institute in the train and test sets of the concept extraction task in the i2b2/VA 2010 challenge, respectively. The 73 reports were used in user study experiments for manual annotation. First, all sequences within the 73 reports were manually annotated from scratch. Next, active learning models were built to generate pre-annotations for the sequences selected by a query strategy. The annotation/reviewing time per sequence was recorded. The 120 test reports were used to measure the effectiveness of the active learning models. Results: When annotating from scratch, active learning reduced the annotation time up to 35% and 28% compared to a fully supervised approach and a random sampling baseline, respectively. Reviewing active learning-assisted pre-annotations resulted in 20% further reduction of the annotation time when compared to de novo annotation. Discussion: The number of concepts that require manual annotation is a good indicator of the annotation time for various active learning approaches as demonstrated by high correlation between time rate and concept annotation rate. Conclusion: Active learning has a key role in reducing the time required to manually annotate domain concepts from clinical free text, either when annotating from scratch or reviewing active learning-assisted pre-annotations.
- Is Part Of:
- International journal of medical informatics. Volume 106(2017)
- Journal:
- International journal of medical informatics
- Issue:
- Volume 106(2017)
- Issue Display:
- Volume 106, Issue 2017 (2017)
- Year:
- 2017
- Volume:
- 106
- Issue:
- 2017
- Issue Sort Value:
- 2017-0106-2017-0000
- Page Start:
- 25
- Page End:
- 31
- Publication Date:
- 2017-10
- Subjects:
- Active learning -- Annotation time -- Machine-assisted pre-annotation -- Concept extraction -- Clinical free text
Medical informatics -- Periodicals
Information science -- Periodicals
Computers -- Periodicals
Medical technology -- Periodicals
Medical Informatics -- Periodicals
Technology, Medical -- Periodicals
Computers
Information science
Medical informatics
Medical technology
Electronic journals
Periodicals
610.285
- Journal URLs:
- http://www.sciencedirect.com/science/journal/13865056
http://www.clinicalkey.com/dura/browse/journalIssue/13865056
http://www.clinicalkey.com.au/dura/browse/journalIssue/13865056
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.ijmedinf.2017.08.001
- Languages:
- English
- ISSNs:
- 1386-5056
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 4542.345250
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 4651.xml