University of California, Irvine–Pathology Extraction Pipeline: The pathology extraction pipeline for information extraction from pathology reports. (December 2014)
- Record Type:
- Journal Article
- Title:
- University of California, Irvine–Pathology Extraction Pipeline: The pathology extraction pipeline for information extraction from pathology reports. (December 2014)
- Main Title:
- University of California, Irvine–Pathology Extraction Pipeline: The pathology extraction pipeline for information extraction from pathology reports
- Authors:
- Ashish, Naveen
Dahm, Lisa
Boicey, Charles
- Abstract:
- We describe the Pathology Extraction Pipeline (PEP), a new Open Health Natural Language Processing pipeline that we have developed for information extraction from pathology reports, with the goal of populating a research data warehouse with the extracted data. Specifically, we have built upon the Medical Knowledge Analysis Tool pipeline (MedKATp), an extraction framework focused on pathology reports. Our contributions include additional customization and development on MedKATp to extract data elements and relationships from cancer pathology reports in richer detail than at present, an abstraction layer that makes MedKATp significantly easier to configure for extraction tasks, and a machine-learning-based approach that makes the extraction more resilient to deviations from the common reporting format in a corpus of pathology reports. We present experimental results that demonstrate the effectiveness of our pipeline for information extraction in a real-world task, the performance improvement due to our approach for increasing extractor resilience to format deviation, and the scalability of the pipeline across pathology reports for different cancer types.
- Is Part Of:
- Health informatics journal. Volume 20:Number 4(2014:Dec.)
- Journal:
- Health informatics journal
- Issue:
- Volume 20:Number 4(2014:Dec.)
- Issue Display:
- Volume 20, Issue 4 (2014)
- Year:
- 2014
- Volume:
- 20
- Issue:
- 4
- Issue Sort Value:
- 2014-0020-0004-0000
- Page Start:
- 288
- Page End:
- 305
- Publication Date:
- 2014-12
- Subjects:
- Clinical decision-making -- databases and data mining -- decision-support systems -- evidence-based practice -- information and knowledge management
Medical informatics -- Periodicals
610.285
- Journal URLs:
- http://jhi.sagepub.com/
http://www.uk.sagepub.com/home.nav
- DOI:
- 10.1177/1460458213494032
- Languages:
- English
- ISSNs:
- 1460-4582
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 6132.xml