A209 VALIDATION OF A NATURAL LANGUAGE PROCESSING ALGORITHM TO EXTRACT DATA FOR SYSTEM-LEVEL ADENOMA DETECTION RATE CALCULATION. (15th March 2019)
- Record Type:
- Journal Article
- Title:
- A209 VALIDATION OF A NATURAL LANGUAGE PROCESSING ALGORITHM TO EXTRACT DATA FOR SYSTEM-LEVEL ADENOMA DETECTION RATE CALCULATION. (15th March 2019)
- Main Title:
- A209 VALIDATION OF A NATURAL LANGUAGE PROCESSING ALGORITHM TO EXTRACT DATA FOR SYSTEM-LEVEL ADENOMA DETECTION RATE CALCULATION
- Authors:
- Morgan, D
Chorneyko, K
Swain, D
Bowes, B
Lee, V
Tinmouth, J
- Abstract:
- Background: Patients of endoscopists with lower adenoma detection rates (ADR) are more likely to die from missed colorectal cancers. Measuring ADR is challenging at the system level because pathology results are generally reported in unstructured electronic medical records. Natural language processing (NLP) can be used to extract relevant information from text-based records.
Aims: At Cancer Care Ontario (CCO), we developed and validated an NLP algorithm to identify colorectal adenomas in unstructured electronic pathology reports available in CCO's Electronic Mapping Reporting and Coding (eMaRC) data.
Methods: We identified pathology reports from colonoscopies in eMaRC as those with specimen type 'biopsy' and anatomic site 'colon'. The sampling period was restricted to 2015–16 and to patients older than 50 years. From this sampling frame, two random samples of 450 and 1,000 reports were selected as the test and validation sets, respectively. Expert clinicians reviewed each report and classified it as adenoma or other. The test set was used to develop an NLP algorithm to identify adenomas using Base SAS 9.4. The sensitivity (recall), specificity, precision (positive predictive value) and F1 score of the NLP algorithm were determined against clinician review.
Results: A significant proportion of Ontario colonoscopists, pathologists and laboratories were represented in the examined validation sets. The sensitivity of the NLP algorithm was approximately 100% (95% CI: 98.51–100) and 99.81% (95% CI: 98.97–100) in the test and validation sets, respectively. Similarly, the specificity was 99.08% (95% CI: 94.99–99.98) and 100% (95% CI: 99.21–100).
Conclusions: The CCO NLP algorithm was highly accurate in identifying colorectal adenomas in eMaRC data across many institutions in Ontario. This lays the groundwork to measure ADR at the system level in Ontario.
Funding Agencies: Cancer Care Ontario …
- Is Part Of:
- Journal of the Canadian Association of Gastroenterology. Volume 2(2019)Supplement 2
- Journal:
- Journal of the Canadian Association of Gastroenterology
- Issue:
- Volume 2(2019)Supplement 2
- Issue Display:
- Volume 2, Issue 2 (2019)
- Year:
- 2019
- Volume:
- 2
- Issue:
- 2
- Issue Sort Value:
- 2019-0002-0002-0000
- Page Start:
- 409
- Page End:
- 409
- Publication Date:
- 2019-03-15
- Subjects:
- Gastroenterology -- Periodicals
616.33005
- Journal URLs:
- https://academic.oup.com/jcag
http://www.oxfordjournals.org/
- DOI:
- 10.1093/jcag/gwz006.208
- Languages:
- English
- ISSNs:
- 2515-2084
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 12043.xml
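
Note on the reported metrics: the abstract compares the NLP algorithm with clinician review using sensitivity (recall), specificity, precision (positive predictive value) and F1 score. The following is a minimal Python sketch of how these four quantities follow from a 2x2 confusion matrix. It is an illustration only: the original analysis was done in Base SAS 9.4, the names ConfusionCounts and classification_metrics are hypothetical, and the example counts are invented rather than taken from the study. Confidence intervals are not computed because the abstract does not state which interval method was used.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of algorithm output versus clinician review (the reference)."""
    tp: int  # algorithm: adenoma, clinician: adenoma
    fp: int  # algorithm: adenoma, clinician: other
    tn: int  # algorithm: other,   clinician: other
    fn: int  # algorithm: other,   clinician: adenoma


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def classification_metrics(c: ConfusionCounts) -> dict:
    """Compute sensitivity, specificity, precision and F1 from the counts."""
    sensitivity = _ratio(c.tp, c.tp + c.fn)   # recall
    specificity = _ratio(c.tn, c.tn + c.fp)
    precision = _ratio(c.tp, c.tp + c.fp)     # positive predictive value
    # F1 is the harmonic mean of precision and recall.
    f1 = _ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts for illustration only (not the study's data).
example = ConfusionCounts(tp=90, fp=5, tn=95, fn=10)
for name, value in classification_metrics(example).items():
    print(f"{name}: {value:.4f}")
```

Returning NaN for an empty denominator is a design choice of this sketch, so that a sample with no reference adenomas yields an undefined sensitivity instead of an exception.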