Natural Language Processing of Large-Scale Structured Radiology Reports to Identify Oncologic Patients With or Without Splenomegaly Over a 10-Year Period. (6th January 2022)
- Record Type:
- Journal Article
- Title:
- Natural Language Processing of Large-Scale Structured Radiology Reports to Identify Oncologic Patients With or Without Splenomegaly Over a 10-Year Period. (6th January 2022)
- Main Title:
- Natural Language Processing of Large-Scale Structured Radiology Reports to Identify Oncologic Patients With or Without Splenomegaly Over a 10-Year Period
- Authors:
- Sun, Simon
Lupton, Kaelan
Batch, Karen
Nguyen, Huy
Gazit, Lior
Gangai, Natalie
Cho, Jessica
Nicholas, Kevin
Zulkernine, Farhana
Sevilimedu, Varadan
Simpson, Amber
Do, Richard K. G.
- Abstract:
- PURPOSE: To assess the accuracy of a natural language processing (NLP) model in extracting splenomegaly described in patients with cancer in structured computed tomography radiology reports. METHODS: In this retrospective study between July 2009 and April 2019, 387,359 consecutive structured radiology reports for computed tomography scans of the chest, abdomen, and pelvis from 91,665 patients spanning 30 types of cancer were included. A randomized sample of 2,022 reports from patients with colorectal cancer, hepatobiliary cancer (HB), leukemia, Hodgkin lymphoma (HL), and non-HL patients was manually annotated as positive or negative for splenomegaly. NLP model training/testing was performed on 1,617/405 reports, and a new validation set of 400 reports from all cancer subtypes was used to test NLP model accuracy, precision, and recall. Overall survival was compared between the patient groups (with and without splenomegaly) using Kaplan-Meier curves. RESULTS: The final cohort included 387,359 reports from 91,665 patients (mean age 60.8 years; 51.2% women). In the testing set, the model achieved accuracy of 92.1%, precision of 92.2%, and recall of 92.1% for splenomegaly. In the validation set, accuracy, precision, and recall were 93.8%, 92.9%, and 86.7%, respectively. In the entire cohort, splenomegaly was most frequent in patients with leukemia (32.5%), HB (17.4%), non-HL (9.1%), colorectal cancer (8.5%), and HL (5.6%). A splenomegaly label was associated with an increased risk of mortality in the entire cohort (hazard ratio 2.10; 95% CI, 1.98 to 2.22; P < .001). CONCLUSION: Automated splenomegaly labeling by NLP of radiology report demonstrates good accuracy, precision, and recall. Splenomegaly is most frequently reported in patients with leukemia, followed by patients with HB.
- Is Part Of:
- JCO Clinical Cancer Informatics. Volume 6(2022)
- Journal:
- JCO Clinical Cancer Informatics
- Issue:
- Volume 6(2022)
- Issue Display:
- Volume 6, Issue 2022 (2022)
- Year:
- 2022
- Volume:
- 6
- Issue:
- 2022
- Issue Sort Value:
- 2022-0006-2022-0000
- Page Start:
- Page End:
- Publication Date:
- 2022-01-06
- Subjects:
- 616.994
- Journal URLs:
- http://journals.lww.com/pages/default.aspx
- DOI:
- 10.1200/CCI.21.00104
- Languages:
- English
- ISSNs:
- 2473-4276
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 20839.xml