Weakly supervised natural language processing for assessing patient-centered outcome following prostate cancer treatment. Issue 1 (4th January 2019)
- Record Type:
- Journal Article
- Title:
- Weakly supervised natural language processing for assessing patient-centered outcome following prostate cancer treatment. Issue 1 (4th January 2019)
- Main Title:
- Weakly supervised natural language processing for assessing patient-centered outcome following prostate cancer treatment
- Authors:
- Banerjee, Imon
Li, Kevin
Seneviratne, Martin
Ferrari, Michelle
Seto, Tina
Brooks, James D
Rubin, Daniel L
Hernandez-Boussard, Tina
- Abstract:
- Background: The population-based assessment of patient-centered outcomes (PCOs) has been limited by the efficient and accurate collection of these data. Natural language processing (NLP) pipelines can determine whether a clinical note within an electronic medical record contains evidence on these data. We present and demonstrate the accuracy of an NLP pipeline that targets to assess the presence, absence, or risk discussion of two important PCOs following prostate cancer treatment: urinary incontinence (UI) and bowel dysfunction (BD). Methods: We propose a weakly supervised NLP approach which annotates electronic medical record clinical notes without requiring manual chart review. A weighted function of neural word embedding was used to create a sentence-level vector representation of relevant expressions extracted from the clinical notes. Sentence vectors were used as input for a multinomial logistic model, with output being either presence, absence or risk discussion of UI/BD. The classifier was trained based on automated sentence annotation depending only on domain-specific dictionaries (weak supervision). Results: The model achieved an average F1 score of 0.86 for the sentence-level, three-tier classification task (presence/absence/risk) in both UI and BD. The model also outperformed a pre-existing rule-based model for note-level annotation of UI with significant margin. Conclusions: We demonstrate a machine learning method to categorize clinical notes based on important PCOs that trains a classifier on sentence vector representations labeled with a domain-specific dictionary, which eliminates the need for manual engineering of linguistic rules or manual chart review for extracting the PCOs. The weakly supervised NLP pipeline showed promising sensitivity and specificity for identifying important PCOs in unstructured clinical text notes compared to rule-based algorithms. Trial registration: This is a chart review study and approved by Institutional Review Board (IRB).
- Is Part Of:
- JAMIA open. Volume 2:Issue 1(2019)
- Journal:
- JAMIA open
- Issue:
- Volume 2:Issue 1(2019)
- Issue Display:
- Volume 2, Issue 1 (2019)
- Year:
- 2019
- Volume:
- 2
- Issue:
- 1
- Issue Sort Value:
- 2019-0002-0001-0000
- Page Start:
- 150
- Page End:
- 159
- Publication Date:
- 2019-01-04
- Subjects:
- natural language processing -- patient-centered outcomes -- prostate cancer -- neural word embedding -- text mining
Medical informatics -- Periodicals
610.285
- Journal URLs:
- http://www.oxfordjournals.org/
https://academic.oup.com/jamiaopen
- DOI:
- 10.1093/jamiaopen/ooy057
- Languages:
- English
- ISSNs:
- 2574-2531
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 12004.xml
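
The Methods portion of the abstract describes two steps that a small example can make concrete: labeling sentences automatically from domain-specific dictionaries (weak supervision), and building a sentence vector as a weighted function of word embeddings. The sketch below is illustrative only and is not the authors' implementation. The term lists, the rule order in `weak_label`, the function names, the toy embeddings, and the per-word weights are all assumptions made for this example; the record does not give the study's dictionaries, embedding model, or weighting scheme.

```python
import re

# Hypothetical mini-dictionaries (assumptions for illustration only;
# the study's actual dictionaries are not given in this record).
TARGET_TERMS = {"incontinence", "leakage"}
NEGATION_CUES = {"no", "denies", "without"}
RISK_CUES = {"risk", "risks", "possible", "discussed"}


def tokenize(sentence):
    """Lowercase the sentence and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", sentence.lower())


def weak_label(sentence):
    """Assign a dictionary-based label, or None if no target term occurs.

    The rule order (risk cue, then negation cue, else presence) is an
    assumption of this sketch, not a rule taken from the paper.
    """
    tokens = set(tokenize(sentence))
    if not tokens & TARGET_TERMS:
        return None
    if tokens & RISK_CUES:
        return "risk"
    if tokens & NEGATION_CUES:
        return "absence"
    return "presence"


def sentence_vector(sentence, embeddings, weights):
    """Weighted average of word vectors; out-of-vocabulary words are skipped.

    `weights` maps a word to its weight (default 1.0). The actual
    weighting function used in the study is not specified here.
    """
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for token in tokenize(sentence):
        if token in embeddings:
            w = weights.get(token, 1.0)
            for i, x in enumerate(embeddings[token]):
                total[i] += w * x
            weight_sum += w
    if weight_sum == 0.0:
        return total  # no known words: return the zero vector
    return [x / weight_sum for x in total]


# Toy two-dimensional embeddings and weights, invented for the example.
toy_embeddings = {"urinary": [1.0, 0.0], "incontinence": [0.0, 1.0]}
toy_weights = {"incontinence": 3.0}

label = weak_label("Patient denies urinary incontinence.")
vector = sentence_vector("urinary incontinence", toy_embeddings, toy_weights)
```

In a pipeline of the kind the abstract outlines, pairs of sentence vectors and weak labels like these would then serve as training data for a multinomial logistic model over the three classes (presence, absence, risk); that classifier step is not shown here.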