Clinical documentation variations and NLP system portability: a case study in asthma birth cohorts across institutions. (30th November 2017)
- Record Type:
- Journal Article
- Title:
- Clinical documentation variations and NLP system portability: a case study in asthma birth cohorts across institutions. (30th November 2017)
- Main Title:
- Clinical documentation variations and NLP system portability: a case study in asthma birth cohorts across institutions
- Authors:
- Sohn, Sunghwan
Wang, Yanshan
Wi, Chung-Il
Krusemark, Elizabeth A
Ryu, Euijung
Ali, Mir H
Juhn, Young J
Liu, Hongfang - Abstract:
- Abstract: Objective: To assess clinical documentation variations across health care institutions using different electronic medical record systems and investigate how they affect natural language processing (NLP) system portability. Materials and Methods: Birth cohorts from Mayo Clinic and Sanford Children's Hospital (SCH) were used in this study ( n = 298 for each). Documentation variations regarding asthma between the 2 cohorts were examined in various aspects: (1) overall corpus at the word level (ie, lexical variation), (2) topics and asthma-related concepts (ie, semantic variation), and (3) clinical note types (ie, process variation). We compared those statistics and explored NLP system portability for asthma ascertainment in 2 stages: prototype and refinement. Results: There exist notable lexical variations (word-level similarity = 0.669) and process variations (differences in major note types containing asthma-related concepts). However, semantic-level corpora were relatively homogeneous (topic similarity = 0.944, asthma-related concept similarity = 0.971). The NLP system for asthma ascertainment had an F -score of 0.937 at Mayo, and produced 0.813 (prototype) and 0.908 (refinement) when applied at SCH. Discussion: The criteria for asthma ascertainment are largely dependent on asthma-related concepts. Therefore, we believe that semantic similarity is important to estimate NLP system portability. As the Mayo Clinic and SCH corpora were relatively homogeneous at aAbstract: Objective: To assess clinical documentation variations across health care institutions using different electronic medical record systems and investigate how they affect natural language processing (NLP) system portability. Materials and Methods: Birth cohorts from Mayo Clinic and Sanford Children's Hospital (SCH) were used in this study ( n = 298 for each). Documentation variations regarding asthma between the 2 cohorts were examined in various aspects: (1) overall corpus at the word level (ie, lexical variation), (2) topics and asthma-related concepts (ie, semantic variation), and (3) clinical note types (ie, process variation). We compared those statistics and explored NLP system portability for asthma ascertainment in 2 stages: prototype and refinement. Results: There exist notable lexical variations (word-level similarity = 0.669) and process variations (differences in major note types containing asthma-related concepts). However, semantic-level corpora were relatively homogeneous (topic similarity = 0.944, asthma-related concept similarity = 0.971). The NLP system for asthma ascertainment had an F -score of 0.937 at Mayo, and produced 0.813 (prototype) and 0.908 (refinement) when applied at SCH. Discussion: The criteria for asthma ascertainment are largely dependent on asthma-related concepts. Therefore, we believe that semantic similarity is important to estimate NLP system portability. As the Mayo Clinic and SCH corpora were relatively homogeneous at a semantic level, the NLP system, developed at Mayo Clinic, was imported to SCH successfully with proper adjustments to deal with the intrinsic corpus heterogeneity. … (more)
- Is Part Of:
- Journal of the American Medical Informatics Association. Volume 25:Number 3(2018)
- Journal:
- Journal of the American Medical Informatics Association
- Issue:
- Volume 25:Number 3(2018)
- Issue Display:
- Volume 25, Issue 3 (2018)
- Year:
- 2018
- Volume:
- 25
- Issue:
- 3
- Issue Sort Value:
- 2018-0025-0003-0000
- Page Start:
- 353
- Page End:
- 359
- Publication Date:
- 2017-11-30
- Subjects:
- documentation variation -- natural language processing -- portability -- asthma -- electronic medical records
Medical informatics -- Periodicals
Information Services -- Periodicals
Medical Informatics -- Periodicals
Médecine -- Informatique -- Périodiques
Informatica
Geneeskunde
Informatique médicale
Computer network resources
Electronic journals
610.285 - Journal URLs:
- http://jamia.bmj.com/ ↗
http://www.jamia.org ↗
http://www.pubmedcentral.nih.gov/tocrender.fcgi?journal=76 ↗
http://www.sciencedirect.com/science/journal/10675027 ↗
http://jamia.oxfordjournals.org/ ↗
http://www.oxfordjournals.org/en/ ↗ - DOI:
- 10.1093/jamia/ocx138 ↗
- Languages:
- English
- ISSNs:
- 1067-5027
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms) ↗
- Physical Locations:
- British Library DSC - 4689.025000
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store - Ingest File:
- 15148.xml