Development and assessment of a natural language processing model to identify residential instability in electronic health records' unstructured data: a comparison of 3 integrated healthcare delivery systems. Issue 1 (16th February 2022)
- Record Type:
- Journal Article
- Title:
- Development and assessment of a natural language processing model to identify residential instability in electronic health records' unstructured data: a comparison of 3 integrated healthcare delivery systems. Issue 1 (16th February 2022)
- Main Title:
- Development and assessment of a natural language processing model to identify residential instability in electronic health records' unstructured data: a comparison of 3 integrated healthcare delivery systems
- Authors:
- Hatef, Elham
Rouhizadeh, Masoud
Nau, Claudia
Xie, Fagen
Rouillard, Christopher
Abu-Nasser, Mahmoud
Padilla, Ariadna
Lyons, Lindsay Joe
Kharrazi, Hadi
Weiner, Jonathan P
Roblin, Douglas
- Abstract:
- Objective: To evaluate whether a natural language processing (NLP) algorithm could be adapted to extract, with acceptable validity, markers of residential instability (ie, homelessness and housing insecurity) from electronic health records (EHRs) of 3 healthcare systems.
Materials and methods: We included patients 18 years and older who received care at 1 of 3 healthcare systems from 2016 through 2020 and had at least 1 free-text note in the EHR during this period. We conducted the study independently; the NLP algorithm logic and method of validity assessment were identical across sites. The approach to the development of the gold standard for assessment of validity differed across sites. Using the EntityRuler module of spaCy 2.3 Python toolkit, we created a rule-based NLP system made up of expert-developed patterns indicating residential instability at the lead site and enriched the NLP system using insight gained from its application at the other 2 sites. We adapted the algorithm at each site then validated the algorithm using a split-sample approach. We assessed the performance of the algorithm by measures of positive predictive value (precision), sensitivity (recall), and specificity.
Results: The NLP algorithm performed with moderate precision (0.45, 0.73, and 1.0) at 3 sites. The sensitivity and specificity of the NLP algorithm varied across 3 sites (sensitivity: 0.68, 0.85, and 0.96; specificity: 0.69, 0.89, and 1.0).
Discussion: The performance of this NLP algorithm to identify residential instability in 3 different healthcare systems suggests the algorithm is generally valid and applicable in other healthcare systems with similar EHRs.
Conclusion: The NLP approach developed in this project is adaptable and can be modified to extract types of social needs other than residential instability from EHRs across different healthcare systems.
Lay Summary: We evaluated the performance of a natural language processing (NLP) algorithm to extract markers of residential instability (ie, homelessness and housing insecurity) from electronic health records (EHRs) of 3 healthcare systems. We included patients 18 years and older who received care at 1 of 3 healthcare systems from 2016 through 2020 and had at least 1 free-text note in the EHR during this period. We conducted the study independently; the NLP algorithm logic and method of validity assessment were identical across sites. The approach to the development of the gold standard for assessment of validity differed across sites. We created a rule-based NLP system made up of expert-developed patterns indicating residential instability at the lead site and enriched the NLP system using insight gained from its application at the other 2 sites. We assessed the performance of the algorithm by measures of positive predictive value (PPV), sensitivity, and specificity. The NLP algorithm performed with moderate PPV (0.45, 0.73, and 1.0), sensitivity (0.68, 0.85, and 0.96), and specificity (0.69, 0.89, and 1.0) across the 3 sites. Our findings suggest that this approach to extracting information on social needs from the free-text EHR notes is generally applicable in other healthcare systems with similar EHRs.
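Illustrative note: the abstract describes flagging notes with rule-based patterns and scoring the flags against a gold standard by PPV, sensitivity, and specificity. The sketch below shows that general workflow only. It is not the authors' algorithm: it uses the standard-library `re` module as a stand-in for spaCy's EntityRuler, and the patterns, the example notes, the gold labels, and the function names `flag_note` and `validity` are all hypothetical, since this record does not list the study's expert-developed patterns.

```python
import re

# Hypothetical phrase patterns, invented for illustration only.
PATTERNS = [
    r"\bhomeless(ness)?\b",
    r"\bliving in (a |the |his |her |their )?(car|shelter)\b",
    r"\bevict(ed|ion)\b",
]
RULES = [re.compile(p, re.IGNORECASE) for p in PATTERNS]


def flag_note(text):
    """Return True if any rule matches the note (no negation or context handling)."""
    return any(rule.search(text) for rule in RULES)


def validity(predicted, gold):
    """Compute PPV, sensitivity, and specificity from paired boolean labels."""
    pairs = list(zip(predicted, gold))
    tp = sum(1 for p, g in pairs if p and g)
    fp = sum(1 for p, g in pairs if p and not g)
    fn = sum(1 for p, g in pairs if not p and g)
    tn = sum(1 for p, g in pairs if not p and not g)

    def ratio(num, den):
        # None signals an undefined measure (empty denominator).
        return num / den if den else None

    return {
        "ppv": ratio(tp, tp + fp),          # precision
        "sensitivity": ratio(tp, tp + fn),  # recall
        "specificity": ratio(tn, tn + fp),
    }


# Invented notes with invented gold-standard labels (True = residential instability).
NOTES = [
    ("Patient reports being homeless since March.", True),
    ("Pt was evicted last month, staying with friends.", True),
    ("Patient is couch surfing with relatives.", True),
    ("Lives with spouse in own home.", False),
    ("Volunteers at a homeless shelter on weekends.", False),
    ("No housing concerns reported.", False),
]

predicted = [flag_note(text) for text, _ in NOTES]
gold = [label for _, label in NOTES]
metrics = validity(predicted, gold)
print(metrics)
```

In this toy set the "couch surfing" note is missed and the volunteer note is flagged in error, which illustrates why a pattern list may need enrichment and context rules when moved to a new site.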
- Is Part Of:
- JAMIA open. Volume 5:Issue 1(2022)
- Journal:
- JAMIA open
- Issue:
- Volume 5:Issue 1(2022)
- Issue Display:
- Volume 5, Issue 1 (2022)
- Year:
- 2022
- Volume:
- 5
- Issue:
- 1
- Issue Sort Value:
- 2022-0005-0001-0000
- Page Start:
- Page End:
- Publication Date:
- 2022-02-16
- Subjects:
- natural language processing -- electronic health record -- homelessness -- housing insecurity -- social determinants of health
Medical informatics -- Periodicals
610.285
- Journal URLs:
- http://www.oxfordjournals.org/
https://academic.oup.com/jamiaopen
- DOI:
- 10.1093/jamiaopen/ooac006
- Languages:
- English
- ISSNs:
- 2574-2531
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 20969.xml