Automated ICD coding via unsupervised knowledge integration (UNITE). (July 2020)
- Record Type:
- Journal Article
- Title:
- Automated ICD coding via unsupervised knowledge integration (UNITE). (July 2020)
- Main Title:
- Automated ICD coding via unsupervised knowledge integration (UNITE)
- Authors:
- Sonabend W, Aaron
Cai, Winston
Ahuja, Yuri
Ananthakrishnan, Ashwin
Xia, Zongqi
Yu, Sheng
Hong, Chuan
- Abstract:
- Highlights: Unsupervised ICD coding without requiring human labor. Analyzing clinical narrative notes via semantic relevance assessment. Stable performance and high portability across EMRs in different institutions.
Abstract: Objective: Accurate coding is critical for medical billing and electronic medical record (EMR)-based research. Recent research has focused on developing supervised methods to automatically assign International Classification of Diseases (ICD) codes from clinical notes. However, supervised approaches rely on ICD code data stored in the hospital EMR system and are subject to bias arising from the practice and coding behavior. Consequently, portability of trained supervised algorithms to external EMR systems may suffer.
Method: We developed an unsupervised knowledge integration (UNITE) algorithm to automatically assign ICD codes for a specific disease by analyzing clinical narrative notes via semantic relevance assessment. The algorithm was validated using coded ICD data for 6 diseases from Partners HealthCare (PHS) Biobank and Medical Information Mart for Intensive Care (MIMIC-III). We compared the performance of UNITE against penalized logistic regression (LR), topic modeling, and neural network models within each EMR system. We additionally evaluated the portability of UNITE by training at PHS Biobank and validating at MIMIC-III, and vice versa.
Results: UNITE achieved an average AUC of 0.91 at PHS and 0.92 at MIMIC over 6 diseases, comparable to LR and MLP. It had substantially better performance than topic models. With regard to portability, the performance of UNITE was consistent across different EMR systems, superior to LR, topic models and neural network models.
Conclusion: UNITE accurately assigns ICD codes in EMR without requiring human labor, and has major advantages over commonly used machine learning approaches. In addition, UNITE attained stable performance and high portability across EMRs in different institutions.
- Is Part Of:
- International journal of medical informatics. Volume 139(2020)
- Journal:
- International journal of medical informatics
- Issue:
- Volume 139(2020)
- Issue Display:
- Volume 139, Issue 2020 (2020)
- Year:
- 2020
- Volume:
- 139
- Issue:
- 2020
- Issue Sort Value:
- 2020-0139-2020-0000
- Page Start:
- Page End:
- Publication Date:
- 2020-07
- Subjects:
- Automated ICD assignment -- Electronic medical records -- Portability -- Semantic embedding -- Knowledge integration -- Unsupervised learning
Medical informatics -- Periodicals
Information science -- Periodicals
Computers -- Periodicals
Medical technology -- Periodicals
Medical Informatics -- Periodicals
Technology, Medical -- Periodicals
Computers
Information science
Medical informatics
Medical technology
Electronic journals
Periodicals
610.285
- Journal URLs:
- http://www.sciencedirect.com/science/journal/13865056
http://www.clinicalkey.com/dura/browse/journalIssue/13865056
http://www.clinicalkey.com.au/dura/browse/journalIssue/13865056
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.ijmedinf.2020.104135
- Languages:
- English
- ISSNs:
- 1386-5056
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 4542.345250
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 14587.xml