Assessing the quality of clinical and administrative data extracted from hospitals: the General Medicine Inpatient Initiative (GEMINI) experience. (4th November 2020)
- Record Type:
- Journal Article
- Title:
- Assessing the quality of clinical and administrative data extracted from hospitals: the General Medicine Inpatient Initiative (GEMINI) experience. (4th November 2020)
- Main Title:
- Assessing the quality of clinical and administrative data extracted from hospitals: the General Medicine Inpatient Initiative (GEMINI) experience
- Authors:
- Verma, Amol A
Pasricha, Sachin V
Jung, Hae Young
Kushnir, Vladyslav
Mak, Denise Y F
Koppula, Radha
Guo, Yishan
Kwan, Janice L
Lapointe-Shaw, Lauren
Rawal, Shail
Tang, Terence
Weinerman, Adina
Razak, Fahad - Abstract:
- Abstract: Objective: Large clinical databases are increasingly used for research and quality improvement. We describe an approach to data quality assessment from the General Medicine Inpatient Initiative (GEMINI), which collects and standardizes administrative and clinical data from hospitals. Methods: The GEMINI database contained 245 559 patient admissions at 7 hospitals in Ontario, Canada from 2010 to 2017. We performed 7 computational data quality checks and iteratively re-extracted data from hospitals to correct problems. Thereafter, GEMINI data were compared to data that were manually abstracted from the hospital's electronic medical record for 23 419 selected data points on a sample of 7488 patients. Results: Computational checks flagged 103 potential data quality issues, which were either corrected or documented to inform future analysis. For example, we identified the inclusion of canceled radiology tests, a time shift of transfusion data, and mistakenly processing the chemical symbol for sodium ("Na") as a missing value. Manual validation identified 1 important data quality issue that was not detected by computational checks: transfusion dates and times at 1 site were unreliable. Apart from that single issue, across all data tables, GEMINI data had high overall accuracy (ranging from 98%–100%), sensitivity (95%–100%), specificity (99%–100%), positive predictive value (93%–100%), and negative predictive value (99%–100%) compared to the gold standard. Discussion andAbstract: Objective: Large clinical databases are increasingly used for research and quality improvement. We describe an approach to data quality assessment from the General Medicine Inpatient Initiative (GEMINI), which collects and standardizes administrative and clinical data from hospitals. Methods: The GEMINI database contained 245 559 patient admissions at 7 hospitals in Ontario, Canada from 2010 to 2017. We performed 7 computational data quality checks and iteratively re-extracted data from hospitals to correct problems. Thereafter, GEMINI data were compared to data that were manually abstracted from the hospital's electronic medical record for 23 419 selected data points on a sample of 7488 patients. Results: Computational checks flagged 103 potential data quality issues, which were either corrected or documented to inform future analysis. For example, we identified the inclusion of canceled radiology tests, a time shift of transfusion data, and mistakenly processing the chemical symbol for sodium ("Na") as a missing value. Manual validation identified 1 important data quality issue that was not detected by computational checks: transfusion dates and times at 1 site were unreliable. Apart from that single issue, across all data tables, GEMINI data had high overall accuracy (ranging from 98%–100%), sensitivity (95%–100%), specificity (99%–100%), positive predictive value (93%–100%), and negative predictive value (99%–100%) compared to the gold standard. Discussion and Conclusion: Computational data quality checks with iterative re-extraction facilitated reliable data collection from hospitals but missed 1 critical quality issue. Combining computational and manual approaches may be optimal for assessing the quality of large multisite clinical databases. … (more)
- Is Part Of:
- Journal of the American Medical Informatics Association. Volume 28:Number 3(2021)
- Journal:
- Journal of the American Medical Informatics Association
- Issue:
- Volume 28:Number 3(2021)
- Issue Display:
- Volume 28, Issue 3 (2021)
- Year:
- 2021
- Volume:
- 28
- Issue:
- 3
- Issue Sort Value:
- 2021-0028-0003-0000
- Page Start:
- 578
- Page End:
- 587
- Publication Date:
- 2020-11-04
- Subjects:
- data quality -- data validation -- clinical databases -- data extraction, electronic medical records
Medical informatics -- Periodicals
Information Services -- Periodicals
Medical Informatics -- Periodicals
Médecine -- Informatique -- Périodiques
Informatica
Geneeskunde
Informatique médicale
Computer network resources
Electronic journals
610.285 - Journal URLs:
- http://jamia.bmj.com/ ↗
http://www.jamia.org ↗
http://www.pubmedcentral.nih.gov/tocrender.fcgi?journal=76 ↗
http://www.sciencedirect.com/science/journal/10675027 ↗
http://jamia.oxfordjournals.org/ ↗
http://www.oxfordjournals.org/en/ ↗ - DOI:
- 10.1093/jamia/ocaa225 ↗
- Languages:
- English
- ISSNs:
- 1067-5027
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms) ↗
- Physical Locations:
- British Library DSC - 4689.025000
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store - Ingest File:
- 17112.xml