Biases introduced by filtering electronic health records for patients with "complete data". (1st August 2017)
- Record Type:
- Journal Article
- Title:
- Biases introduced by filtering electronic health records for patients with "complete data". (1st August 2017)
- Main Title:
- Biases introduced by filtering electronic health records for patients with "complete data"
- Authors:
- Weber, Griffin M
Adams, William G
Bernstam, Elmer V
Bickel, Jonathan P
Fox, Kathe P
Marsolo, Keith
Raghavan, Vijay A
Turchin, Alexander
Zhou, Xiaobo
Murphy, Shawn N
Mandl, Kenneth D
- Abstract:
- Objective: One promise of nationwide adoption of electronic health records (EHRs) is the availability of data for large-scale clinical research studies. However, because the same patient could be treated at multiple health care institutions, data from only a single site might not contain the complete medical history for that patient, meaning that critical events could be missing. In this study, we evaluate how simple heuristic checks for data "completeness" affect the number of patients in the resulting cohort and introduce potential biases. Materials and Methods: We began with a set of 16 filters that check for the presence of demographics, laboratory tests, and other types of data, and then systematically applied all 2^16 possible combinations of these filters to the EHR data for 12 million patients at 7 health care systems and a separate payor claims database of 7 million members. Results: EHR data showed considerable variability in data completeness across sites and high correlation between data types. For example, the fraction of patients with diagnoses increased from 35.0% in all patients to 90.9% in those with at least 1 medication. An unrelated claims dataset independently showed that most filters select members who are older and more likely female and can eliminate large portions of the population whose data are actually complete. Discussion and Conclusion: As investigators design studies, they need to balance their confidence in the completeness of the data with the effects of placing requirements on the data on the resulting patient cohort. …
- Is Part Of:
- Journal of the American Medical Informatics Association. Volume 24:Number 6(2017:Nov.)
- Journal:
- Journal of the American Medical Informatics Association
- Issue:
- Volume 24:Number 6(2017:Nov.)
- Issue Display:
- Volume 24, Issue 6 (2017)
- Year:
- 2017
- Volume:
- 24
- Issue:
- 6
- Issue Sort Value:
- 2017-0024-0006-0000
- Page Start:
- 1134
- Page End:
- 1141
- Publication Date:
- 2017-08-01
- Subjects:
- electronic health records -- claims data -- data accuracy -- information storage and retrieval -- selection bias
Medical informatics -- Periodicals
Information Services -- Periodicals
Medical Informatics -- Periodicals
Médecine -- Informatique -- Périodiques
Informatica
Geneeskunde
Informatique médicale
Computer network resources
Electronic journals
610.285
- Journal URLs:
- http://jamia.bmj.com/
http://www.jamia.org
http://www.pubmedcentral.nih.gov/tocrender.fcgi?journal=76
http://www.sciencedirect.com/science/journal/10675027
http://jamia.oxfordjournals.org/
http://www.oxfordjournals.org/en/
- DOI:
- 10.1093/jamia/ocx071
- Languages:
- English
- ISSNs:
- 1067-5027
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 4689.025000
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 15171.xml
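
The abstract describes applying every combination of "completeness" filters and observing how cohort size and composition change. The following is a minimal sketch of that idea, not the authors' implementation: the patient records, the four filter names, and the helper functions `cohort`, `all_combinations`, `cohort_sizes`, and `fraction_with` are all hypothetical, and a toy set of 4 filters (2^4 = 16 combinations) stands in for the 16 filters (2^16 combinations) used in the study.

```python
from itertools import combinations

# Hypothetical toy data: each patient id maps to the data types present
# in that patient's record. Real EHR data would be far larger.
PATIENTS = {
    "p1": {"demographics", "diagnosis", "medication"},
    "p2": {"demographics"},
    "p3": {"demographics", "diagnosis"},
    "p4": {"demographics", "diagnosis", "medication", "lab"},
}

# Hypothetical filters: each one requires a data type to be present.
FILTERS = ["demographics", "diagnosis", "medication", "lab"]


def cohort(required):
    """Return the ids of patients whose record has every required data type."""
    req = set(required)
    return {pid for pid, present in PATIENTS.items() if req <= present}


def all_combinations(filters):
    """Yield every subset of the filters: 2 ** len(filters) in total."""
    for r in range(len(filters) + 1):
        for combo in combinations(filters, r):
            yield combo


def cohort_sizes(filters):
    """Map each filter combination to the size of the resulting cohort."""
    return {combo: len(cohort(combo)) for combo in all_combinations(filters)}


def fraction_with(data_type, required=()):
    """Fraction of the filtered cohort that also has the given data type."""
    base = cohort(required)
    if not base:
        return None  # empty cohort: the fraction is undefined
    return len(cohort(tuple(required) + (data_type,))) / len(base)


sizes = cohort_sizes(FILTERS)
baseline = fraction_with("diagnosis")                    # all patients
filtered = fraction_with("diagnosis", ("medication",))   # patients with a medication
```

Comparing `baseline` with `filtered` mirrors the kind of shift reported in the abstract, where requiring one data type changes the prevalence of another because the data types are correlated.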