A longitudinal analysis of data quality in a large pediatric data research network. (8th April 2017)
- Record Type:
- Journal Article
- Title:
- A longitudinal analysis of data quality in a large pediatric data research network. (8th April 2017)
- Main Title:
- A longitudinal analysis of data quality in a large pediatric data research network
- Authors:
- Khare, Ritu
Utidjian, Levon
Ruth, Byron J
Kahn, Michael G
Burrows, Evanette
Marsolo, Keith
Patibandla, Nandan
Razzaghi, Hanieh
Colvin, Ryan
Ranade, Daksha
Kitzmiller, Melody
Eckrich, Daniel
Bailey, L Charles
- Abstract:
- Objective: PEDSnet is a clinical data research network (CDRN) that aggregates electronic health record data from multiple children's hospitals to enable large-scale research. Assessing data quality to ensure suitability for conducting research is a key requirement in PEDSnet. This study presents a range of data quality issues identified over a period of 18 months and interprets them to evaluate the research capacity of PEDSnet.
Materials and Methods: Results were generated by a semiautomated data quality assessment workflow. Two investigators reviewed programmatic data quality issues and conducted discussions with the data partners' extract-transform-load analysts to determine the cause for each issue.
Results: The results include a longitudinal summary of 2182 data quality issues identified across 9 data submission cycles. The metadata from the most recent cycle includes annotations for 850 issues: most frequent types, including missing data (>300) and outliers (>100); most complex domains, including medications (>160) and lab measurements (>140); and primary causes, including source data characteristics (83%) and extract-transform-load errors (9%).
Discussion: The longitudinal findings demonstrate the network's evolution from identifying difficulties with aligning the data to a common data model to learning norms in clinical pediatrics and determining research capability.
Conclusion: While data quality is recognized as a critical aspect in establishing and utilizing a CDRN, the findings from data quality assessments are largely unpublished. This paper presents a real-world account of studying and interpreting data quality findings in a pediatric CDRN, and the lessons learned could be used by other CDRNs.
- Is Part Of:
- Journal of the American Medical Informatics Association. Volume 24:Number 6(2017:Nov.)
- Journal:
- Journal of the American Medical Informatics Association
- Issue:
- Volume 24:Number 6(2017:Nov.)
- Issue Display:
- Volume 24, Issue 6 (2017)
- Year:
- 2017
- Volume:
- 24
- Issue:
- 6
- Issue Sort Value:
- 2017-0024-0006-0000
- Page Start:
- 1072
- Page End:
- 1079
- Publication Date:
- 2017-04-08
- Subjects:
- CDRN -- data quality -- electronic health record -- extract-transform-load -- secondary use
Medical informatics -- Periodicals
Information Services -- Periodicals
Medical Informatics -- Periodicals
Médecine -- Informatique -- Périodiques
Informatica
Geneeskunde
Informatique médicale
Computer network resources
Electronic journals
610.285
- Journal URLs:
- http://jamia.bmj.com/
http://www.jamia.org
http://www.pubmedcentral.nih.gov/tocrender.fcgi?journal=76
http://www.sciencedirect.com/science/journal/10675027
http://jamia.oxfordjournals.org/
http://www.oxfordjournals.org/en/
- DOI:
- 10.1093/jamia/ocx033
- Languages:
- English
- ISSNs:
- 1067-5027
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 4689.025000
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 15163.xml