Cleaning of anthropometric data from PCORnet electronic health records using automated algorithms. Issue 4 (2nd November 2022)
- Record Type:
- Journal Article
- Title:
- Cleaning of anthropometric data from PCORnet electronic health records using automated algorithms. Issue 4 (2nd November 2022)
- Main Title:
- Cleaning of anthropometric data from PCORnet electronic health records using automated algorithms
- Authors:
- Lin, Pi-I D
Rifas-Shiman, Sheryl L
Aris, Izzuddin M
Daley, Matthew F
Janicke, David M
Heerman, William J
Chudnov, Daniel L
Freedman, David S
Block, Jason P
- Abstract:
- Objective: To demonstrate the utility of growthcleanr, an anthropometric data cleaning method designed for electronic health records (EHR).
Materials and Methods: We used all available pediatric and adult height and weight data from an ongoing observational study that includes EHR data from 15 healthcare systems and applied growthcleanr to identify outliers and errors and compared its performance in pediatric data with 2 other pediatric data cleaning methods: (1) conditional percentile (cp) and (2) PaEdiatric ANthropometric measurement Outlier Flagging pipeline (peanof).
Results: 687 226 children (<20 years) and 3 267 293 adults contributed 71 246 369 weight and 51 525 487 height measurements. growthcleanr flagged 18% of pediatric and 12% of adult measurements for exclusion, mostly as carried-forward measures for pediatric data and duplicates for adult and pediatric data. After removing the flagged measurements, 0.5% and 0.6% of the pediatric heights and weights and 0.3% and 1.4% of the adult heights and weights, respectively, were biologically implausible according to the CDC and other established cut points. Compared with other pediatric cleaning methods, growthcleanr flagged the most measurements for exclusion; however, it did not flag some more extreme measurements. The prevalence of severe pediatric obesity was 9.0%, 9.2%, and 8.0% after cleaning by growthcleanr, cp, and peanof, respectively.
Conclusion: growthcleanr is useful for cleaning pediatric and adult height and weight data. It is the only method with the ability to clean adult data and identify carried-forward and duplicates, which are prevalent in EHR. Findings of this study can be used to improve the growthcleanr algorithm.
Lay Summary: Anthropometric data from electronic health records (EHR) provides straightforward access to large amounts of data for conducting clinical research. However, data errors can potentially bias study results. In this study, we demonstrate the utility of an automated anthropometric data cleaning method, growthcleanr, on more than 12 million pediatric and adult height and weight measurements. Using growthcleanr, we flagged 18% of pediatric and 12% of adult measurements for exclusion, mostly because measures were inappropriately carried forward from prior visits for children or were duplicates for both adults and children. After removing the flagged measurements, 0.5% and 0.6% of the pediatric heights and weights and 0.3% and 1.4% of the adult heights and weights, respectively, were considered biologically implausible according to established cut points. We compared the results of growthcleanr for pediatrics to other existing pediatric data cleaning methods; growthcleanr flagged the most measurements for exclusion but failed to flag some more extreme measurements because growthcleanr has a broad inclusion criterion that can be refined for every study. Our study showed that growthcleanr is useful for cleaning pediatric and adult height and weight data. It is the only method with the ability to clean adult data and identify measures that are carried-forward and duplicated, both of which are common issues with EHR data. Findings of this study can be used to improve the growthcleanr algorithm.
- Is Part Of:
- JAMIA open. Volume 5:Issue 4(2022)
- Journal:
- JAMIA open
- Issue:
- Volume 5:Issue 4(2022)
- Issue Display:
- Volume 5, Issue 4 (2022)
- Year:
- 2022
- Volume:
- 5
- Issue:
- 4
- Issue Sort Value:
- 2022-0005-0004-0000
- Page Start:
- Page End:
- Publication Date:
- 2022-11-02
- Subjects:
- electronic health record -- automatic data processing -- body height -- body weight -- PCORnet -- big data
Medical informatics -- Periodicals
610.285
- Journal URLs:
- http://www.oxfordjournals.org/
https://academic.oup.com/jamiaopen
- DOI:
- 10.1093/jamiaopen/ooac089
- Languages:
- English
- ISSNs:
- 2574-2531
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 24267.xml