Enrichment sampling for a multi-site patient survey using electronic health records and census data. (27th December 2018)
- Record Type:
- Journal Article
- Title:
- Enrichment sampling for a multi-site patient survey using electronic health records and census data. (27th December 2018)
- Main Title:
- Enrichment sampling for a multi-site patient survey using electronic health records and census data
- Authors:
- Mercaldo, Nathaniel D
Brothers, Kyle B
Carrell, David S
Clayton, Ellen W
Connolly, John J
Holm, Ingrid A
Horowitz, Carol R
Jarvik, Gail P
Kitchner, Terrie E
Li, Rongling
McCarty, Catherine A
McCormick, Jennifer B
McManus, Valerie D
Myers, Melanie F
Pankratz, Joshua J
Shrubsole, Martha J
Smith, Maureen E
Stallings, Sarah C
Williams, Janet L
Schildcrout, Jonathan S
- Abstract:
- Objective: We describe a stratified sampling design that combines electronic health records (EHRs) and United States Census (USC) data to construct the sampling frame and an algorithm to enrich the sample with individuals belonging to rarer strata.
Materials and Methods: This design was developed for a multi-site survey that sought to examine patient concerns about and barriers to participating in research studies, especially among under-studied populations (eg, minorities, low educational attainment). We defined sampling strata by cross-tabulating several socio-demographic variables obtained from EHR and augmented with census-block-level USC data. We oversampled rarer and historically underrepresented subpopulations.
Results: The sampling strategy, which included USC-supplemented EHR data, led to a far more diverse sample than would have been expected under random sampling (eg, 3-, 8-, 7-, and 12-fold increase in African Americans, Asians, Hispanics and those with less than a high school degree, respectively). We observed that our EHR data tended to misclassify minority races more often than majority races, and that non-majority races, Latino ethnicity, younger adult age, lower education, and urban/suburban living were each associated with lower response rates to the mailed surveys.
Discussion: We observed substantial enrichment from rarer subpopulations. The magnitude of the enrichment depends on the accuracy of the variables that define the sampling strata and the overall response rate.
Conclusion: EHR and USC data may be used to define sampling strata that in turn may be used to enrich the final study sample. This design may be of particular interest for studies of rarer and understudied populations.
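The abstract describes oversampling rarer strata but does not spell out the enrichment algorithm, so the following is only a minimal sketch of one common approach: spread the target sample size as evenly as possible across strata, cap each stratum at its frame count, and redistribute the shortfall. The function names (`allocate`, `draw_sample`), the stratum labels, and the frame layout are hypothetical and are not taken from the article.

```python
import random
from collections import defaultdict


def allocate(stratum_sizes, n_total):
    """Spread n_total as evenly as possible across strata.

    Each stratum is capped at its frame count; any shortfall is
    redistributed to strata that still have unsampled members.
    (Illustrative assumption, not the article's algorithm.)
    """
    alloc = {s: 0 for s in stratum_sizes}
    remaining = n_total
    open_strata = [s for s in stratum_sizes if stratum_sizes[s] > 0]
    while remaining > 0 and open_strata:
        share = max(1, remaining // len(open_strata))
        # Serve the smallest strata first so they are filled before larger ones.
        for s in sorted(open_strata, key=lambda k: stratum_sizes[k]):
            if remaining == 0:
                break
            take = min(share, stratum_sizes[s] - alloc[s], remaining)
            alloc[s] += take
            remaining -= take
        open_strata = [s for s in open_strata if alloc[s] < stratum_sizes[s]]
    return alloc


def draw_sample(frame, n_total, seed=0):
    """Draw a stratified sample from (person_id, stratum) pairs.

    Returns (person_id, stratum, design_weight) tuples, where the design
    weight is frame count / sample count for that stratum.
    """
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for person_id, stratum in frame:
        by_stratum[stratum].append(person_id)
    sizes = {s: len(ids) for s, ids in by_stratum.items()}
    alloc = allocate(sizes, n_total)
    sample = []
    for s, ids in by_stratum.items():
        if alloc[s] == 0:
            continue
        weight = sizes[s] / alloc[s]
        for pid in rng.sample(ids, alloc[s]):
            sample.append((pid, s, weight))
    return sample


if __name__ == "__main__":
    # Hypothetical frame: one common stratum and two rarer ones.
    sizes = {"A": 900, "B": 90, "C": 10}
    print(allocate(sizes, 60))  # {'A': 25, 'B': 25, 'C': 10}
```

In this toy frame, stratum C makes up 1% of the frame but about 17% of the sample. The design weights allow estimates to be re-weighted back to the frame population. As the abstract notes, the enrichment achieved in practice also depends on how accurately the stratum variables are recorded and on response rates, neither of which this sketch models.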
- Is Part Of:
- Journal of the American Medical Informatics Association. Volume 26:Number 3(2019)
- Journal:
- Journal of the American Medical Informatics Association
- Issue:
- Volume 26:Number 3(2019)
- Issue Display:
- Volume 26, Issue 3 (2019)
- Year:
- 2019
- Volume:
- 26
- Issue:
- 3
- Issue Sort Value:
- 2019-0026-0003-0000
- Page Start:
- 219
- Page End:
- 227
- Publication Date:
- 2018-12-27
- Subjects:
- enrichment sampling -- electronic health records -- census data
Medical informatics -- Periodicals
Information Services -- Periodicals
Medical Informatics -- Periodicals
Médecine -- Informatique -- Périodiques
Informatica
Geneeskunde
Informatique médicale
Computer network resources
Electronic journals
610.285
- Journal URLs:
- http://jamia.bmj.com/
http://www.jamia.org
http://www.pubmedcentral.nih.gov/tocrender.fcgi?journal=76
http://www.sciencedirect.com/science/journal/10675027
http://jamia.oxfordjournals.org/
http://www.oxfordjournals.org/en/
- DOI:
- 10.1093/jamia/ocy164
- Languages:
- English
- ISSNs:
- 1067-5027
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 4689.025000
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 15444.xml