What can millions of laboratory test results tell us about the temporal aspect of data quality? Study of data spanning 17 years in a clinical data warehouse. (November 2019)
- Record Type:
- Journal Article
- Title:
- What can millions of laboratory test results tell us about the temporal aspect of data quality? Study of data spanning 17 years in a clinical data warehouse. (November 2019)
- Main Title:
- What can millions of laboratory test results tell us about the temporal aspect of data quality? Study of data spanning 17 years in a clinical data warehouse
- Authors:
- Looten, Vincent
Kong Win Chang, Liliane
Neuraz, Antoine
Landau-Loriot, Marie-Anne
Vedie, Benoit
Paul, Jean-Louis
Mauge, Laëtitia
Rivet, Nadia
Bonifati, Angela
Chatellier, Gilles
Burgun, Anita
Rance, Bastien
- Abstract:
- Highlights: Data quality is crucial for secondary use. Laboratory data are exposed to non-biological influences causing changes in their distributions. We propose a nomenclature of temporal patterns present in laboratory data collected from clinical data warehouses. We provide an automated method to detect temporal patterns in laboratory data. Temporal patterns can have an impact on secondary use (including on statistical analysis); we explore solutions for data contextualization.
Abstract: Objective: To identify common temporal evolution profiles in biological data and propose a semi-automated method to identify these patterns in a clinical data warehouse (CDW). Materials and Methods: We leveraged the CDW of the European Hospital Georges Pompidou and tracked the evolution of 192 biological parameters over a period of 17 years (for 445,000+ patients and 131 million laboratory test results). Results: We identified three common profiles of evolution: discretization, breakpoints, and trends. We developed computational and statistical methods to identify these profiles in the CDW. Overall, of the 192 observed biological parameters (87,814,136 values), 135 presented at least one evolution. We identified breakpoints in 30 distinct parameters, discretizations in 32, and trends in 79. Discussion and conclusion: Our method allowed the identification of several temporal events in the data. Considering the distribution over time of these events, we identified probable causes for the observed profiles: instrument or software upgrades and changes in computation formulas. We evaluated the potential impact for data reuse. Finally, we formulated recommendations to enable safe use and sharing of biological data collections and to limit the impact of data evolution in retrospective and federated studies (e.g., the annotation of laboratory parameters presenting breakpoints or trends).
- Is Part Of:
- Computer methods and programs in biomedicine. Volume 181(2020)
- Journal:
- Computer methods and programs in biomedicine
- Issue:
- Volume 181(2020)
- Issue Display:
- Volume 181, Issue 2020 (2020)
- Year:
- 2020
- Volume:
- 181
- Issue:
- 2020
- Issue Sort Value:
- 2020-0181-2020-0000
- Page Start:
- Page End:
- Publication Date:
- 2019-11
- Subjects:
- Quality control -- Computational biology/methods* -- Information storage and retrieval -- Humans -- Clinical laboratory information systems
Medicine -- Computer programs -- Periodicals
Biology -- Computer programs -- Periodicals
Computers -- Periodicals
Medicine -- Periodicals
Médecine -- Logiciels -- Périodiques
Biologie -- Logiciels -- Périodiques
Biology -- Computer programs
Medicine -- Computer programs
Periodicals
Electronic journals
610.28
- Journal URLs:
- http://www.sciencedirect.com/science/journal/01692607
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.cmpb.2018.12.030
- Languages:
- English
- ISSNs:
- 0169-2607
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.095000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 12168.xml