MARK-AGE data management: Cleaning, exploration and visualization of data. (November 2015)
- Record Type:
- Journal Article
- Title:
- MARK-AGE data management: Cleaning, exploration and visualization of data. (November 2015)
- Main Title:
- MARK-AGE data management: Cleaning, exploration and visualization of data
- Authors:
- Baur, Jennifer
Moreno-Villanueva, Maria
Kötter, Tobias
Sindlinger, Thilo
Bürkle, Alexander
Berthold, Michael R.
Junk, Michael
- Abstract:
- Highlights: MARK-AGE was a European study to identify a powerful set of biomarkers of human aging. A large body of data has been collected in the MARK-AGE database. We present the methods applied for dealing with errors in the database. We describe the detection of missing values, outliers and batch effects. We describe our tools for data exploration and data sharing within the consortium.
Abstract: Databases are organized collections of data and are necessary to investigate a wide spectrum of research questions. When evaluating data, analysts should be aware of possible data quality problems that can compromise the validity of results. Data cleaning, which deals with the identification and correction of errors in order to improve data quality, is therefore an essential part of the data management process. In our cross-sectional study, biomarkers of ageing and analytical, anthropometric and demographic data from about 3000 volunteers have been collected in the MARK-AGE database. Although several preventive strategies were applied before data entry, errors such as miscoding, missing values and batch problems could not be avoided completely. Such errors can result in misleading information and affect the validity of the data analysis performed. Here we present an overview of the methods we applied for dealing with errors in the MARK-AGE database. In particular, we describe our strategies for the detection of missing values, outliers and batch effects and explain how they can be handled to improve data quality. Finally, we report on the tools used for data exploration and data sharing between MARK-AGE collaborators.
- Is Part Of:
- Mechanisms of ageing and development. Volume 151(2015)
- Journal:
- Mechanisms of ageing and development
- Issue:
- Volume 151(2015)
- Issue Display:
- Volume 151, Issue 2015 (2015)
- Year:
- 2015
- Volume:
- 151
- Issue:
- 2015
- Issue Sort Value:
- 2015-0151-2015-0000
- Page Start:
- 38
- Page End:
- 44
- Publication Date:
- 2015-11
- Subjects:
- Data cleaning -- Missing data -- Batch effects -- Outliers -- Data visualization
Aging -- Periodicals
Developmental biology -- Periodicals
Aging -- Periodicals
Developmental Biology -- Periodicals
Vieillissement -- Périodiques
Biologie du développement -- Périodiques
Aging
Developmental biology
Periodicals
612.67
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00476374
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.mad.2015.05.007
- Languages:
- English
- ISSNs:
- 0047-6374
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 5424.571000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 424.xml