Correcting gene expression data when neither the unwanted variation nor the factor of interest are observed. (17th August 2015)
- Record Type:
- Journal Article
- Title:
- Correcting gene expression data when neither the unwanted variation nor the factor of interest are observed. (17th August 2015)
- Main Title:
- Correcting gene expression data when neither the unwanted variation nor the factor of interest are observed
- Authors:
- Jacob, Laurent
Gagnon-Bartsch, Johann A.
Speed, Terence P.
- Abstract:
- When dealing with large scale gene expression studies, observations are commonly contaminated by sources of unwanted variation such as platforms or batches. Not taking this unwanted variation into account when analyzing the data can lead to spurious associations and to missing important signals. When the analysis is unsupervised, e.g. when the goal is to cluster the samples or to build a corrected version of the dataset (as opposed to the study of an observed factor of interest), taking unwanted variation into account can become a difficult task. The factors driving unwanted variation may be correlated with the unobserved factor of interest, so that correcting for the former can remove the latter if not done carefully. We show how negative control genes and replicate samples can be used to estimate unwanted variation in gene expression, and discuss how this information can be used to correct the expression data. The proposed methods are then evaluated on synthetic data and three gene expression datasets. They generally manage to remove unwanted variation without losing the signal of interest and compare favorably to state-of-the-art corrections. All proposed methods are implemented in the Bioconductor package RUVnormalize.
- Is Part Of:
- Biostatistics. Volume 17:Number 1(2016:Jan.)
- Journal:
- Biostatistics
- Issue:
- Volume 17:Number 1(2016:Jan.)
- Issue Display:
- Volume 17, Issue 1 (2016)
- Year:
- 2016
- Volume:
- 17
- Issue:
- 1
- Issue Sort Value:
- 2016-0017-0001-0000
- Page Start:
- 16
- Page End:
- 28
- Publication Date:
- 2015-08-17
- Subjects:
- Batch effect -- Control genes -- Gene expression -- Normalization -- Replicate samples
Medical statistics -- Periodicals
Biometry -- Periodicals
Health risk assessment -- Periodicals
Medicine -- Research -- Statistical methods -- Periodicals
610.727
- Journal URLs:
- http://www3.oup.co.uk/biosts
http://ukcatalogue.oup.com/
- DOI:
- 10.1093/biostatistics/kxv026
- Languages:
- English
- ISSNs:
- 1465-4644
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 2089.628000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 14248.xml