PennSeq: accurate isoform-specific gene expression quantification in RNA-Seq by modeling non-uniform read distribution. Issue 3 (20th December 2013)
- Record Type:
- Journal Article
- Title:
- PennSeq: accurate isoform-specific gene expression quantification in RNA-Seq by modeling non-uniform read distribution. Issue 3 (20th December 2013)
- Main Title:
- PennSeq: accurate isoform-specific gene expression quantification in RNA-Seq by modeling non-uniform read distribution
- Authors:
- Hu, Yu
Liu, Yichuan
Mao, Xianyun
Jia, Cheng
Ferguson, Jane F.
Xue, Chenyi
Reilly, Muredach P.
Li, Hongzhe
Li, Mingyao
- Abstract:
- Correctly estimating isoform-specific gene expression is important for understanding complicated biological mechanisms and for mapping disease susceptibility genes. However, estimating isoform-specific gene expression is challenging because various biases present in RNA-Seq (RNA sequencing) data complicate the analysis, and if not appropriately corrected, can affect isoform expression estimation and downstream analysis. In this article, we present PennSeq, a statistical method that allows each isoform to have its own non-uniform read distribution. Instead of making parametric assumptions, we give adequate weight to the underlying data by the use of a non-parametric approach. Our rationale is that regardless what factors lead to non-uniformity, whether it is due to hexamer priming bias, local sequence bias, positional bias, RNA degradation, mapping bias or other unknown reasons, the probability that a fragment is sampled from a particular region will be reflected in the aligned data. This empirical approach thus maximally reflects the true underlying non-uniform read distribution. We evaluate the performance of PennSeq using both simulated data with known ground truth, and using two real Illumina RNA-Seq data sets including one with quantitative real time polymerase chain reaction measurements. Our results indicate superior performance of PennSeq over existing methods, particularly for isoforms demonstrating severe non-uniformity. PennSeq is freely available for download at http://sourceforge.net/projects/pennseq .
- Is Part Of:
- Nucleic acids research. Volume 42:Issue 3(2014)
- Journal:
- Nucleic acids research
- Issue:
- Volume 42:Issue 3(2014)
- Issue Display:
- Volume 42, Issue 3 (2014)
- Year:
- 2014
- Volume:
- 42
- Issue:
- 3
- Issue Sort Value:
- 2014-0042-0003-0000
- Page Start:
- e20
- Page End:
- e20
- Publication Date:
- 2013-12-20
- Subjects:
- Nucleic acids -- Periodicals
Molecular biology -- Periodicals
572.805
- Journal URLs:
- http://nar.oxfordjournals.org/
http://www.ncbi.nlm.nih.gov/pmc/journals/4
http://ukcatalogue.oup.com/
http://firstsearch.oclc.org
- DOI:
- 10.1093/nar/gkt1304
- Languages:
- English
- ISSNs:
- 0305-1048
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 6183.850000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 22042.xml