Assisted transcriptome reconstruction and splicing orthology. (November 2016)
- Record Type:
- Journal Article
- Title:
- Assisted transcriptome reconstruction and splicing orthology. (November 2016)
- Main Title:
- Assisted transcriptome reconstruction and splicing orthology
- Authors:
- Blanquart, Samuel
Varré, Jean-Stéphane
Guertin, Paul
Perrin, Amandine
Bergeron, Anne
Swenson, Krister - Abstract:
- Abstract Background Transcriptome reconstruction, defined as the identification of all protein isoforms that may be expressed by a gene, is a notably difficult computational task. With real data, the best methods based on RNA-seq data identify barely 21 % of the expressed transcripts. While waiting for algorithms and sequencing techniques to improve — as has been strongly suggested in the literature — it is important to evaluateassisted transcriptome prediction ; this is the question of how alternative transcription in one species performs as a predictor of protein isoforms in another relatively close species. Most evidence-based gene predictors use transcripts from other species to annotate a genome, but the predictive power of procedures that use exclusively transcripts from external species has never been quantified. The cornerstone of such an evaluation is the correct identification of pairs of transcripts with the same splicing patterns, calledsplicing orthologs . Results We propose a rigorous procedural definition of splicing orthologs, based on the identification of all ortholog pairs of splicing sites in the nucleotide sequences, and alignments at the protein level. Using our definition, we compared 24 382 human transcripts and 17 909 mouse transcripts from the highly curated CCDS database, and identified 11 122 splicing orthologs. In prediction mode, we show that human transcripts can be used to infer over 62 % of mouse protein isoforms. When restricting theAbstract Background Transcriptome reconstruction, defined as the identification of all protein isoforms that may be expressed by a gene, is a notably difficult computational task. With real data, the best methods based on RNA-seq data identify barely 21 % of the expressed transcripts. While waiting for algorithms and sequencing techniques to improve — as has been strongly suggested in the literature — it is important to evaluateassisted transcriptome prediction ; this is the question of how alternative transcription in one species performs as a predictor of protein isoforms in another relatively close species. Most evidence-based gene predictors use transcripts from other species to annotate a genome, but the predictive power of procedures that use exclusively transcripts from external species has never been quantified. The cornerstone of such an evaluation is the correct identification of pairs of transcripts with the same splicing patterns, calledsplicing orthologs . Results We propose a rigorous procedural definition of splicing orthologs, based on the identification of all ortholog pairs of splicing sites in the nucleotide sequences, and alignments at the protein level. Using our definition, we compared 24 382 human transcripts and 17 909 mouse transcripts from the highly curated CCDS database, and identified 11 122 splicing orthologs. In prediction mode, we show that human transcripts can be used to infer over 62 % of mouse protein isoforms. When restricting the predictions to transcripts known eight years ago, the percentage grows to 74 %. Using CCDS timestamped releases, we also analyze the evolution of the number of splicing orthologs over the last decade. Conclusions Alternative splicing is now recognized to play a major role in the protein diversity of eukaryotic organisms, but definitions of spliced isoform orthologs are still approximate. Here we propose a definition adapted to the subtle variations of conserved alternative splicing sites, and use it to validate numerous accurate orthologous isoform predictions. … (more)
- Is Part Of:
- BMC genomics. Volume 17:Number 10(2016)
- Journal:
- BMC genomics
- Issue:
- Volume 17:Number 10(2016)
- Issue Display:
- Volume 17, Issue 10 (2016)
- Year:
- 2016
- Volume:
- 17
- Issue:
- 10
- Issue Sort Value:
- 2016-0017-0010-0000
- Page Start:
- 157
- Page End:
- 164
- Publication Date:
- 2016-11
- Subjects:
- Transcriptome prediction -- Splicing orthologs -- Eukaryotes
Genomes -- Periodicals
Gene mapping -- Periodicals
Genomics -- Periodicals
Base Sequence -- Periodicals
Chromosome Mapping -- Periodicals
Genetic Techniques -- Periodicals
Sequence Analysis, DNA -- Periodicals
572.8605 - Journal URLs:
- http://www.biomedcentral.com/bmcgenomics/ ↗
http://www.pubmedcentral.nih.gov/tocrender.fcgi?journal=32 ↗
http://link.springer.com/ ↗ - DOI:
- 10.1186/s12864-016-3103-6 ↗
- Languages:
- English
- ISSNs:
- 1471-2164
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms) ↗
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store - Ingest File:
- 10045.xml