Orthology Guided Assembly in highly heterozygous crops: creating a reference transcriptome to uncover genetic diversity in Lolium perenne. Issue 5 (21st February 2013)
- Record Type:
- Journal Article
- Title:
- Orthology Guided Assembly in highly heterozygous crops: creating a reference transcriptome to uncover genetic diversity in Lolium perenne. Issue 5 (21st February 2013)
- Main Title:
- Orthology Guided Assembly in highly heterozygous crops: creating a reference transcriptome to uncover genetic diversity in Lolium perenne
- Authors:
- Ruttink, Tom
Sterck, Lieven
Rohde, Antje
Bendixen, Christian
Rouzé, Pierre
Asp, Torben
Van de Peer, Yves
Roldan‐Ruiz, Isabel
- Abstract:
- Despite current advances in next‐generation sequencing data analysis procedures, de novo assembly of a reference sequence required for SNP discovery and expression analysis is still a major challenge in genetically uncharacterized, highly heterozygous species. High levels of polymorphism inherent to outbreeding crop species hamper De Bruijn Graph‐based de novo assembly algorithms, causing transcript fragmentation and the redundant assembly of allelic contigs. If multiple genotypes are sequenced to study genetic diversity, primary de novo assembly is best performed per genotype to limit the level of polymorphism and avoid transcript fragmentation. Here, we propose an Orthology Guided Assembly procedure that first uses sequence similarity (tBLASTn) to proteins of a model species to select allelic and fragmented contigs from all genotypes and then performs CAP3 clustering on a gene‐by‐gene basis. Thus, we simultaneously annotate putative orthologues for each protein of the model species, resolve allelic redundancy and fragmentation and create a de novo transcript sequence representing the consensus of all alleles present in the sequenced genotypes. We demonstrate the procedure using RNA‐seq data from 14 genotypes of Lolium perenne to generate a reference transcriptome for gene discovery and translational research, to reveal the transcriptome‐wide distribution and density of SNPs in an outbreeding crop and to illustrate the effect of polymorphisms on the assembly procedure. The results presented here illustrate that constructing a non‐redundant reference sequence is essential for comparative genomics, orthology‐based annotation and candidate gene selection but also for read mapping and subsequent polymorphism discovery and/or read count‐based gene expression analysis.
- Is Part Of:
- Plant biotechnology journal. Volume 11:Issue 5(2013:Jun.)
- Journal:
- Plant biotechnology journal
- Issue:
- Volume 11:Issue 5(2013:Jun.)
- Issue Display:
- Volume 11, Issue 5 (2013)
- Year:
- 2013
- Volume:
- 11
- Issue:
- 5
- Issue Sort Value:
- 2013-0011-0005-0000
- Page Start:
- 605
- Page End:
- 617
- Publication Date:
- 2013-02-21
- Subjects:
- Plant biotechnology -- Periodicals
Plant genetic engineering -- Periodicals
630.272
- Journal URLs:
- http://onlinelibrary.wiley.com/journal/10.1111/(ISSN)1467-7652
http://www.blackwell-synergy.com/servlet/useragent?func=showIssues&code=pbi
http://www.blackwellpublishing.com/journal.asp?ref=1467-7644
http://onlinelibrary.wiley.com/
- DOI:
- 10.1111/pbi.12051
- Languages:
- English
- ISSNs:
- 1467-7644
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 6513.780000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 3947.xml