Assexon: Assembling Exon Using Gene Capture Data. (September 2019)
- Record Type:
- Journal Article
- Title:
- Assexon: Assembling Exon Using Gene Capture Data. (September 2019)
- Main Title:
- Assexon: Assembling Exon Using Gene Capture Data
- Authors:
- Yuan, Hao
Atta, Calder
Tornabene, Luke
Li, Chenhong
- Abstract:
- Exon capture across species has been one of the most broadly applied approaches to acquire multi-locus data in phylogenomic studies of non-model organisms. Methods for assembling loci from short-read sequences (eg, Illumina platforms) that rely on mapping reads to a reference genome may not be suitable for studies comprising species across a wide phylogenetic spectrum; thus, de novo assembling methods are more generally applied.
Current approaches for assembling targeted exons from short reads are not particularly optimized as they cannot (1) assemble loci with low read depth, (2) handle large files efficiently, and (3) reliably address issues with paralogs. Thus, we present Assexon: a streamlined pipeline that de novo assembles targeted exons and their flanking sequences from raw reads. We tested our method using reads from Lepisosteus osseus (4.37 Gb) and Boleophthalmus pectinirostris (2.43 Gb), which are captured using baits that were designed based on genome sequence of Lepisosteus oculatus and Oreochromis niloticus, respectively. We compared performance of Assexon to PHYLUCE and HybPiper, which are commonly used pipelines to assemble ultra-conserved element (UCE) and Hyb-seq data. A custom exon capture analysis pipeline (CP) developed by Yuan et al was compared as well. Assexon accurately assembled more than 3400 to 3800 (20%-28%) loci than PHYLUCE and more than 1900 to 2300 (8%-14%) loci than HybPiper across different levels of phylogenetic divergence. Assexon ran at least twice as fast as PHYLUCE and HybPiper. Number of loci assembled using CP was comparable with Assexon in both tests, while Assexon ran at least 7 times faster than CP. In addition, some steps of CP require the user's interaction and are not fully automated, and this user time was not counted in our calculation. Both Assexon and CP retrieved no paralogs in the testing runs, but PHYLUCE and HybPiper did. In conclusion, Assexon is a tool for accurate and efficient assembling of large read sets from exon capture experiments. Furthermore, Assexon includes scripts to filter poorly aligned coding regions and flanking regions, calculate summary statistics of loci, and select loci with reliable phylogenetic signal. Assexon is available at https://github.com/yhadevol/Assexon.
- Is Part Of:
- Evolutionary bioinformatics online. Volume 15(2019)
- Journal:
- Evolutionary bioinformatics online
- Issue:
- Volume 15(2019)
- Issue Display:
- Volume 15, Issue 2019 (2019)
- Year:
- 2019
- Volume:
- 15
- Issue:
- 2019
- Issue Sort Value:
- 2019-0015-2019-0000
- Page Start:
- Page End:
- Publication Date:
- 2019-09
- Subjects:
- Exon capture -- read assembly -- de novo assembly -- data filtering -- phylogenomics -- hybrid enrichment
Bioinformatics -- Periodicals
Evolutionary computation -- Periodicals
Genetic programming (Computer science) -- Periodicals
Computational Biology
Evolution, Molecular
Bioinformatics
Electronic journals
Periodicals
Fulltext
Internet Resources
576.8
- Journal URLs:
- http://insights.sagepub.com/journal-evolutionary-bioinformatics-j17
http://www.uk.sagepub.com/home.nav
http://www.la-press.com/evolutionary-bioinformatics-journal-j17
http://bibpurl.oclc.org/web/38943
- DOI:
- 10.1177/1176934319874792
- Languages:
- English
- ISSNs:
- 1176-9343
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 12121.xml