LINKS: Scalable, alignment-free scaffolding of draft genomes with long reads. Issue 1 (December 2015)
- Record Type:
- Journal Article
- Title:
- LINKS: Scalable, alignment-free scaffolding of draft genomes with long reads. Issue 1 (December 2015)
- Main Title:
- LINKS: Scalable, alignment-free scaffolding of draft genomes with long reads
- Authors:
- Warren, René
Yang, Chen
Vandervalk, Benjamin
Behsaz, Bahar
Lagman, Albert
Jones, Steven
Birol, Inanç
- Abstract:
- Background: Owing to the complexity of the assembly problem, we do not yet have complete genome sequences. The difficulty in assembling reads into finished genomes is exacerbated by sequence repeats and the inability of short reads to capture sufficient genomic information to resolve those problematic regions. In this regard, established and emerging long read technologies show great promise, but their current associated higher error rates typically require computational base correction and/or additional bioinformatics pre-processing before they can be of value.
Results: We present LINKS, the Long Interval Nucleotide K-mer Scaffolder algorithm, a method that makes use of the sequence properties of nanopore sequence data and other error-containing sequence data, to scaffold high-quality genome assemblies, without the need for read alignment or base correction. Here, we show how the contiguity of an ABySS Escherichia coli K-12 genome assembly can be increased greater than five-fold by the use of beta-released Oxford Nanopore Technologies Ltd. long reads and how LINKS leverages long-range information in Saccharomyces cerevisiae W303 nanopore reads to yield assemblies whose resulting contiguity and correctness are on par with or better than that of competing applications. We also present the re-scaffolding of the colossal white spruce (Picea glauca) draft assembly (PG29, 20 Gbp) and demonstrate how LINKS scales to larger genomes.
Conclusions: This study highlights the present utility of nanopore reads for genome scaffolding in spite of their current limitations, which are expected to diminish as the nanopore sequencing technology advances. We expect LINKS to have broad utility in harnessing the potential of long reads in connecting high-quality sequences of small and large genome assembly drafts.
- Is Part Of:
- GigaScience. Volume 4: Issue 1 (2015)
- Journal:
- GigaScience
- Issue:
- Volume 4: Issue 1 (2015)
- Issue Display:
- Volume 4, Issue 1 (2015)
- Year:
- 2015
- Volume:
- 4
- Issue:
- 1
- Issue Sort Value:
- 2015-0004-0001-0000
- Page Start:
- 1
- Page End:
- 11
- Publication Date:
- 2015-12
- Subjects:
- Nanopore sequencing -- Scaffolding -- Genome assembly -- Next-generation sequencing -- LINKS
Information storage and retrieval systems -- Research -- Periodicals
Biology -- Research -- Periodicals
Medical sciences -- Research -- Periodicals
Database management -- Periodicals
570.285
- Journal URLs:
- http://www.gigasciencejournal.com/
http://www.oxfordjournals.org/
- DOI:
- 10.1186/s13742-015-0076-3
- Languages:
- English
- ISSNs:
- 2047-217X
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 9917.xml