Fast-SG: an alignment-free algorithm for hybrid assembly. Issue 5 (5th May 2018)
- Record Type:
- Journal Article
- Title:
- Fast-SG: an alignment-free algorithm for hybrid assembly. Issue 5 (5th May 2018)
- Main Title:
- Fast-SG: an alignment-free algorithm for hybrid assembly
- Authors:
- Di Genova, Alex
Ruz, Gonzalo A
Sagot, Marie-France
Maass, Alejandro - Abstract:
- Abstract: Background: Long-read sequencing technologies are the ultimate solution for genome repeats, allowing near reference-level reconstructions of large genomes. However, long-read de novo assembly pipelines are computationally intense and require a considerable amount of coverage, thereby hindering their broad application to the assembly of large genomes. Alternatively, hybrid assembly methods that combine short- and long-read sequencing technologies can reduce the time and cost required to produce de novo assemblies of large genomes. Results: Here, we propose a new method, called Fast -SG, that uses a new ultrafast alignment-free algorithm specifically designed for constructing a scaffolding graph using light-weight data structures. Fast -SG can construct the graph from either short or long reads. This allows the reuse of efficient algorithms designed for short-read data and permits the definition of novel modular hybrid assembly pipelines. Using comprehensive standard datasets and benchmarks, we show how Fast -SG outperforms the state-of-the-art short-read aligners when building the scaffoldinggraph and can be used to extract linking information from either raw or error-corrected long reads. We also show how a hybrid assembly approach using Fast -SG with shallow long-read coverage (5X) and moderate computational resources can produce long-range and accurate reconstructions of the genomes of Arabidopsis thaliana (Ler-0) and human (NA12878). Conclusions: Fast -SG opensAbstract: Background: Long-read sequencing technologies are the ultimate solution for genome repeats, allowing near reference-level reconstructions of large genomes. However, long-read de novo assembly pipelines are computationally intense and require a considerable amount of coverage, thereby hindering their broad application to the assembly of large genomes. Alternatively, hybrid assembly methods that combine short- and long-read sequencing technologies can reduce the time and cost required to produce de novo assemblies of large genomes. Results: Here, we propose a new method, called Fast -SG, that uses a new ultrafast alignment-free algorithm specifically designed for constructing a scaffolding graph using light-weight data structures. Fast -SG can construct the graph from either short or long reads. This allows the reuse of efficient algorithms designed for short-read data and permits the definition of novel modular hybrid assembly pipelines. Using comprehensive standard datasets and benchmarks, we show how Fast -SG outperforms the state-of-the-art short-read aligners when building the scaffoldinggraph and can be used to extract linking information from either raw or error-corrected long reads. We also show how a hybrid assembly approach using Fast -SG with shallow long-read coverage (5X) and moderate computational resources can produce long-range and accurate reconstructions of the genomes of Arabidopsis thaliana (Ler-0) and human (NA12878). Conclusions: Fast -SG opens a door to achieve accurate hybrid long-range reconstructions of large genomes with low effort, high portability, and low cost. … (more)
- Is Part Of:
- GigaScience. Volume 7:Issue 5(2018)
- Journal:
- GigaScience
- Issue:
- Volume 7:Issue 5(2018)
- Issue Display:
- Volume 7, Issue 5 (2018)
- Year:
- 2018
- Volume:
- 7
- Issue:
- 5
- Issue Sort Value:
- 2018-0007-0005-0000
- Page Start:
- Page End:
- Publication Date:
- 2018-05-05
- Subjects:
- hybrid assembly -- genome scaffolding -- alignment-free
Information storage and retrieval systems -- Research -- Periodicals
Biology -- Research -- Periodicals
Medical sciences -- Research -- Periodicals
Database management -- Periodicals
570.285 - Journal URLs:
- http://www.gigasciencejournal.com/ ↗
http://www.oxfordjournals.org/ ↗ - DOI:
- 10.1093/gigascience/giy048 ↗
- Languages:
- English
- ISSNs:
- 2047-217X
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms) ↗
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store - Ingest File:
- 12216.xml