Improvement of the banana "Musa acuminata" reference sequence using NGS data and semi-automated bioinformatics methods. (December 2016)
- Record Type:
- Journal Article
- Title:
- Improvement of the banana "Musa acuminata" reference sequence using NGS data and semi-automated bioinformatics methods. (December 2016)
- Main Title:
- Improvement of the banana "Musa acuminata" reference sequence using NGS data and semi-automated bioinformatics methods
- Authors:
- Martin, Guillaume
Baurens, Franc-Christophe
Droc, Gaëtan
Rouard, Mathieu
Cenci, Alberto
Kilian, Andrzej
Hastie, Alex
Doležel, Jaroslav
Aury, Jean-Marc
Alberti, Adriana
Carreel, Françoise
D'Hont, Angélique - Abstract:
- Abstract Background Recent advances in genomics indicate functional significance of a majority of genome sequences and their long range interactions. As a detailed examination of genome organization and function requires very high quality genome sequence, the objective of this study was to improve reference genome assembly of banana (Musa acuminata ). Results We have developed a modular bioinformatics pipeline to improve genome sequence assemblies, which can handle various types of data. The pipeline comprises several semi-automated tools. However, unlike classical automated tools that are based on global parameters, the semi-automated tools proposed an expert mode for a user who can decide on suggested improvements through local compromises. The pipeline was used to improve the draft genome sequence ofMusa acuminata. Genotyping by sequencing (GBS) of a segregating population and paired-end sequencing were used to detect and correct scaffold misassemblies. Long insert size paired-end reads identified scaffold junctions and fusions missed by automated assembly methods. GBS markers were used to anchor scaffolds to pseudo-molecules with a new bioinformatics approach that avoids the tedious step of marker ordering during genetic map construction. Furthermore, a genome map was constructed and used to assemble scaffolds into super scaffolds. Finally, a consensus gene annotation was projected on the new assembly from two pre-existing annotations. This approach reduced the totalMusaAbstract Background Recent advances in genomics indicate functional significance of a majority of genome sequences and their long range interactions. As a detailed examination of genome organization and function requires very high quality genome sequence, the objective of this study was to improve reference genome assembly of banana (Musa acuminata ). Results We have developed a modular bioinformatics pipeline to improve genome sequence assemblies, which can handle various types of data. The pipeline comprises several semi-automated tools. However, unlike classical automated tools that are based on global parameters, the semi-automated tools proposed an expert mode for a user who can decide on suggested improvements through local compromises. The pipeline was used to improve the draft genome sequence ofMusa acuminata. Genotyping by sequencing (GBS) of a segregating population and paired-end sequencing were used to detect and correct scaffold misassemblies. Long insert size paired-end reads identified scaffold junctions and fusions missed by automated assembly methods. GBS markers were used to anchor scaffolds to pseudo-molecules with a new bioinformatics approach that avoids the tedious step of marker ordering during genetic map construction. Furthermore, a genome map was constructed and used to assemble scaffolds into super scaffolds. Finally, a consensus gene annotation was projected on the new assembly from two pre-existing annotations. This approach reduced the totalMusa scaffold number from 7513 to 1532 (i.e. by 80 %), with an N50 that increased from 1.3 Mb (65 scaffolds) to 3.0 Mb (26 scaffolds). 89.5 % of the assembly was anchored to the 11Musa chromosomes compared to the previous 70 %. Unknown sites (N) were reduced from 17.3 to 10.0 %. Conclusion The release of theMusa acuminata reference genome version 2 provides a platform for detailed analysis of banana genome variation, function and evolution. Bioinformatics tools developed in this work can be used to improve genome sequence assemblies in other species. … (more)
- Is Part Of:
- BMC genomics. Volume 17:Number 1(2016)
- Journal:
- BMC genomics
- Issue:
- Volume 17:Number 1(2016)
- Issue Display:
- Volume 17, Issue 1 (2016)
- Year:
- 2016
- Volume:
- 17
- Issue:
- 1
- Issue Sort Value:
- 2016-0017-0001-0000
- Page Start:
- 1
- Page End:
- 12
- Publication Date:
- 2016-12
- Subjects:
- Musa acuminata -- Genome assembly -- Bioinformatics tool -- Paired-end sequences -- GBS -- Genome map
Genomes -- Periodicals
Gene mapping -- Periodicals
Genomics -- Periodicals
Base Sequence -- Periodicals
Chromosome Mapping -- Periodicals
Genetic Techniques -- Periodicals
Sequence Analysis, DNA -- Periodicals
572.8605 - Journal URLs:
- http://www.biomedcentral.com/bmcgenomics/ ↗
http://www.pubmedcentral.nih.gov/tocrender.fcgi?journal=32 ↗
http://link.springer.com/ ↗ - DOI:
- 10.1186/s12864-016-2579-4 ↗
- Languages:
- English
- ISSNs:
- 1471-2164
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms) ↗
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store - Ingest File:
- 9851.xml