De novo assembly of the Indian blue peacock (Pavo cristatus) genome using Oxford Nanopore technology and Illumina sequencing. Issue 5 (11th May 2019)
- Record Type:
- Journal Article
- Title:
- De novo assembly of the Indian blue peacock (Pavo cristatus) genome using Oxford Nanopore technology and Illumina sequencing. Issue 5 (11th May 2019)
- Main Title:
- De novo assembly of the Indian blue peacock (Pavo cristatus) genome using Oxford Nanopore technology and Illumina sequencing
- Authors:
- Dhar, Ruby
Seethy, Ashikh
Pethusamy, Karthikeyan
Singh, Sunil
Rohil, Vishwajeet
Purkayastha, Kakali
Mukherjee, Indrani
Goswami, Sandeep
Singh, Rakesh
Raj, Ankita
Srivastava, Tryambak
Acharya, Sovon
Rajashekhar, Balaji
Karmakar, Subhradip
- Abstract:
- Background: The Indian peafowl (Pavo cristatus) is native to South Asia and is the national bird of India. Here we present a draft genome sequence of the male blue peacock using Illumina and Oxford Nanopore technology (ONT).
Results: ONT sequencing gave ∼2.3-fold sequencing coverage, whereas Illumina generated 150-base pair paired-end sequence data at 284.6-fold coverage from 5 libraries. Subsequently, we generated a 0.915-gigabase pair de novo assembly of the peacock genome with a scaffold N50 of 0.23 megabase pairs (Mb). We predict that the peacock genome contains 23,153 protein-coding genes and 75.3 Mb (7.33%) of repetitive sequences.
Conclusions: We report a high-quality assembly of the peacock genome using a hybrid approach of sequences generated by both Illumina and ONT. The long-read chemistry generated by ONT was useful for addressing challenges related to de novo assembly, particularly at regions containing repetitive sequences spanning longer than the read length, and which could not be resolved with only short-read-based assembly. Contig assembly of Illumina short reads gave an N50 of 1,639 bases, whereas with ONT, the N50 increased by >9-fold to 14,749 bases. The initial contig assembly based on Illumina sequencing reads alone gave 685,241 contigs. Further scaffolding on assembled contigs using both Illumina and ONT sequencing reads resulted in a final assembly of 15,025 super-scaffolds, with an N50 of ∼0.23 Mb. Ninety-five percent of proteins predicted by homology matched with those in a public repository, verifying the completeness of our assembly. Like other phylogenetic studies of avian conserved genes, we found P. cristatus to be most closely related to Gallus gallus, followed by Meleagris gallopavo and Anas platyrhynchos. Compared with the recently published peacock genome assembly, the current, superior, hybrid assembly has greater sequencing depth, fewer non-ATGC sequences, and fewer scaffolds.
- Is Part Of:
- GigaScience. Volume 8:Issue 5(2019)
- Journal:
- GigaScience
- Issue:
- Volume 8:Issue 5(2019)
- Issue Display:
- Volume 8, Issue 5 (2019)
- Year:
- 2019
- Volume:
- 8
- Issue:
- 5
- Issue Sort Value:
- 2019-0008-0005-0000
- Page Start:
- Page End:
- Publication Date:
- 2019-05-11
- Subjects:
- peacock -- Pavo cristatus -- Indian national bird -- genome assembly -- Oxford Nanopore
Information storage and retrieval systems -- Research -- Periodicals
Biology -- Research -- Periodicals
Medical sciences -- Research -- Periodicals
Database management -- Periodicals
570.285
- Journal URLs:
- http://www.gigasciencejournal.com/
http://www.oxfordjournals.org/
- DOI:
- 10.1093/gigascience/giz038
- Languages:
- English
- ISSNs:
- 2047-217X
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 11993.xml