GC bias affects genomic and metagenomic reconstructions, underrepresenting GC-poor organisms. Issue 2 (13th February 2020)
- Record Type:
- Journal Article
- Title:
- GC bias affects genomic and metagenomic reconstructions, underrepresenting GC-poor organisms. Issue 2 (13th February 2020)
- Main Title:
- GC bias affects genomic and metagenomic reconstructions, underrepresenting GC-poor organisms
- Authors:
- Browne, Patrick Denis
Nielsen, Tue Kjærgaard
Kot, Witold
Aggerholm, Anni
Gilbert, M Thomas P
Puetz, Lara
Rasmussen, Morten
Zervas, Athanasios
Hansen, Lars Hestbjerg
- Abstract:
- Background: Metagenomic sequencing is a well-established tool in the modern biosciences. While it promises unparalleled insights into the genetic content of the biological samples studied, conclusions drawn are at risk from biases inherent to the DNA sequencing methods, including inaccurate abundance estimates as a function of genomic guanine-cytosine (GC) contents.
Results: We explored such GC biases across many commonly used platforms in experiments sequencing multiple genomes (with mean GC contents ranging from 28.9% to 62.4%) and metagenomes. GC bias profiles varied among different library preparation protocols and sequencing platforms. We found that our workflows using MiSeq and NextSeq were hindered by major GC biases, with problems becoming increasingly severe outside the 45–65% GC range, leading to a falsely low coverage in GC-rich and especially GC-poor sequences, where genomic windows with 30% GC content had >10-fold less coverage than windows close to 50% GC content. We also showed that GC content correlates tightly with coverage biases. The PacBio and HiSeq platforms also evidenced similar profiles of GC biases to each other, which were distinct from those seen in the MiSeq and NextSeq workflows. The Oxford Nanopore workflow was not afflicted by GC bias.
Conclusions: These findings indicate potential sources of difficulty, arising from GC biases, in genome sequencing that could be pre-emptively addressed with methodological optimizations provided that the GC biases inherent to the relevant workflow are understood. Furthermore, it is recommended that a more critical approach be taken in quantitative abundance estimates in metagenomic studies. In the future, metagenomic studies should take steps to account for the effects of GC bias before drawing conclusions, or they should use a demonstrably unbiased workflow.
- Is Part Of:
- GigaScience. Volume 9:Issue 2(2020)
- Journal:
- GigaScience
- Issue:
- Volume 9:Issue 2(2020)
- Issue Display:
- Volume 9, Issue 2 (2020)
- Year:
- 2020
- Volume:
- 9
- Issue:
- 2
- Issue Sort Value:
- 2020-0009-0002-0000
- Page Start:
- Page End:
- Publication Date:
- 2020-02-13
- Subjects:
- GC bias -- high-throughput sequencing -- metagenomics -- Illumina -- Oxford Nanopore -- PacBio
Information storage and retrieval systems -- Research -- Periodicals
Biology -- Research -- Periodicals
Medical sciences -- Research -- Periodicals
Database management -- Periodicals
570.285
- Journal URLs:
- http://www.gigasciencejournal.com/
http://www.oxfordjournals.org/
- DOI:
- 10.1093/gigascience/giaa008
- Languages:
- English
- ISSNs:
- 2047-217X
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 12906.xml