Bakta: rapid and standardized annotation of bacterial genomes via alignment-free sequence identification. Issue 11 (5th November 2021)
- Record Type:
- Journal Article
- Title:
- Bakta: rapid and standardized annotation of bacterial genomes via alignment-free sequence identification. Issue 11 (5th November 2021)
- Main Title:
- Bakta: rapid and standardized annotation of bacterial genomes via alignment-free sequence identification
- Authors:
- Schwengers, Oliver
Jelonek, Lukas
Dieckmann, Marius Alfred
Beyvers, Sebastian
Blom, Jochen
Goesmann, Alexander
- Abstract:
- Command-line annotation software tools have continuously gained popularity compared to centralized online services due to the worldwide increase of sequenced bacterial genomes. However, results of existing command-line software pipelines heavily depend on taxon-specific databases or sufficiently well annotated reference genomes. Here, we introduce Bakta, a new command-line software tool for the robust, taxon-independent, thorough and, nonetheless, fast annotation of bacterial genomes. Bakta conducts a comprehensive annotation workflow including the detection of small proteins taking into account replicon metadata. The annotation of coding sequences is accelerated via an alignment-free sequence identification approach that in addition facilitates the precise assignment of public database cross-references. Annotation results are exported in GFF3 and International Nucleotide Sequence Database Collaboration (INSDC)-compliant flat files, as well as comprehensive JSON files, facilitating automated downstream analysis. We compared Bakta to other rapid contemporary command-line annotation software tools in both targeted and taxonomically broad benchmarks including isolates and metagenomic-assembled genomes. We demonstrated that Bakta outperforms other tools in terms of functional annotations, the assignment of functional categories and database cross-references, whilst providing comparable wall-clock runtimes. Bakta is implemented in Python 3 and runs on MacOS and Linux systems. It is freely available under a GPLv3 license at https://github.com/oschwengers/bakta. An accompanying web version is available at https://bakta.computational.bio.
- Is Part Of:
- Microbial genomics. Volume 7:Issue 11(2021)
- Journal:
- Microbial genomics
- Issue:
- Volume 7:Issue 11(2021)
- Issue Display:
- Volume 7, Issue 11 (2021)
- Year:
- 2021
- Volume:
- 7
- Issue:
- 11
- Issue Sort Value:
- 2021-0007-0011-0000
- Page Start:
- Page End:
- Publication Date:
- 2021-11-05
- Subjects:
- bacteria -- genome annotation -- metagenome-assembled genomes -- plasmids -- whole-genome sequencing
Microbial genomics -- Periodicals
572.8629
- Journal URLs:
- https://www.microbiologyresearch.org/content/journal/mgen
- DOI:
- 10.1099/mgen.0.000685
- Languages:
- English
- ISSNs:
- 2057-5858
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library HMNTS - ELD Digital store
- Ingest File:
- 19690.xml