PLANiTS: a curated sequence reference dataset for plant ITS DNA metabarcoding. (4th February 2020)
- Record Type:
- Journal Article
- Title:
- PLANiTS: a curated sequence reference dataset for plant ITS DNA metabarcoding. (4th February 2020)
- Main Title:
- PLANiTS: a curated sequence reference dataset for plant ITS DNA metabarcoding
- Authors:
- Banchi, Elisa
Ametrano, Claudio G
Greco, Samuele
Stanković, David
Muggia, Lucia
Pallavicini, Alberto - Abstract:
- Abstract: DNA metabarcoding combines DNA barcoding with high-throughput sequencing to identify different taxa within environmental communities. The ITS has already been proposed and widely used as universal barcode marker for plants, but a comprehensive, updated and accurate reference dataset of plant ITS sequences has not been available so far. Here, we constructed reference datasets of Viridiplantae ITS1, ITS2 and entire ITS sequences including both Chlorophyta and Streptophyta. The sequences were retrieved from NCBI, and the ITS region was extracted. The sequences underwent identity check to remove misidentified records and were clustered at 99% identity to reduce redundancy and computational effort. For this step, we developed a script called 'better clustering for QIIME' (bc4q) to ensure that the representative sequences are chosen according to the composition of the cluster at a different taxonomic level. The three datasets obtained with the bc4q script are PLANiTS1 (100 224 sequences), PLANiTS2 (96 771 sequences) and PLANiTS (97 550 sequences), and all are pre-formatted for QIIME, being this the most used bioinformatic pipeline for metabarcoding analysis. Being curated and updated reference databases, PLANiTS1, PLANiTS2 and PLANiTS are proposed as a reliable, pivotal first step for a general standardization of plant DNA metabarcoding studies. The bc4q script is presented as a new tool useful in each research dealing with sequences clustering. Database URL:Abstract: DNA metabarcoding combines DNA barcoding with high-throughput sequencing to identify different taxa within environmental communities. The ITS has already been proposed and widely used as universal barcode marker for plants, but a comprehensive, updated and accurate reference dataset of plant ITS sequences has not been available so far. Here, we constructed reference datasets of Viridiplantae ITS1, ITS2 and entire ITS sequences including both Chlorophyta and Streptophyta. The sequences were retrieved from NCBI, and the ITS region was extracted. The sequences underwent identity check to remove misidentified records and were clustered at 99% identity to reduce redundancy and computational effort. For this step, we developed a script called 'better clustering for QIIME' (bc4q) to ensure that the representative sequences are chosen according to the composition of the cluster at a different taxonomic level. The three datasets obtained with the bc4q script are PLANiTS1 (100 224 sequences), PLANiTS2 (96 771 sequences) and PLANiTS (97 550 sequences), and all are pre-formatted for QIIME, being this the most used bioinformatic pipeline for metabarcoding analysis. Being curated and updated reference databases, PLANiTS1, PLANiTS2 and PLANiTS are proposed as a reliable, pivotal first step for a general standardization of plant DNA metabarcoding studies. The bc4q script is presented as a new tool useful in each research dealing with sequences clustering. Database URL: https://github.com/apallavicini/bc4q ; https://github.com/apallavicini/PLANiTS . … (more)
- Is Part Of:
- Database. Volume 2020(2020)
- Journal:
- Database
- Issue:
- Volume 2020(2020)
- Issue Display:
- Volume 2020, Issue 2020 (2020)
- Year:
- 2020
- Volume:
- 2020
- Issue:
- 2020
- Issue Sort Value:
- 2020-2020-2020-0000
- Page Start:
- Page End:
- Publication Date:
- 2020-02-04
- Subjects:
- Biology -- Databases -- Periodicals
Bioinformatics -- Periodicals
570.285 - Journal URLs:
- http://database.oxfordjournals.org/ ↗
http://ukcatalogue.oup.com/ ↗ - DOI:
- 10.1093/database/baz155 ↗
- Languages:
- English
- ISSNs:
- 1758-0463
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms) ↗
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store - Ingest File:
- 26038.xml