Longer is Not Always Better: Optimizing Barcode Length for Large-Scale Species Discovery and Identification. (17th February 2020)
- Record Type:
- Journal Article
- Title:
- Longer is Not Always Better: Optimizing Barcode Length for Large-Scale Species Discovery and Identification. (17th February 2020)
- Main Title:
- Longer is Not Always Better: Optimizing Barcode Length for Large-Scale Species Discovery and Identification
- Authors:
- Yeo, Darren
Srivathsan, Amrita
Meier, Rudolf
- Editors:
- Bond, Jason
- Abstract:
- New techniques for the species-level sorting of millions of specimens are needed in order to accelerate species discovery, determine how many species live on earth, and develop efficient biomonitoring techniques. These sorting methods should be reliable, scalable, and cost-effective, as well as being largely insensitive to low-quality genomic DNA, given that this is usually all that can be obtained from museum specimens. Mini-barcodes seem to satisfy these criteria, but it is unclear how well they perform for species-level sorting when compared with full-length barcodes. This is here tested based on 20 empirical data sets covering ca. 30,000 specimens (5500 species) and six clade-specific data sets from GenBank covering ca. 98,000 specimens (>20,000 species). All specimens in these data sets had full-length barcodes and had been sorted to species-level based on morphology. Mini-barcodes of different lengths and positions were obtained in silico from full-length barcodes using a sliding window approach (three windows: 100 bp, 200 bp, and 300 bp) and by excising nine mini-barcodes with established primers (length: 94–407 bp). We then tested whether barcode length and/or position reduces species-level congruence between morphospecies and molecular operational taxonomic units (mOTUs) that were obtained using three different species delimitation techniques (Poisson Tree Process, Automatic Barcode Gap Discovery, and Objective Clustering). Surprisingly, we find no significant differences in performance for both species- or specimen-level identification between full-length and mini-barcodes as long as they are of moderate length (>200 bp). Only very short mini-barcodes (<200 bp) perform poorly, especially when they are located near the 5′ end of the Folmer region. The mean congruence between morphospecies and mOTUs was ca. 75% for barcodes >200 bp and the congruent mOTUs contain ca. 75% of all specimens. Most conflict is caused by ca. 10% of the specimens that can be identified and should be targeted for re-examination in order to efficiently resolve conflict. Our study suggests that large-scale species discovery, identification, and metabarcoding can utilize mini-barcodes without any demonstrable loss of information compared to full-length barcodes. [DNA barcoding; metabarcoding; mini-barcodes; species discovery.]
- Is Part Of:
- Systematic biology. Volume 69:Number 5(2020)
- Journal:
- Systematic biology
- Issue:
- Volume 69:Number 5(2020)
- Issue Display:
- Volume 69, Issue 5 (2020)
- Year:
- 2020
- Volume:
- 69
- Issue:
- 5
- Issue Sort Value:
- 2020-0069-0005-0000
- Page Start:
- 999
- Page End:
- 1015
- Publication Date:
- 2020-02-17
- Subjects:
- Biology -- Classification -- Periodicals
Biology -- Periodicals
Biologie -- Classification -- Périodiques
Biologie -- Périodiques
578.012
- Journal URLs:
- http://ukcatalogue.oup.com/
- DOI:
- 10.1093/sysbio/syaa014
- Languages:
- English
- ISSNs:
- 1063-5157
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 8589.180700
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 15299.xml