Phylogenetic Clustering by Linear Integer Programming (PhyCLIP). (11th March 2019)
- Record Type:
- Journal Article
- Title:
- Phylogenetic Clustering by Linear Integer Programming (PhyCLIP). (11th March 2019)
- Main Title:
- Phylogenetic Clustering by Linear Integer Programming (PhyCLIP)
- Authors:
- Han, Alvin X
Parker, Edyth
Scholer, Frits
Maurer-Stroh, Sebastian
Russell, Colin A
- Editors:
- Shapiro, Beth
- Abstract:
- Subspecies nomenclature systems of pathogens are increasingly based on sequence data. The use of phylogenetics to identify and differentiate between clusters of genetically similar pathogens is particularly prevalent in virology from the nomenclature of human papillomaviruses to highly pathogenic avian influenza (HPAI) H5Nx viruses. These nomenclature systems rely on absolute genetic distance thresholds to define the maximum genetic divergence tolerated between viruses designated as closely related. However, the phylogenetic clustering methods used in these nomenclature systems are limited by the arbitrariness of setting intra and intercluster diversity thresholds. The lack of a consensus ground truth to define well-delineated, meaningful phylogenetic subpopulations amplifies the difficulties in identifying an informative distance threshold. Consequently, phylogenetic clustering often becomes an exploratory, ad hoc exercise. Phylogenetic Clustering by Linear Integer Programming (PhyCLIP) was developed to provide a statistically principled phylogenetic clustering framework that negates the need for an arbitrarily defined distance threshold. Using the pairwise patristic distance distributions of an input phylogeny, PhyCLIP parameterizes the intra and intercluster divergence limits as statistical bounds in an integer linear programming model which is subsequently optimized to cluster as many sequences as possible. When applied to the hemagglutinin phylogeny of HPAI H5Nx viruses, PhyCLIP was not only able to recapitulate the current WHO/OIE/FAO H5 nomenclature system but also further delineated informative higher resolution clusters that capture geographically distinct subpopulations of viruses. PhyCLIP is pathogen-agnostic and can be generalized to a wide variety of research questions concerning the identification of biologically informative clusters in pathogen phylogenies. PhyCLIP is freely available at http://github.com/alvinxhan/PhyCLIP, last accessed March 15, 2019.
- Is Part Of:
- Molecular biology and evolution. Volume 36:Number 7(2019)
- Journal:
- Molecular biology and evolution
- Issue:
- Volume 36:Number 7(2019)
- Issue Display:
- Volume 36, Issue 7 (2019)
- Year:
- 2019
- Volume:
- 36
- Issue:
- 7
- Issue Sort Value:
- 2019-0036-0007-0000
- Page Start:
- 1580
- Page End:
- 1595
- Publication Date:
- 2019-03-11
- Subjects:
- phylogenetic clustering -- molecular epidemiology -- influenza -- pathogen -- nomenclature
Molecular biology -- Periodicals
Molecular evolution -- Periodicals
Evolution, Molecular -- Periodicals
Molecular Biology -- Periodicals
572.8
- Journal URLs:
- http://mbe.oxfordjournals.org/
http://www.molbiolevol.org/
http://ukcatalogue.oup.com/
http://firstsearch.oclc.org
http://firstsearch.oclc.org/journal=0737-7038;screen=info;ECOIP
- DOI:
- 10.1093/molbev/msz053
- Languages:
- English
- ISSNs:
- 0737-4038
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 5900.782000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 12192.xml