Cross-validation to select Bayesian hierarchical models in phylogenetics. Issue 1 (December 2016)
- Record Type:
- Journal Article
- Title:
- Cross-validation to select Bayesian hierarchical models in phylogenetics. Issue 1 (December 2016)
- Main Title:
- Cross-validation to select Bayesian hierarchical models in phylogenetics
- Authors:
- Duchêne, Sebastián
Duchêne, David
Di Giallonardo, Francesca
Eden, John-Sebastian
Geoghegan, Jemma
Holt, Kathryn
Ho, Simon
Holmes, Edward
- Abstract:
- Background: Recent developments in Bayesian phylogenetic models have increased the range of inferences that can be drawn from molecular sequence data. Accordingly, model selection has become an important component of phylogenetic analysis. Methods of model selection generally consider the likelihood of the data under the model in question. In the context of Bayesian phylogenetics, the most common approach involves estimating the marginal likelihood, which is typically done by integrating the likelihood across model parameters, weighted by the prior. Although this method is accurate, it is sensitive to the presence of improper priors. We explored an alternative approach based on cross-validation that is widely used in evolutionary analysis. This involves comparing models according to their predictive performance. Results: We analysed simulated data and a range of viral and bacterial data sets using a cross-validation approach to compare a variety of molecular clock and demographic models. Our results show that cross-validation can be effective in distinguishing between strict- and relaxed-clock models and in identifying demographic models that allow growth in population size over time. In most of our empirical data analyses, the model selected using cross-validation was able to match that selected using marginal-likelihood estimation. The accuracy of cross-validation appears to improve with longer sequence data, particularly when distinguishing between relaxed-clock models. Conclusions: Cross-validation is a useful method for Bayesian phylogenetic model selection. This method can be readily implemented even when considering complex models where selecting an appropriate prior for all parameters may be difficult.
- Is Part Of:
- BMC evolutionary biology. Volume 16:Issue 1(2016)
- Journal:
- BMC evolutionary biology
- Issue:
- Volume 16:Issue 1(2016)
- Issue Display:
- Volume 16, Issue 1 (2016)
- Year:
- 2016
- Volume:
- 16
- Issue:
- 1
- Issue Sort Value:
- 2016-0016-0001-0000
- Page Start:
- 1
- Page End:
- 8
- Publication Date:
- 2016-12
- Subjects:
- Model selection -- Cross-validation -- Bayesian phylogenetics -- Molecular clock -- Demographic models -- Marginal likelihood
Evolution (Biology) -- Periodicals
576.805
- Journal URLs:
- http://www.biomedcentral.com/bmcevolbiol/
http://www.pubmedcentral.nih.gov/tocrender.fcgi?journal=28
http://link.springer.com/
- DOI:
- 10.1186/s12862-016-0688-y
- Languages:
- English
- ISSNs:
- 1471-2148
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 9984.xml