Non-Markovian effects on protein sequence evolution due to site dependent substitution rates. Issue 1 (December 2016)
- Record Type:
- Journal Article
- Title:
- Non-Markovian effects on protein sequence evolution due to site dependent substitution rates. Issue 1 (December 2016)
- Main Title:
- Non-Markovian effects on protein sequence evolution due to site dependent substitution rates
- Authors:
- Rizzato, Francesca
Rodriguez, Alex
Laio, Alessandro
- Abstract:
- Background: Many models of protein sequence evolution, in particular those based on Point Accepted Mutation (PAM) matrices, assume that its dynamics is Markovian. Nevertheless, it has been observed that evolution seems to proceed differently at different time scales, questioning this assumption. In 2011 Kosiol and Goldman proved that, if evolution is Markovian at the codon level, it can not be Markovian at the amino acid level. However, it remains unclear up to which point the Markov assumption is verified at the codon level.
Results: Here we show how also the among-site variability of substitution rates makes the process of full protein sequence evolution effectively not Markovian even at the codon level. This may be the theoretical explanation behind the well known systematic underestimation of evolutionary distances observed when omitting rate variability. If the substitution rate variability is neglected the average amino acid and codon replacement probabilities are affected by systematic errors and those with the largest mismatches are the substitutions involving more than one nucleotide at a time. On the other hand, the instantaneous substitution matrices estimated from alignments with the Markov assumption tend to overestimate double and triple substitutions, even when learned from alignments at high sequence identity.
Conclusions: These results discourage the use of simple Markov models to describe full protein sequence evolution and encourage to employ, whenever possible, models that account for rate variability by construction (such as hidden Markov models or mixture models) or substitution models of the type of Le and Gascuel (2008) that account for it explicitly.
- Is Part Of:
- BMC bioinformatics. Volume 17:Issue 1(2016)
- Journal:
- BMC bioinformatics
- Issue:
- Volume 17:Issue 1(2016)
- Issue Display:
- Volume 17, Issue 1 (2016)
- Year:
- 2016
- Volume:
- 17
- Issue:
- 1
- Issue Sort Value:
- 2016-0017-0001-0000
- Page Start:
- 1
- Page End:
- 9
- Publication Date:
- 2016-12
- Subjects:
- Non-Markovian evolution -- Amino acid substitution matrices -- Substitution rate variability -- Evolutionary distances -- Protein sequence evolution
Bioinformatics -- Periodicals
Computational biology -- Periodicals
570.285
- Journal URLs:
- http://www.biomedcentral.com/bmcbioinformatics/
http://www.pubmedcentral.nih.gov/tocrender.fcgi?journal=13
http://link.springer.com/
- DOI:
- 10.1186/s12859-016-1135-1
- Languages:
- English
- ISSNs:
- 1471-2105
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - Digital store
British Library HMNTS - ELD Digital store
- Ingest File:
- 9952.xml