PASTA with many application-aware optimization criteria for alignment based phylogeny inference. (June 2022)
- Record Type:
- Journal Article
- Title:
- PASTA with many application-aware optimization criteria for alignment based phylogeny inference. (June 2022)
- Main Title:
- PASTA with many application-aware optimization criteria for alignment based phylogeny inference
- Authors:
- Nayeem, Muhammad Ali
Bayzid, Md. Shamsuzzoha
Samudro, Naser Anjum
Rahman, M. Saifur
Rahman, M. Sohel
- Abstract:
- Multiple sequence alignment (MSA) is a prerequisite for several analyses in bioinformatics, such as phylogeny estimation and protein structure prediction. PASTA (Practical Alignments using SATé and TrAnsitivity) is a state-of-the-art method for computing MSAs, well known for its accuracy and scalability. It iteratively co-estimates both the MSA and the maximum likelihood (ML) phylogenetic tree, attempting to exploit the close association between the accuracy of an MSA and that of the corresponding tree while finding the output through multiple iterations from both directions. Currently, PASTA uses the ML score as its optimization criterion, which is a good score in phylogeny estimation but cannot be proven to be a necessary and sufficient criterion for producing an accurate phylogenetic tree. Therefore, integrating multiple application-aware objectives into PASTA, carefully chosen for their better association with tree accuracy, may have a profound positive impact on its performance. This paper employs four application-aware objectives alongside the ML score to develop a multi-objective (MO) framework, named PMAO, that leverages PASTA to generate a set of high-quality solutions that are considered equivalent in the context of the conflicting objectives under consideration. Our experimental analysis on a popular biological benchmark reveals that the tree-space generated by PMAO contains significantly better trees than stand-alone PASTA produces. To help domain experts further in choosing the most appropriate tree from the PMAO output (which contains a relatively large set of high-quality solutions), we have added a component within the PMAO framework that is capable of generating a smaller set of high-quality solutions. Finally, we have attempted to obtain a single high-quality solution without using any external evidence and have found that summarizing the few solutions detected through the above component can serve this purpose to some extent.
Highlights: The PMAO framework integrates many application-aware objectives into PASTA through multi-objective optimization for better phylogeny estimation. We innovatively employ supervised machine learning as well as some simple criteria within the PMAO framework to assist the domain expert. We experiment with summarizing the PMAO output trees to obtain a single high-quality solution without using any external evidence.
- Is Part Of:
- Computational biology and chemistry. Volume 98(2022)
- Journal:
- Computational biology and chemistry
- Issue:
- Volume 98(2022)
- Issue Display:
- Volume 98, Issue 2022 (2022)
- Year:
- 2022
- Volume:
- 98
- Issue:
- 2022
- Issue Sort Value:
- 2022-0098-2022-0000
- Page Start:
- Page End:
- Publication Date:
- 2022-06
- Subjects:
- Multiple sequence alignment -- Phylogenetic tree -- Multi-objective optimization
Chemistry -- Data processing -- Periodicals
Biology -- Data processing -- Periodicals
Biochemistry -- Data processing
Biology -- Data processing
Molecular biology -- Data processing
Periodicals
Electronic journals
542.85
- Journal URLs:
- http://www.sciencedirect.com/science/journal/14769271
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.compbiolchem.2022.107661
- Languages:
- English
- ISSNs:
- 1476-9271
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3390.576700
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 21569.xml