Accurate consistency-based MSA reducing the memory footprint. (September 2021)
- Record Type:
- Journal Article
- Title:
- Accurate consistency-based MSA reducing the memory footprint. (September 2021)
- Main Title:
- Accurate consistency-based MSA reducing the memory footprint
- Authors:
- Lladós, Jordi
Cores, Fernando
Guirado, Fernando
Lérida, Josep L.
- Abstract:
- Highlights: Consistency-based methods require high computational resources to run on a workstation. The accuracy of the alignment depends mostly on the constraints of the consistency library. Library reductions enhance the scalability of consistency-based methods. Combining diverse multiple sequence alignment algorithms can improve the final alignment accuracy. Matrix-based T-Coffee allows much more sequences to be aligned with the same memory footprint.
Abstract: Background and Objective: The emergence of Next-Generation sequencing has created a push for faster and more accurate multiple sequence alignment tools. The growing number of sequences and their longer sizes, which require the use of increased system resources and produce less accurate results, are heavily challenging to these applications. Consistency-based methods have the most intensive CPU and memory usage requirements. We hypothesize that library reductions can enhance the scalability and performance of consistency-based multiple sequence alignment tools; however, we have previously shown a noticeable impact on the accuracy when extreme reductions were performed. Methods: In this study, we propose Matrix-Based T-Coffee, a consistency-based method that uses library reductions in conjunction with a complementary objective function. The proposed method, implemented in T-Coffee, can mitigate the accuracy loss caused by low memory resources. Results: The use of a complementary objective function with a library reduction of ≥ 30% improved the accuracy of T-Coffee. Interestingly, ≥ 50% library reduction achieved lower execution times and better overall scalability. Conclusions: Matrix-Based T-Coffee benefits from accurate alignments while achieving better scalability. This leads to a reduction in memory footprint and execution time. In addition, these enhancements could be applied to other aligners based on consistency.
- Is Part Of:
- Computer methods and programs in biomedicine. Volume 208(2021)
- Journal:
- Computer methods and programs in biomedicine
- Issue:
- Volume 208(2021)
- Issue Display:
- Volume 208, Issue 2021 (2021)
- Year:
- 2021
- Volume:
- 208
- Issue:
- 2021
- Issue Sort Value:
- 2021-0208-2021-0000
- Page Start:
- Page End:
- Publication Date:
- 2021-09
- Subjects:
- Multiple sequence alignment -- Consistency -- T-coffee -- Dynamic programming
Medicine -- Computer programs -- Periodicals
Biology -- Computer programs -- Periodicals
Computers -- Periodicals
Medicine -- Periodicals
Médecine -- Logiciels -- Périodiques
Biologie -- Logiciels -- Périodiques
Biology -- Computer programs
Medicine -- Computer programs
Periodicals
Electronic journals
610.28
- Journal URLs:
- http://www.sciencedirect.com/science/journal/01692607
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.cmpb.2021.106237
- Languages:
- English
- ISSNs:
- 0169-2607
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.095000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 18468.xml