A novel fast multiple nucleotide sequence alignment method based on FM-index. Issue 1 (10th December 2021)
- Record Type:
- Journal Article
- Title:
- A novel fast multiple nucleotide sequence alignment method based on FM-index. Issue 1 (10th December 2021)
- Main Title:
- A novel fast multiple nucleotide sequence alignment method based on FM-index
- Authors:
- Liu, Huan
Zou, Quan
Xu, Yun
- Abstract:
- Multiple sequence alignment (MSA) is fundamental to many biological applications, but most classical MSA algorithms have difficulty handling large-scale sets of sequences, especially long sequences. Some recent aligners therefore adopt an efficient divide-and-conquer strategy that divides long sequences into several short sub-sequences. Selecting the common segments (i.e. anchors) at which to divide the sequences is critical, as it directly affects accuracy and time cost. We therefore propose a novel algorithm, FMAlign, to improve the performance of multiple nucleotide sequence alignment. We use an FM-index to extract long common segments at low cost, rather than a space-consuming hash table. Moreover, after finding the longer optimal common segments, the sequences are divided at those segments. FMAlign has been tested on virus and bacteria genome and human mitochondrial genome datasets, and compared with existing MSA methods such as MAFFT, HAlign and FAME. The experiments show that our method outperforms the existing methods in running time and achieves high accuracy on long sequence sets. The results demonstrate that our method is applicable to large-scale nucleotide sequence sets in terms of both sequence length and sequence number. The source code and related data are available at https://github.com/iliuh/FMAlign.
- Is Part Of:
- Briefings in bioinformatics. Volume 23:Issue 1(2022)
- Journal:
- Briefings in bioinformatics
- Issue:
- Volume 23:Issue 1(2022)
- Issue Display:
- Volume 23, Issue 1 (2022)
- Year:
- 2022
- Volume:
- 23
- Issue:
- 1
- Issue Sort Value:
- 2022-0023-0001-0000
- Page Start:
- Page End:
- Publication Date:
- 2021-12-10
- Subjects:
- multiple sequence alignment -- divide-and-conquer strategy -- FM-index -- common segments -- dividing sequences
Genetics -- Data processing -- Periodicals
Molecular biology -- Data processing -- Periodicals
Genomes -- Data processing -- Periodicals
572.80285
- Journal URLs:
- http://bib.oxfordjournals.org
http://www.oxfordjournals.org/content?genre=journal&issn=1477-4054
http://ukcatalogue.oup.com/
http://firstsearch.oclc.org
- DOI:
- 10.1093/bib/bbab519
- Languages:
- English
- ISSNs:
- 1467-5463
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 2283.958363
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 20639.xml
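
Illustrative note on the abstract: the abstract says that common segments are located with an FM-index rather than a hash table. The Python sketch below shows the general FM-index backward-search technique on a single toy sequence. It is not taken from the FMAlign source code. The function names (build_fm_index, find_occurrences), the "$" sentinel and the naive suffix sorting are assumptions made for illustration; a practical aligner would use a linear-time suffix array construction and a sampled occurrence table.

```python
def build_fm_index(text):
    """Build a toy FM-index for text: suffix array, C table and Occ table."""
    # Sentinel "$" is assumed absent from text; it sorts before A, C, G, T.
    text += "$"
    # Naive suffix array construction, acceptable only for short examples.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # Burrows-Wheeler transform: the character preceding each sorted suffix.
    bwt = "".join(text[i - 1] for i in sa)
    alphabet = sorted(set(text))
    # c_table[ch]: number of characters in text strictly smaller than ch.
    c_table, total = {}, 0
    for ch in alphabet:
        c_table[ch] = total
        total += text.count(ch)
    # occ[ch][i]: number of occurrences of ch in bwt[:i].
    occ = {ch: [0] * (len(bwt) + 1) for ch in alphabet}
    for i, b in enumerate(bwt):
        for ch in alphabet:
            occ[ch][i + 1] = occ[ch][i] + (1 if b == ch else 0)
    return sa, c_table, occ


def find_occurrences(index, pattern):
    """Return sorted start positions of pattern, found by backward search."""
    sa, c_table, occ = index
    lo, hi = 0, len(sa)  # half-open range of suffix array rows
    for ch in reversed(pattern):
        if ch not in c_table:
            return []
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])


index = build_fm_index("ACGTACGTTACG")
print(find_occurrences(index, "ACG"))
print(find_occurrences(index, "TT"))
```

Each pattern character costs two table lookups, so a query takes time proportional to the pattern length regardless of the length of the indexed sequence. A segment that occurs in the indexed sequence and in the other input sequences is a candidate anchor in the sense used in the abstract.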