Algorithms for all‐pairs Hamming distance based similarity. (19th April 2021)
- Record Type:
- Journal Article
- Title:
- Algorithms for all‐pairs Hamming distance based similarity. (19th April 2021)
- Main Title:
- Algorithms for all‐pairs Hamming distance based similarity
- Authors:
- Grabowski, Szymon
Kowalski, Tomasz M. - Abstract:
- Abstract: All‐pairs distance computation for a collection of strings is a computation‐intensive task with important applications in bioinformatics, in particular, in distance‐based phylogenetic analysis techniques. Even if the computationally efficient Hamming distance is used for this purpose, the quadratic number of sequence pairs may be challenging. We propose a number of practical algorithms for efficient pairwise Hamming distance computation under a given distance threshold. The techniques are based on such concepts as pivot‐based similarity search in metric spaces, pigeonhole principle for approximate string matching, cache‐friendly data arrangement, bit‐parallelism, and others. We experimentally show that our solutions are often about an order of magnitude faster than the average‐case linear‐time LCP based clusters method proposed recently, both in real and synthetic benchmarks.
- Is Part Of:
- Software, practice & experience. Volume 51:Number 7(2021)
- Journal:
- Software, practice & experience
- Issue:
- Volume 51:Number 7(2021)
- Issue Display:
- Volume 51, Issue 7 (2021)
- Year:
- 2021
- Volume:
- 51
- Issue:
- 7
- Issue Sort Value:
- 2021-0051-0007-0000
- Page Start:
- 1580
- Page End:
- 1590
- Publication Date:
- 2021-04-19
- Subjects:
- algorithms -- computational biology -- Hamming distance -- phylogenetics -- similarity search
Computer software -- Periodicals
Computer programming -- Periodicals
Computer programs -- Periodicals
005.3 - Journal URLs:
- http://onlinelibrary.wiley.com/ ↗
- DOI:
- 10.1002/spe.2978 ↗
- Languages:
- English
- ISSNs:
- 0038-0644
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms) ↗
- Physical Locations:
- British Library DSC - 8321.453000
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store - Ingest File:
- 17224.xml