The parallelism motifs of genomic data analysis. (6th March 2020)
- Record Type:
- Journal Article
- Title:
- The parallelism motifs of genomic data analysis. (6th March 2020)
- Main Title:
- The parallelism motifs of genomic data analysis
- Authors:
- Yelick, Katherine
Buluç, Aydın
Awan, Muaaz
Azad, Ariful
Brock, Benjamin
Egan, Rob
Ekanayake, Saliya
Ellis, Marquita
Georganas, Evangelos
Guidi, Giulia
Hofmeyr, Steven
Selvitopi, Oguz
Teodoropol, Cristina
Oliker, Leonid
- Abstract:
- Genomic datasets are growing dramatically as the cost of sequencing continues to decline and small sequencing devices become available. Enormous community databases store and share these data with the research community, but some of these genomic data analysis problems require large-scale computational platforms to meet both the memory and computational requirements. These applications differ from scientific simulations that dominate the workload on high-end parallel systems today and place different requirements on programming support, software libraries and parallel architectural design. For example, they involve irregular communication patterns such as asynchronous updates to shared data structures. We consider several problems in high-performance genomics analysis, including alignment, profiling, clustering and assembly for both single genomes and metagenomes. We identify some of the common computational patterns or 'motifs' that help inform parallelization strategies and compare our motifs to some of the established lists, arguing that at least two key patterns, sorting and hashing, are missing. This article is part of a discussion meeting issue 'Numerical algorithms for high-performance computational science'.
- Is Part Of:
- Philosophical transactions. Volume 378: Number 2166 (2020)
- Journal:
- Philosophical transactions
- Issue:
- Volume 378: Number 2166 (2020)
- Issue Display:
- Volume 378, Issue 2166 (2020)
- Year:
- 2020
- Volume:
- 378
- Issue:
- 2166
- Issue Sort Value:
- 2020-0378-2166-0000
- Page Start:
- Page End:
- Publication Date:
- 2020-03-06
- Subjects:
- high-performance data analytics -- bioinformatics -- parallel computing
Physical sciences -- Periodicals
Engineering -- Periodicals
Mathematics -- Periodicals
500
- Journal URLs:
- https://royalsocietypublishing.org/loi/rsta
- DOI:
- 10.1098/rsta.2019.0394
- Languages:
- English
- ISSNs:
- 1364-503X
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library STI - ELD Digital store
- Ingest File:
- 25074.xml