A study of the viability of exploiting memory content similarity to improve resilience to memory errors. (February 2015)
- Record Type:
- Journal Article
- Title:
- A study of the viability of exploiting memory content similarity to improve resilience to memory errors. (February 2015)
- Main Title:
- A study of the viability of exploiting memory content similarity to improve resilience to memory errors
- Authors:
- Levy, Scott
Ferreira, Kurt B
Bridges, Patrick G
Thompson, Aidan P
Trott, Christian
- Abstract:
- Building the next generation of extreme-scale distributed systems will require overcoming several challenges related to system resilience. As the number of processors in these systems grows, the failure rate increases proportionally. One of the most common sources of failure in large-scale systems is memory. In this paper, we propose a novel runtime for transparently exploiting memory content similarity to improve system resilience by reducing the rate at which memory errors lead to node failure. We evaluate the viability of this approach by examining memory snapshots collected from eight high-performance computing (HPC) applications and two important HPC operating systems. Based on the characteristics of the similarity uncovered, we conclude that our proposed approach shows promise for addressing system resilience in large-scale systems.
- Is Part Of:
- International journal of high performance computing applications. Volume 29:Number 1(2015:Spring)
- Journal:
- International journal of high performance computing applications
- Issue:
- Volume 29:Number 1(2015:Spring)
- Issue Display:
- Volume 29, Issue 1 (2015)
- Year:
- 2015
- Volume:
- 29
- Issue:
- 1
- Issue Sort Value:
- 2015-0029-0001-0000
- Page Start:
- 5
- Page End:
- 20
- Publication Date:
- 2015-02
- Subjects:
- High-performance computing -- fault tolerance -- resilience
High performance computing -- Periodicals
Supercomputers -- Periodicals
004.1105
- Journal URLs:
- http://hpc.sagepub.com
http://www.uk.sagepub.com/home.nav
http://firstsearch.oclc.org
- DOI:
- 10.1177/1094342014560354
- Languages:
- English
- ISSNs:
- 1094-3420
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 6239.xml