A data‐centric framework for debugging highly parallel applications. (21st November 2013)
- Record Type:
- Journal Article
- Title:
- A data‐centric framework for debugging highly parallel applications. (21st November 2013)
- Main Title:
- A data‐centric framework for debugging highly parallel applications
- Authors:
- Dinh, Minh Ngoc
Abramson, David
Jin, Chao
Gontarek, Andrew
Moench, Bob
DeRose, Luiz
- Abstract:
- Summary: Contemporary parallel debuggers allow users to control more than one processing thread while supporting the same examination and visualisation operations as sequential debuggers. This approach restricts the use of parallel debuggers for large‐scale scientific applications run across hundreds of thousands of compute cores. First, manually observing the runtime data to detect errors becomes impractical because the data is too big. Second, performing expensive but useful debugging operations becomes infeasible as the computational codes become more complex, involving larger data structures, and as the machines become larger. This study explores the idea of a data‐centric debugging approach, which could be used to make parallel debuggers more powerful. It discusses the use of ad hoc debug‐time assertions that allow a user to reason about the state of a parallel computation. These assertions support the verification and validation of program state at runtime as a whole rather than focusing on the state of a single process. Furthermore, the debugger's performance can be improved by exploiting the underlying parallel platform, because the available compute cores can execute parallel debugging functions while a program is idling at a breakpoint. We demonstrate the system with several case studies and evaluate the performance of the tool on a 20 000‐core Cray XE6. Copyright © 2013 John Wiley & Sons, Ltd.
- Is Part Of:
- Software, practice & experience. Volume 45: Number 4 (2015)
- Journal:
- Software, practice & experience
- Issue:
- Volume 45: Number 4 (2015)
- Issue Display:
- Volume 45, Issue 4 (2015)
- Year:
- 2015
- Volume:
- 45
- Issue:
- 4
- Issue Sort Value:
- 2015-0045-0004-0000
- Page Start:
- 501
- Page End:
- 526
- Publication Date:
- 2013-11-21
- Subjects:
- Computer software -- Periodicals
Computer programming -- Periodicals
Computer programs -- Periodicals
005.3
- Journal URLs:
- http://onlinelibrary.wiley.com/
- DOI:
- 10.1002/spe.2239
- Languages:
- English
- ISSNs:
- 0038-0644
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 8321.453000
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 3222.xml