A role-changeable fault-tolerant management strategy towards resilient NoC-based manycore systems. Issue 12 (December 2015)
- Record Type:
- Journal Article
- Title:
- A role-changeable fault-tolerant management strategy towards resilient NoC-based manycore systems. Issue 12 (December 2015)
- Main Title:
- A role-changeable fault-tolerant management strategy towards resilient NoC-based manycore systems
- Authors:
- Wu, Zixu
Fu, Fangfa
Lu, Yu
Wang, Jinxiang
- Abstract:
- In hierarchically managed network-on-chip (NoC) based manycore systems with over one thousand processor cores, resource management is critical to efficient operation of the whole system. Meanwhile, the fault-tolerant design of the management structure is as important as the fault-tolerant design in the micro-structural level and the system level, which should require more attention towards constructing resilient manycore systems. This paper presents RCFTM, a hierarchical agent-based role-changeable fault-tolerant management strategy based on the modified role-changeable management framework for resilient NoC-based manycore systems. It is a distributed and adaptive strategy which can reconstruct the management hierarchy dynamically and it is capable of providing an intrinsic ability to tolerate various damages to the management hierarchy due to permanent core faults. The fault-tolerant capability of the RCFTM strategy is first evaluated with MATLAB for a large scale manycore system. Results show that the RCFTM strategy has better fault-tolerant capability than the conventional replacement strategy which relies only on backup cores. Then, the RCFTM strategy is implemented in C on a 5×5 NoC-based manycore system, where a full-system simulation platform is utilized. Experiments show that, with 20 K bytes ROM and 35.6 K bytes RAM footprint for the RCFTM strategy on each agent, the system can successfully tolerate both single and multiple core faults. Results also show that the RCFTM strategy only introduces less than 1.48% computing overhead of a working agent while the system is fault-free. …
- Is Part Of:
- Microelectronics journal. Volume 46:Issue 12 Part A (2015)
- Journal:
- Microelectronics journal
- Issue:
- Volume 46:Issue 12 Part A (2015)
- Issue Display:
- Volume 46, Issue 12 (2015)
- Year:
- 2015
- Volume:
- 46
- Issue:
- 12
- Issue Sort Value:
- 2015-0046-0012-0000
- Page Start:
- 1371
- Page End:
- 1379
- Publication Date:
- 2015-12
- Subjects:
- Fault-tolerant -- Management strategy -- Resilience -- Network-on-chip -- Manycore system
Microelectronics -- Periodicals
Microélectronique -- Périodiques
Microelectronics
Electronic journals
Journals - contents and abstracts
Periodicals
621.3805
- Journal URLs:
- http://catalog.hathitrust.org/api/volumes/oclc/5877621.html
http://www.sciencedirect.com/science/journal/00262692
http://www.intute.ac.uk/sciences/cgi-bin/fullrecord.pl?handle=lesa.1012319367
http://www.elsevier.com/journals
http://www.elsevier.com/homepage/elecserv.htt
- DOI:
- 10.1016/j.mejo.2015.09.010
- Languages:
- English
- ISSNs:
- 0959-8324
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 5758.973000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 1751.xml