A runtime fault survival method for deployed software during production runs. Issue 2 (18th January 2016)
- Record Type:
- Journal Article
- Title:
- A runtime fault survival method for deployed software during production runs. Issue 2 (18th January 2016)
- Main Title:
- A runtime fault survival method for deployed software during production runs
- Authors:
- Seo, Jooyoung
Park, Jihyun
Choi, Byoungju
- Abstract:
- Runtime memory faults during production runs should be addressed more thoroughly because they severely affect system availability. This paper proposes a method for mitigating memory faults during production runs of deployed software, thereby ensuring normal system operation until patches to fix the faults are delivered. Furthermore, the method helps enhance debugging efficiency by providing accurate on‐site fault information that developers can use to release timely patches. The core of the method is information tagging to identify runtime faults and a fault survival algorithm that provides differentiated fault mitigation according to the runtime state. We implemented RemOte runtime Protection for High‐risk Error (ROPHE) on a Linux 2.6 platform and conducted an empirical study of representative Linux applications. The results show that the average fault‐handling rate among the applications is 35.75%, whereas ROPHE greatly improves this capacity, to an average of 91.94%. Specifically, the fault‐handling rates of the applications ranged widely from 7.32% to 62.96%, while ROPHE provided fault‐survival rates in the relatively narrow range of 82.35–97.44%. The experimental results show that the proposed method guarantees the same level of reliability for all applications regardless of their individual fault‐handling capacity. Copyright © 2016 John Wiley & Sons, Ltd.
- Is Part Of:
- Journal of software. Volume 28:Issue 2(2016)
- Journal:
- Journal of software
- Issue:
- Volume 28:Issue 2(2016)
- Issue Display:
- Volume 28, Issue 2 (2016)
- Year:
- 2016
- Volume:
- 28
- Issue:
- 2
- Issue Sort Value:
- 2016-0028-0002-0000
- Page Start:
- 97
- Page End:
- 119
- Publication Date:
- 2016-01-18
- Subjects:
- deployed software reliability -- fault survival -- fault mitigation -- runtime memory fault
Software engineering -- Periodicals
Computer software -- Development -- Periodicals
Software maintenance -- Periodicals
005.1
- Journal URLs:
- http://onlinelibrary.wiley.com/journal/10.1002/(ISSN)2047-7481
http://onlinelibrary.wiley.com/
- DOI:
- 10.1002/smr.1767
- Languages:
- English
- ISSNs:
- 2047-7473
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 1556.xml