Application health monitoring for extreme‐scale resiliency using cooperative fault management. (25th July 2019)