Recovering single precision accuracy from Tensor Cores while surpassing the FP32 theoretical peak performance. (July 2022)