Optimization of quasi-diagonal matrix–vector multiplication on GPU. (May 2014)