An efficient implementation of kernel density estimation for multi-core and many-core architectures. (August 2015)