SWIRL: High-performance many-core CPU code generation for deep neural networks. (November 2019)

Record Type:: Journal Article
Title:: SWIRL: High-performance many-core CPU code generation for deep neural networks. (November 2019)
Main Title:: SWIRL: High-performance many-core CPU code generation for deep neural networks
Authors:: Venkat, Anand
Rusira, Tharindu
Barik, Raj
Hall, Mary
Truong, Leonard
Other Names:: Dongarra, Jack, guest-editor.
Tourancheau, Bernard, guest-editor.
Abstract:: Deep neural networks (DNNs) have demonstrated effectiveness in many domains including object recognition, speech recognition, natural language processing, and health care. Typically, the computations involved in DNN training and inferencing are time consuming and require efficient implementations. Existing frameworks such as TensorFlow, Theano, Torch, Cognitive Toolkit (CNTK), and Caffe enable Graphics Processing Units (GPUs) as the status quo devices for DNN execution, leaving Central Processing Units (CPUs) behind. Moreover, existing frameworks forgo or limit cross-layer optimization opportunities that have the potential to improve performance by significantly reducing data movement through the memory hierarchy. In this article, we describe an alternative approach called SWIRL, a compiler that provides high-performance CPU implementations for DNNs. SWIRL is built on top of the existing domain-specific language (DSL) for DNNs called LATTE. SWIRL separates DNN specification and its schedule using predefined transformation recipes for tensors and layers commonly found in DNN layers. These recipes synergize with DSL constructs to generate high-quality fused, vectorized, and parallelized code for CPUs. On an Intel Xeon Platinum 8180M CPU, SWIRL achieves performance comparable with TensorFlow integrated with MKL-DNN; on average 1.00× of TensorFlow inference and 0.99× of TensorFlow training. It also outperforms the original LATTE compiler on average by 1.22× and 1.30× on …
Is Part Of:: International journal of high performance computing applications. Volume 33:Number 6(2019)
Journal:: International journal of high performance computing applications
Issue:: Volume 33:Number 6(2019)
Issue Display:: Volume 33, Issue 6 (2019)
Year:: 2019
Volume:: 33
Issue:: 6
Issue Sort Value:: 2019-0033-0006-0000
Page Start:: 1275
Page End:: 1289
Publication Date:: 2019-11
Subjects:: Compilers -- code generation -- optimization -- deep neural networks -- code transformations
High performance computing -- Periodicals
Supercomputers -- Periodicals
004.1105
Journal URLs:: http://hpc.sagepub.com
http://www.uk.sagepub.com/home.nav
http://firstsearch.oclc.org
DOI:: 10.1177/1094342019866247
Languages:: English
ISSNs:: 1094-3420
Deposit Type:: Legaldeposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
Ingest File:: 11258.xml