A distributed kernel summation framework for general‐dimension machine learning. (24th October 2013)
- Record Type:
- Journal Article
- Title:
- A distributed kernel summation framework for general‐dimension machine learning. (24th October 2013)
- Main Title:
- A distributed kernel summation framework for general‐dimension machine learning
- Authors:
- Lee, Dongryeol
Sao, Piyush
Vuduc, Richard
Gray, Alexander G.
- Abstract:
- Kernel summations are a ubiquitous key computational bottleneck in many data analysis methods. In this paper, we attempt to marry, for the first time, the best relevant techniques in parallel computing, where kernel summations are in low dimensions, with the best general‐dimension algorithms from the machine learning literature. We provide the first distributed implementation of a kernel summation framework that can utilize: (i) various types of deterministic and probabilistic approximations that may be suitable for low and high‐dimensional problems with a large number of data points; (ii) any multidimensional binary tree using both distributed memory and shared memory parallelism; and (iii) a dynamic load balancing scheme to adjust work imbalances during the computation. Our hybrid Message Passing Interface (MPI)/OpenMP codebase has wide applicability in providing a general framework to accelerate the computation of many popular machine learning methods. Our experiments show scalability results for kernel density estimation on a synthetic ten‐dimensional dataset containing over one billion points and on a subset of the Sloan Digital Sky Survey data, using up to 6144 cores. © 2013 Wiley Periodicals, Inc. Statistical Analysis and Data Mining, 2013
- Is Part Of:
- Statistical analysis and data mining. Volume 7:Number 1(2014)
- Journal:
- Statistical analysis and data mining
- Issue:
- Volume 7:Number 1(2014)
- Issue Display:
- Volume 7, Issue 1 (2014)
- Year:
- 2014
- Volume:
- 7
- Issue:
- 1
- Issue Sort Value:
- 2014-0007-0001-0000
- Page Start:
- 1
- Page End:
- 13
- Publication Date:
- 2013-10-24
- Subjects:
- Data mining -- Statistical methods -- Periodicals
006.312
- Journal URLs:
- http://www3.interscience.wiley.com/journal/112701062/home
http://onlinelibrary.wiley.com/
- DOI:
- 10.1002/sam.11207
- Languages:
- English
- ISSNs:
- 1932-1864
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 8447.424100
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 3305.xml