A new approach for accurate distributed cluster analysis for Big Data: competitive K-Means. (1st January 2014)
- Record Type:
- Journal Article
- Title:
- A new approach for accurate distributed cluster analysis for Big Data: competitive K-Means. (1st January 2014)
- Main Title:
- A new approach for accurate distributed cluster analysis for Big Data: competitive K-Means
- Authors:
- Esteves, Rui Máximo
Hacker, Thomas
Rong, Chunming
- Abstract:
- The tremendous growth in data volumes has created a need for new tools and algorithms to quickly analyse large datasets. Cluster analysis techniques, such as K-Means, can be distributed across several machines. The accuracy of K-Means depends on the selection of seed centroids during initialisation. K-Means++ improves on the K-Means seeder, but suffers from problems when it is applied to large datasets. In this paper, we describe a new algorithm, and a MapReduce implementation of it, that addresses these problems. We compared its performance with three existing algorithms and found that our algorithm improves cluster analysis accuracy and decreases variance. Our results show that our new algorithm produced a speedup of 76 ± 9 times compared with the serial K-Means++ and is as fast as the streaming K-Means. Our work provides a method to select a good initial seeding in less time, facilitating fast, accurate cluster analysis over large datasets.
- Is Part Of:
- International journal of big data intelligence. Volume 1:Number 1/2(2014)
- Journal:
- International journal of big data intelligence
- Issue:
- Volume 1:Number 1/2(2014)
- Issue Display:
- Volume 1, Issue 1/2 (2014)
- Year:
- 2014
- Volume:
- 1
- Issue:
- 1/2
- Issue Sort Value:
- 2014-0001-NaN-0000
- Page Start:
- 50
- Page End:
- 64
- Publication Date:
- 2014-01-01
- Subjects:
- K-Means -- K-Means++ -- streaming K-Means -- SK-Means -- MapReduce
Big data -- Periodicals
005.705
- Journal URLs:
- http://www.inderscience.com/jhome.php?jcode=ijbdi
http://www.inderscience.com/
- Languages:
- English
- ISSNs:
- 2053-1389
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 7308.xml