A distributed framework for trimmed Kernel k-Means clustering. Issue 8 (August 2015)
- Record Type:
- Journal Article
- Title:
- A distributed framework for trimmed Kernel k-Means clustering. Issue 8 (August 2015)
- Main Title:
- A distributed framework for trimmed Kernel k-Means clustering
- Authors:
- Tsapanos, Nikolaos
Tefas, Anastasios
Nikolaidis, Nikolaos
Pitas, Ioannis - Abstract:
- <abstract abstract-type="author" id="ab0005"> <title id="sect0005">Abstract</title> <sec> <p id="sp0075">Data clustering is an unsupervised learning task that has found many applications in various scientific fields. The goal is to find subgroups of closely related data samples (clusters) in a set of unlabeled data. Kernel <italic>k</italic>-Means is a state of the art clustering algorithm. However, in contrast to clustering algorithms that can work using only a limited percentage of the data at a time, Kernel <italic>k</italic>-Means is a global clustering algorithm. It requires the computation of the kernel matrix, which takes <inline-formula><alternatives><inline-graphic xlink:href="ark:/27927/pgjjxkjt68" xlink:type="simple" xmlns:xlink="http://www.w3.org/1999/xlink" /><mml:math altimg="si0018.gif" overflow="scroll" id="d13e607" xmlns:mml="http://www.w3.org/1998/Math/MathML"><mml:mi>O</mml:mi><mml:mo>(</mml:mo><mml:msup><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mi>d</mml:mi><mml:mo>)</mml:mo></mml:math></alternatives></inline-formula> time and <inline-formula><alternatives><inline-graphic xlink:href="ark:/27927/pgjjxkm560" xlink:type="simple" xmlns:xlink="http://www.w3.org/1999/xlink" /><mml:math altimg="si0019.gif" overflow="scroll" id="d13e624"<abstract abstract-type="author" id="ab0005"> <title id="sect0005">Abstract</title> <sec> <p id="sp0075">Data clustering is an unsupervised learning task that has found many applications in various scientific fields. The goal is to find subgroups of closely related data samples (clusters) in a set of unlabeled data. Kernel <italic>k</italic>-Means is a state of the art clustering algorithm. However, in contrast to clustering algorithms that can work using only a limited percentage of the data at a time, Kernel <italic>k</italic>-Means is a global clustering algorithm. It requires the computation of the kernel matrix, which takes <inline-formula><alternatives><inline-graphic xlink:href="ark:/27927/pgjjxkjt68" xlink:type="simple" xmlns:xlink="http://www.w3.org/1999/xlink" /><mml:math altimg="si0018.gif" overflow="scroll" id="d13e607" xmlns:mml="http://www.w3.org/1998/Math/MathML"><mml:mi>O</mml:mi><mml:mo>(</mml:mo><mml:msup><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mi>d</mml:mi><mml:mo>)</mml:mo></mml:math></alternatives></inline-formula> time and <inline-formula><alternatives><inline-graphic xlink:href="ark:/27927/pgjjxkm560" xlink:type="simple" xmlns:xlink="http://www.w3.org/1999/xlink" /><mml:math altimg="si0019.gif" overflow="scroll" id="d13e624" xmlns:mml="http://www.w3.org/1998/Math/MathML"><mml:mi>O</mml:mi><mml:mo>(</mml:mo><mml:msup><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mo>)</mml:mo></mml:math></alternatives></inline-formula> space in memory. As datasets grow larger, the application of Kernel <italic>k</italic>-Means becomes infeasible on a single computer, a fact that strongly suggests a distributed approach. In this paper, we present such an approach to the Kernel <italic>k</italic>-Means clustering algorithm, in order to make its application to a large number of samples feasible and, thus, achieve high performance clustering results on very big datasets. Our distributed approach follows the MapReduce programming model and consists of 3 stages, the kernel matrix computation, a novel kernel matrix trimming method and the Kernel <italic>k</italic>-Means clustering algorithm.</p> </sec> </abstract> … (more)
- Is Part Of:
- Pattern recognition. Volume 48:Issue 8(2015:Aug.)
- Journal:
- Pattern recognition
- Issue:
- Volume 48:Issue 8(2015:Aug.)
- Issue Display:
- Volume 48, Issue 8 (2015)
- Year:
- 2015
- Volume:
- 48
- Issue:
- 8
- Issue Sort Value:
- 2015-0048-0008-0000
- Page Start:
- 2685
- Page End:
- 2698
- Publication Date:
- 2015-08
- Subjects:
- Pattern perception -- Periodicals
Perception des structures -- Périodiques
Patroonherkenning
006.4 - Journal URLs:
- http://www.sciencedirect.com/science/journal/00313203 ↗
http://www.sciencedirect.com/ ↗ - DOI:
- 10.1016/j.patcog.2015.02.020 ↗
- Languages:
- English
- ISSNs:
- 0031-3203
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms) ↗
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store - Ingest File:
- 3853.xml