Fuzzy C-means++: Fuzzy C-means with effective seeding initialization. Issue 21 (30th November 2015)
- Record Type:
- Journal Article
- Title:
- Fuzzy C-means++: Fuzzy C-means with effective seeding initialization. Issue 21 (30th November 2015)
- Main Title:
- Fuzzy C-means++: Fuzzy C-means with effective seeding initialization
- Authors:
- Stetco, Adrian
Zeng, Xiao-Jun
Keane, John
- Abstract:
- Highlights: Paper proposes the Fuzzy C-means++ method for improving the effectiveness and speed of the Fuzzy C-means algorithm. This method works by spreading the initial cluster representatives in the data space at initialization. The proposed algorithm achieves superior results on both artificially generated and real world data sets.
Abstract: Fuzzy C-means has been utilized successfully in a wide range of applications, extending the clustering capability of the K-means to datasets that are uncertain, vague and otherwise hard to cluster. This paper introduces the Fuzzy C-means++ algorithm which, by utilizing the seeding mechanism of the K-means++ algorithm, improves the effectiveness and speed of Fuzzy C-means. By careful seeding that disperses the initial cluster centers through the data space, the resulting Fuzzy C-means++ approach samples starting cluster representatives during the initialization phase. The cluster representatives are well spread in the input space, resulting in both faster convergence times and higher quality solutions. Implementations in R of standard Fuzzy C-means and Fuzzy C-means++ are evaluated on various data sets. We investigate the cluster quality and iteration count as we vary the spreading factor on a series of synthetic data sets. We run the algorithm on real world data sets and to account for the non-determinism inherent in these algorithms we record multiple runs while choosing different k parameter values. The results show that the proposed method gives significant improvement in convergence times (the number of iterations) of up to 40 (2.1 on average) times the standard on synthetic datasets and, in general, an associated lower cost function value and Xie–Beni value. A proof sketch of the logarithmically bounded expected cost function value is given.
- Is Part Of:
- Expert systems with applications. Volume 42:Issue 21(2015)
- Journal:
- Expert systems with applications
- Issue:
- Volume 42:Issue 21(2015)
- Issue Display:
- Volume 42, Issue 21 (2015)
- Year:
- 2015
- Volume:
- 42
- Issue:
- 21
- Issue Sort Value:
- 2015-0042-0021-0000
- Page Start:
- 7541
- Page End:
- 7548
- Publication Date:
- 2015-11-30
- Subjects:
- Cluster analysis -- Fuzzy C-means clustering -- Initialization
Expert systems (Computer science) -- Periodicals
Systèmes experts (Informatique) -- Périodiques
Electronic journals
006.33
- Journal URLs:
- http://www.sciencedirect.com/science/journal/09574174
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.eswa.2015.05.014
- Languages:
- English
- ISSNs:
- 0957-4174
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3842.004220
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 12961.xml
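
Illustrative sketch (not part of the catalogue record): the abstract above describes seeding Fuzzy C-means with K-means++-style initial centers that are spread through the data space. The Python below is a minimal sketch of that idea, not the authors' R implementation. It assumes the "spreading factor" acts as an exponent p on each point's distance to its nearest already-chosen center (p = 2 gives the usual K-means++ weighting); the record does not state the exact formula. The function names seed_centers and fuzzy_c_means, and the parameters m, tol and max_iter, are hypothetical.

```python
import math
import random


def seed_centers(points, c, p=2.0, rng=None):
    """Pick c initial centers in the style of K-means++ seeding.

    Assumption: p (the "spreading factor") is an exponent on the distance
    from each point to its nearest already-chosen center.
    """
    rng = rng or random.Random()
    centers = [list(rng.choice(points))]  # first center: uniform at random
    while len(centers) < c:
        weights = [min(math.dist(x, v) for v in centers) ** p for x in points]
        if sum(weights) == 0:  # every point coincides with a chosen center
            centers.append(list(rng.choice(points)))
        else:
            centers.append(list(rng.choices(points, weights=weights, k=1)[0]))
    return centers


def fuzzy_c_means(points, centers, m=2.0, tol=1e-6, max_iter=300):
    """Standard Fuzzy C-means updates, started from the given centers."""
    dim = len(points[0])
    memberships = []
    iteration = 0
    for iteration in range(1, max_iter + 1):
        # Membership update: u_ij is proportional to d_ij^(-2/(m-1)).
        memberships = []
        for x in points:
            d = [math.dist(x, v) for v in centers]
            if any(di == 0 for di in d):
                # Point sits on a center: give it full membership there.
                raw = [1.0 if di == 0 else 0.0 for di in d]
            else:
                raw = [di ** (-2.0 / (m - 1.0)) for di in d]
            total = sum(raw)
            memberships.append([r / total for r in raw])
        # Center update: mean of the points weighted by u_ij^m.
        new_centers = []
        for j, old in enumerate(centers):
            w = [u[j] ** m for u in memberships]
            total = sum(w)
            if total == 0:  # degenerate cluster: keep the old center
                new_centers.append(list(old))
                continue
            new_centers.append(
                [sum(wi * x[k] for wi, x in zip(w, points)) / total
                 for k in range(dim)]
            )
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:  # stop once the centers no longer move
            break
    return centers, memberships, iteration


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1),
            (10, 10), (10, 11), (11, 10), (11, 11)]
    seeds = seed_centers(data, c=2, p=2.0, rng=random.Random(0))
    final_centers, u, n_iter = fuzzy_c_means(data, seeds)
    print(final_centers, n_iter)
```

To compare against plain Fuzzy C-means under the same assumptions, replace the seeds with c points drawn uniformly at random and compare the returned iteration counts over several runs.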