Automatic clustering by multi-objective genetic algorithm with numeric and categorical features. (15th December 2019)
- Record Type:
- Journal Article
- Title:
- Automatic clustering by multi-objective genetic algorithm with numeric and categorical features. (15th December 2019)
- Main Title:
- Automatic clustering by multi-objective genetic algorithm with numeric and categorical features
- Authors:
- Dutta, Dipankar
Sil, Jaya
Dutta, Paramartha
- Abstract:
- Highlights: We have developed a clustering algorithm for an unknown number of clusters by MOGA. It works with data sets that have continuous and categorical features. It can work with data sets having missing values. The final solution is selected by majority vote among all non-dominated solutions. Context-sensitive and cluster-oriented genetic operators are designed.
Abstract: Many clustering algorithms, categorized as K-clustering algorithms, require the user to predict the number of clusters (K) before clustering. Without domain knowledge, an accurate value of K is difficult to predict. The problem becomes critical when the dimensionality of the data points is large; when clusters differ widely in shape, size, and density; and when clusters overlap. Determining a suitable K is an optimization problem. Automatic clustering algorithms can discover the optimal K. This paper presents an automatic clustering algorithm which is superior to K-clustering algorithms because it can discover an optimal value of K. Iterative hill-climbing algorithms like K-Means work on a single solution and converge to a local optimum. Here, Genetic Algorithms (GAs) find near-global optimum solutions, i.e. the optimal K as well as the optimal cluster centroids. Single-objective clustering algorithms are adequate for efficiently grouping linearly separable clusters, but they perform less well on non-linearly separable clusters. So for grouping non-linearly separable clusters, we apply a Multi-Objective Genetic Algorithm (MOGA) that minimizes the intra-cluster distance and maximizes the inter-cluster distance. Many existing MOGA-based clustering algorithms are suitable for either numeric or categorical features. This paper pioneered employing MOGA for automatic clustering with mixed types of features. Statistical testing on experimental results on real-life benchmark data sets from the University of California at Irvine (UCI) machine learning repository proves the superiority of the proposed algorithm.
- Is Part Of:
- Expert systems with applications. Volume 137(2019)
- Journal:
- Expert systems with applications
- Issue:
- Volume 137(2019)
- Issue Display:
- Volume 137, Issue 2019 (2019)
- Year:
- 2019
- Volume:
- 137
- Issue:
- 2019
- Issue Sort Value:
- 2019-0137-2019-0000
- Page Start:
- 357
- Page End:
- 379
- Publication Date:
- 2019-12-15
- Subjects:
- Automatic clustering -- Multi-Objective Genetic Algorithm (MOGA) -- Pareto approach -- Statistical test
Expert systems (Computer science) -- Periodicals
Systèmes experts (Informatique) -- Périodiques
Electronic journals
006.33
- Journal URLs:
- http://www.sciencedirect.com/science/journal/09574174
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.eswa.2019.06.056
- Languages:
- English
- ISSNs:
- 0957-4174
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3842.004220
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 11529.xml
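
Illustrative note on the abstract: the two objectives it names (minimize intra-cluster distance, maximize inter-cluster distance) and the notion of non-dominated solutions can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not the authors' implementation. The abstract does not say which distance is used for mixed features, so a Gower-style distance is assumed here; missing values are represented as `None` and simply skipped. All names (`mixed_distance`, `objectives`, `dominates`, `non_dominated`) are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple, Union

Value = Union[float, str, None]  # numeric, categorical, or missing


def mixed_distance(a: Sequence[Value], b: Sequence[Value],
                   ranges: Sequence[Optional[float]]) -> float:
    """Gower-style distance (an assumption, not taken from the paper).

    ranges[i] is the value range of numeric feature i, or None when
    feature i is categorical. Features missing in either point are skipped.
    """
    total, used = 0.0, 0
    for x, y, r in zip(a, b, ranges):
        if x is None or y is None:
            continue  # missing value: ignore this feature
        if r is None:
            total += 0.0 if x == y else 1.0       # categorical mismatch
        else:
            total += abs(x - y) / r if r > 0 else 0.0  # scaled numeric gap
        used += 1
    return total / used if used else 0.0


def objectives(points: Sequence[Sequence[Value]], labels: Sequence[int],
               centers: Sequence[Sequence[Value]],
               ranges: Sequence[Optional[float]]) -> Tuple[float, float]:
    """Return (intra, inter): intra is to be minimized, inter maximized."""
    intra = sum(mixed_distance(p, centers[k], ranges)
                for p, k in zip(points, labels))
    inter = sum(mixed_distance(centers[i], centers[j], ranges)
                for i in range(len(centers))
                for j in range(i + 1, len(centers)))
    return intra, inter


def dominates(u: Tuple[float, float], v: Tuple[float, float]) -> bool:
    """u dominates v if it is no worse on both objectives and better on one."""
    return (u[0] <= v[0] and u[1] >= v[1]) and (u[0] < v[0] or u[1] > v[1])


def non_dominated(scores: List[Tuple[float, float]]) -> List[int]:
    """Indices of candidate solutions that no other candidate dominates."""
    return [i for i, s in enumerate(scores)
            if not any(dominates(t, s)
                       for j, t in enumerate(scores) if j != i)]
```

In a full MOGA, each chromosome would encode a candidate set of cluster centers (with its own K), `objectives` would score it, and `non_dominated` would pick the Pareto front in each generation; the majority vote mentioned in the highlights would then combine the labelings of the final front.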