A fast and effective partitional clustering algorithm for large categorical datasets using a k-means based approach. (May 2018)
- Record Type:
- Journal Article
- Title:
- A fast and effective partitional clustering algorithm for large categorical datasets using a k-means based approach. (May 2018)
- Main Title:
- A fast and effective partitional clustering algorithm for large categorical datasets using a k-means based approach
- Authors:
- Ben Salem, Semeh
Naouali, Sami
Chtourou, Zied
- Abstract:
- Partitional clustering algorithms are of considerable interest in pattern recognition because of their high scalability and efficiency. The k-means algorithm, proposed in 1965, has shown great efficiency for numeric clustering but is inadequate for categorical clustering. In 1998, k-modes was proposed as an extension of k-means to cluster categorical datasets. This paper details a new partition-based categorical method called Manhattan Frequency k-Means (MFk-M). It converts the initial categorical data into numeric values using the relative frequency of each modality in the attributes. The L1 norm (Manhattan distance) is used to compute the distance between the observations and the centroids. Finally, an approximation is defined to evaluate each resulting partition during the execution of the algorithm to avoid trivial clusterings such as cluster death. Experimental analysis performed on real-life datasets highlights the reduced complexity costs and high efficiency of the proposal when compared to the standard k-means and k-modes algorithms.
- Is Part Of:
- Computers & electrical engineering. Volume 68(2018)
- Journal:
- Computers & electrical engineering
- Issue:
- Volume 68(2018)
- Issue Display:
- Volume 68, Issue 2018 (2018)
- Year:
- 2018
- Volume:
- 68
- Issue:
- 2018
- Issue Sort Value:
- 2018-0068-2018-0000
- Page Start:
- 463
- Page End:
- 483
- Publication Date:
- 2018-05
- Subjects:
- Categorical clustering -- Pattern recognition -- k-means -- k-modes -- Unsupervised learning -- Crime Mining
Computer engineering -- Periodicals
Electrical engineering -- Periodicals
Electrical engineering -- Data processing -- Periodicals
Ordinateurs -- Conception et construction -- Périodiques
Électrotechnique -- Périodiques
Électrotechnique -- Informatique -- Périodiques
Computer engineering
Electrical engineering
Electrical engineering -- Data processing
Periodicals
Electronic journals
621.302854
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00457906/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.compeleceng.2018.04.023
- Languages:
- English
- ISSNs:
- 0045-7906
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.680000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 6735.xml
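
The abstract describes two steps that a small example may make concrete: replacing each categorical value with the relative frequency of that modality in its attribute, and measuring distance to the centroids with the L1 (Manhattan) norm. The sketch below illustrates only those two ideas. It is not the authors' MFk-M implementation: the function names (`frequency_encode`, `manhattan`, `assign`), the toy data, and the fixed centroids are assumptions made for illustration, and the centroid update and the partition-evaluation approximation mentioned in the abstract are not reproduced because the record does not specify them.

```python
from collections import Counter


def frequency_encode(rows):
    """Replace each categorical value by its relative frequency in its column.

    Hypothetical helper: the record only says that relative frequencies
    of modalities are used, not how the paper implements the conversion.
    """
    n = len(rows)
    columns = list(zip(*rows))
    # One lookup table per attribute: modality -> relative frequency.
    freqs = [{value: count / n for value, count in Counter(col).items()}
             for col in columns]
    return [[freqs[j][value] for j, value in enumerate(row)] for row in rows]


def manhattan(a, b):
    """L1 (Manhattan) distance between two equal-length numeric vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def assign(points, centroids):
    """Index of the nearest centroid (by L1 distance) for each point."""
    return [min(range(len(centroids)),
                key=lambda k: manhattan(p, centroids[k]))
            for p in points]


# Toy categorical data: (colour, size). Assumed values, not from the paper.
rows = [("red", "s"), ("red", "m"), ("blue", "s"), ("red", "s")]
encoded = frequency_encode(rows)

# Centroids fixed by hand for the illustration.
centroids = [[0.75, 0.75], [0.25, 0.75]]
labels = assign(encoded, centroids)
```

Note that frequency encoding maps modalities with the same relative frequency to the same number, so distinct categories can become indistinguishable after conversion; how the paper handles this is not stated in the record.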