An efficient k‐modes algorithm for clustering categorical datasets. (18th September 2021)
- Record Type:
- Journal Article
- Title:
- An efficient k‐modes algorithm for clustering categorical datasets. (18th September 2021)
- Main Title:
- An efficient k‐modes algorithm for clustering categorical datasets
- Authors:
- Dorman, Karin S.
Maitra, Ranjan
- Abstract:
- Mining clusters from data is an important endeavor in many applications. The k‐means method is a popular, efficient, and distribution‐free approach for clustering numerical‐valued data, but does not apply to categorical‐valued observations. The k‐modes method addresses this lacuna by replacing the Euclidean with the Hamming distance and the means with the modes in the k‐means objective function. We provide a novel, computationally efficient implementation of k‐modes, called Optimal Transfer Quick Transfer (OTQT). We prove that OTQT finds updates to improve the objective function that are undetectable to existing k‐modes algorithms. Although slightly slower per iteration due to algorithmic complexity, OTQT is always more accurate and almost always faster (and only barely slower on some datasets) to the final optimum. Thus, we recommend OTQT as the preferred, default algorithm for k‐modes optimization.
- Is Part Of:
- Statistical analysis and data mining. Volume 15:Number 1(2022)
- Journal:
- Statistical analysis and data mining
- Issue:
- Volume 15:Number 1(2022)
- Issue Display:
- Volume 15, Issue 1 (2022)
- Year:
- 2022
- Volume:
- 15
- Issue:
- 1
- Issue Sort Value:
- 2022-0015-0001-0000
- Page Start:
- 83
- Page End:
- 97
- Publication Date:
- 2021-09-18
- Subjects:
- categorical data clustering -- k‐modes -- OT algorithm -- OTQT algorithm
Data mining -- Statistical methods -- Periodicals
006.312
- Journal URLs:
- http://www3.interscience.wiley.com/journal/112701062/home
http://onlinelibrary.wiley.com/
- DOI:
- 10.1002/sam.11546
- Languages:
- English
- ISSNs:
- 1932-1864
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 8447.424100
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 20328.xml
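
Illustrative note on the abstract: the k‐modes objective it describes (Hamming distance in place of Euclidean distance, coordinate-wise modes in place of means) can be sketched as follows. This is a minimal sketch of the basic alternating assign-and-update scheme only; it is not the OTQT algorithm of the article, and every function name here (`hamming`, `column_modes`, `objective`, `kmodes_step`) is a hypothetical name chosen for illustration.

```python
from collections import Counter


def hamming(a, b):
    """Number of coordinates at which two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))


def column_modes(rows):
    """Coordinate-wise mode of a non-empty list of equal-length records.

    Ties are resolved in favour of the value seen first (an assumption
    made here for determinism, not something the abstract specifies).
    """
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*rows))


def objective(data, labels, modes):
    """k-modes cost: total Hamming distance from each record to its cluster mode."""
    return sum(hamming(x, modes[c]) for x, c in zip(data, labels))


def kmodes_step(data, modes):
    """One pass: assign each record to its nearest mode, then recompute the modes."""
    labels = [
        min(range(len(modes)), key=lambda c: hamming(x, modes[c]))
        for x in data
    ]
    new_modes = []
    for c, old_mode in enumerate(modes):
        members = [x for x, label in zip(data, labels) if label == c]
        # Keep the previous mode if a cluster ends up empty.
        new_modes.append(column_modes(members) if members else old_mode)
    return labels, new_modes
```

Calling `kmodes_step` repeatedly until the labels stop changing gives a simple baseline; the value of `objective` does not increase from one pass to the next, which makes it a convenient convergence check.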