A Bayesian criterion for cluster stability. (6th February 2013)
- Record Type:
- Journal Article
- Title:
- A Bayesian criterion for cluster stability. (6th February 2013)
- Main Title:
- A Bayesian criterion for cluster stability
- Authors:
- Koepke, Hoyt
Clarke, Bertrand
- Abstract:
- We present a technique for evaluating and comparing how clusterings reveal structure inherent in the data set. Our technique is based on a criterion evaluating how much point‐to‐cluster distances may be perturbed without affecting the membership of the points. Although similar to some existing perturbation methods, our approach distinguishes itself in five ways. First, the strength of the perturbations is indexed by a prior distribution controlling how close to boundary regions a point may be before it is considered unstable. Second, our approach is exact in that we integrate over all the perturbations; in practice, this can be done efficiently for well‐chosen prior distributions. Third, we provide a rigorous theoretical treatment of the approach, showing that it is consistent for estimating the correct number of clusters. Fourth, it yields a detailed picture of the behavior and structure of the clustering. Finally, it is computationally tractable and easy to use, requiring only a point‐to‐cluster distance matrix as input. In a simulation study, we show that it outperforms several existing methods in terms of recovering the correct number of clusters. We also illustrate the technique in three real data sets. © 2013 Wiley Periodicals, Inc. Statistical Analysis and Data Mining, 2013
- Is Part Of:
- Statistical analysis and data mining. Volume 6:Number 4(2013)
- Journal:
- Statistical analysis and data mining
- Issue:
- Volume 6:Number 4(2013)
- Issue Display:
- Volume 6, Issue 4 (2013)
- Year:
- 2013
- Volume:
- 6
- Issue:
- 4
- Issue Sort Value:
- 2013-0006-0004-0000
- Page Start:
- 346
- Page End:
- 374
- Publication Date:
- 2013-02-06
- Subjects:
- Data mining -- Statistical methods -- Periodicals
006.312
- Journal URLs:
- http://www3.interscience.wiley.com/journal/112701062/home
http://onlinelibrary.wiley.com/
- DOI:
- 10.1002/sam.11176
- Languages:
- English
- ISSNs:
- 1932-1864
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 8447.424100
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 3066.xml