Genetic Algorithm Classifier System for Semi‐Supervised Learning. (17th September 2013)
- Record Type:
- Journal Article
- Title:
- Genetic Algorithm Classifier System for Semi‐Supervised Learning. (17th September 2013)
- Main Title:
- Genetic Algorithm Classifier System for Semi‐Supervised Learning
- Authors:
- Dee Miller, L.
Soh, Leen‐Kiat
Scott, Stephen
- Abstract:
- Real‐world datasets often contain large numbers of unlabeled data points, because there is additional cost for obtaining the labels. Semi‐supervised learning (SSL) algorithms use both labeled and unlabeled data points for training, which can result in higher classification accuracy on these datasets. Generally, traditional SSLs tentatively label the unlabeled data points on the basis of the smoothness assumption that neighboring points should have the same label. When this assumption is violated, unlabeled points are mislabeled, injecting noise into the final classifier. An alternative SSL approach is cluster‐then‐label (CTL), which partitions all the data points (labeled and unlabeled) into clusters and creates a classifier by using those clusters. CTL is based on the less restrictive cluster assumption that data points in the same cluster should have the same label. As shown, this allows CTLs to achieve higher classification accuracy on many datasets where the cluster assumption holds for the CTLs, but smoothness does not hold for the traditional SSLs. However, cluster configuration problems (e.g., irrelevant features, insufficient clusters, and incorrectly shaped clusters) could violate the cluster assumption. We propose a new framework for CTLs by using a genetic algorithm (GA) to evolve classifiers without the cluster configuration problems (e.g., the GA removes irrelevant attributes, updates number of clusters, and changes the shape of the clusters). We demonstrate that a CTL based on this framework achieves comparable or higher accuracy with both traditional SSLs and CTLs on 12 University of California, Irvine machine learning datasets.
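The cluster‐then‐label idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' GA‐based framework: the abstract does not say which clustering method or labeling rule is used, so the k‐means clustering, the majority vote per cluster, and all names below (`kmeans`, `cluster_then_label`) are assumptions chosen for illustration only.

```python
from collections import Counter


def kmeans(points, k, iters=20):
    """Tiny k-means; seeds centroids with the first k points (an assumption,
    chosen so the example is deterministic)."""
    centroids = [list(p) for p in points[:k]]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest centroid (squared Euclidean distance).
        assign = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Move each centroid to the mean of its members; leave it if empty.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def cluster_then_label(points, labels, k):
    """labels[i] is a class label, or None for an unlabeled point.
    Every point in a cluster receives that cluster's majority label
    (the cluster assumption); clusters with no labeled point yield None."""
    assign = kmeans(points, k)
    cluster_label = {}
    for c in range(k):
        votes = Counter(l for l, a in zip(labels, assign)
                        if a == c and l is not None)
        if votes:
            cluster_label[c] = votes.most_common(1)[0][0]
    return [cluster_label.get(a) for a in assign]


# Two labeled points and four unlabeled ones (hypothetical toy data).
points = [(0.0, 0.0), (5.0, 5.0), (0.1, 0.2), (0.2, 0.1), (5.1, 4.9), (4.9, 5.2)]
labels = ["a", "b", None, None, None, None]
predicted = cluster_then_label(points, labels, k=2)
print(predicted)
```

In this sketch the number of clusters `k`, the features used, and the spherical cluster shape implied by k‐means are all fixed in advance; these are the kinds of cluster configuration choices the abstract says the proposed GA evolves instead.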
- Is Part Of:
- Computational intelligence. Volume 31:Number 2(2015:May)
- Journal:
- Computational intelligence
- Issue:
- Volume 31:Number 2(2015:May)
- Issue Display:
- Volume 31, Issue 2 (2015)
- Year:
- 2015
- Volume:
- 31
- Issue:
- 2
- Issue Sort Value:
- 2015-0031-0002-0000
- Page Start:
- 201
- Page End:
- 232
- Publication Date:
- 2013-09-17
- Subjects:
- Artificial intelligence -- Periodicals
Computational linguistics -- Periodicals
006.3
- Journal URLs:
- http://www.blackwellpublishing.com/journal.asp?ref=0824-7935&site=1
http://onlinelibrary.wiley.com/
- DOI:
- 10.1111/coin.12018
- Languages:
- English
- ISSNs:
- 0824-7935
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3390.595000
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 3075.xml