Conditional information gain networks as sparse mixture of experts. (December 2021)

Record Type:: Journal Article
Title:: Conditional information gain networks as sparse mixture of experts. (December 2021)
Main Title:: Conditional information gain networks as sparse mixture of experts
Authors:: Bicici, Ufuk Can
Akarun, Lale
Abstract:: Deep neural network models owe their representational power and high performance in classification tasks to the high number of learnable parameters. Running deep neural network models in limited-resource environments is a problematic task. Models employing conditional computing aim to reduce the computational burden while retaining model performance on par with more complex neural network models. This paper proposes a new model, Conditional Information Gain Networks as Sparse Mixture of Experts (sMoE-CIGNs). A CIGN model is a neural tree that allows conditionally skipping parts of the tree based on routing mechanisms inserted into the architecture. These routing mechanisms are based on differentiable Information Gain objectives. CIGN groups semantically similar samples in the leaves, enabling simpler classifiers to focus on differentiating between similar classes. This lets the CIGN model attain high classification performance with lighter models. We further improve the basic CIGN model by proposing a sparse mixture of experts model for difficult-to-classify samples that may get routed to suboptimal branches. If a sample has routing confidence higher than a specific threshold, the sample may be routed towards multiple child nodes. The classification decision can then be taken as a mixture of these expert decisions. We learn the optimal routing thresholds by Bayesian Optimization over a validation set by minimizing a weighted loss, including the classification …
Is Part Of:: Pattern recognition. Volume 120(2021)
Journal:: Pattern recognition
Issue:: Volume 120(2021)
Issue Display:: Volume 120, Issue 2021 (2021)
Year:: 2021
Volume:: 120
Issue:: 2021
Issue Sort Value:: 2021-0120-2021-0000
Page Start:
Page End:
Publication Date:: 2021-12
Subjects:: Machine learning -- Deep learning -- Conditional deep learning
Pattern perception -- Periodicals
Perception des structures -- Périodiques
Patroonherkenning
006.4
Journal URLs:: http://www.sciencedirect.com/science/journal/00313203
http://www.sciencedirect.com/
DOI:: 10.1016/j.patcog.2021.108151
Languages:: English
ISSNs:: 0031-3203
Deposit Type:: Legaldeposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
Ingest File:: 18489.xml