Record Type:
Journal Article
Title:
The committee machine: computational to statistical gaps in learning a two-layers neural network

Main Title:
The committee machine: computational to statistical gaps in learning a two-layers neural network

Note:
This is the original and extended version of: Aubin B, Maillard A, Barbier J, Krzakala F, Macris N and Zdeborová L 2018 The committee machine: Computational to statistical gaps in learning a two-layers neural network, Advances in Neural Information Processing Systems 31, ed S Bengio et al (Red Hook, NY: Curran Associates, Inc) pp 3223–34. (20th December 2019)
Abstract:
Heuristic tools from statistical physics have been used in the past to locate the phase transitions and compute the optimal learning and generalization errors in the teacher-student scenario in multi-layer neural networks. In this paper, we provide a rigorous justification of these approaches for a two-layers neural network model called the committee machine, under a technical assumption. We also introduce a version of the approximate message passing (AMP) algorithm for the committee machine that allows optimal learning in polynomial time for a large set of parameters. We find that there are regimes in which a low generalization error is information-theoretically achievable while the AMP algorithm fails to deliver it, strongly suggesting that no efficient algorithm exists for those cases and unveiling a large computational gap.
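For readers unfamiliar with the teacher-student scenario the abstract refers to, the snippet below is a minimal sketch of data generation by a committee-machine teacher: K perceptron-like hidden units whose sign outputs are combined by a majority vote. The dimensions, the Gaussian weight prior, and the choice of an odd K are illustrative assumptions, not the exact setup of the paper.

import numpy as np

# Hedged sketch: data from a committee-machine teacher,
#   y = sign( sum_{k=1}^K sign(w_k . x / sqrt(n)) ),
# i.e. K perceptrons whose +/-1 votes are combined by majority.
# Sizes and Gaussian priors below are illustrative choices.

rng = np.random.default_rng(0)
n, K, m = 1000, 3, 5000                 # input dim, hidden units, samples

W = rng.standard_normal((K, n))         # teacher weights, i.i.d. Gaussian
X = rng.standard_normal((m, n))         # i.i.d. Gaussian inputs
hidden = np.sign(X @ W.T / np.sqrt(n))  # +/-1 activation of each hidden unit
y = np.sign(hidden.sum(axis=1))         # majority vote; odd K avoids ties

# A "student" network of the same architecture then learns W from the
# pairs (X, y); the AMP algorithm studied in the paper is one such learner.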