Size‐Extensive Molecular Machine Learning with Global Representations. Issue 4 (4th February 2020)
- Record Type:
- Journal Article
- Title:
- Size‐Extensive Molecular Machine Learning with Global Representations. Issue 4 (4th February 2020)
- Main Title:
- Size‐Extensive Molecular Machine Learning with Global Representations
- Authors:
- Jung, Hyunwook
Stocker, Sina
Kunkel, Christian
Oberhofer, Harald
Han, Byungchan
Reuter, Karsten
Margraf, Johannes T.
- Abstract:
- Machine learning (ML) models are increasingly used in combination with electronic structure calculations to predict molecular properties at a much lower computational cost in high‐throughput settings. Such ML models require representations that encode the molecular structure, which are generally designed to respect the symmetries and invariances of the target property. However, size‐extensivity is usually not guaranteed for so‐called global representations. In this contribution, we show how extensivity can be built into global ML models using, e.g., the Many‐Body Tensor Representation. Properties of extensive and non‐extensive models for the atomization energy are systematically explored by training on small molecules and testing on small, medium and large molecules. Our results show that non‐extensive models are only useful in the size‐range of their training set, whereas extensive models provide reasonable predictions across large size differences. Remaining sources of error for extensive models are discussed.
Abstract: A sizeable difference: The machine‐learning (ML) based exploration of chemical space requires models that can appropriately handle molecules of different sizes. To achieve this, the size‐extensivity of ML models should be enforced. In this paper conditions for extensive ML models are discussed, and extensive models are shown to effectively extrapolate to large molecules, when trained on small ones.
- Is Part Of:
- ChemSystemsChem. Volume 2:Issue 4(2020)
- Journal:
- ChemSystemsChem
- Issue:
- Volume 2:Issue 4(2020)
- Issue Display:
- Volume 2, Issue 4 (2020)
- Year:
- 2020
- Volume:
- 2
- Issue:
- 4
- Issue Sort Value:
- 2020-0002-0004-0000
- Page Start:
- n/a
- Page End:
- n/a
- Publication Date:
- 2020-02-04
- Subjects:
- Machine learning -- Kernel ridge regression -- Many-body tensor representation -- Size-extensivity -- Atomization energy
Synthetic biology -- Periodicals
Artificial cells -- Periodicals
Chemical systems -- Periodicals
Biochemistry -- Periodicals
Biotechnology -- Periodicals
572
- Journal URLs:
- http://onlinelibrary.wiley.com/
- DOI:
- 10.1002/syst.201900052
- Languages:
- English
- ISSNs:
- 2570-4206
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3172.319800
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 13353.xml