Using Active Learning to Develop Machine Learning Models for Reaction Yield Prediction. Issue 12 (14th July 2022)
- Record Type:
- Journal Article
- Title:
- Using Active Learning to Develop Machine Learning Models for Reaction Yield Prediction. Issue 12 (14th July 2022)
- Main Title:
- Using Active Learning to Develop Machine Learning Models for Reaction Yield Prediction
- Authors:
- Viet Johansson, Simon
Gummesson Svensson, Hampus
Bjerrum, Esben
Schliep, Alexander
Haghir Chehreghani, Morteza
Tyrchan, Christian
Engkvist, Ola
- Abstract:
- Computer aided synthesis planning, suggesting synthetic routes for molecules of interest, is a rapidly growing field. The machine learning methods used are often dependent on access to large datasets for training, but finite experimental budgets limit how much data can be obtained from experiments. This suggests the use of schemes for data collection such as active learning, which identifies the data points of highest impact for model accuracy, and which has been used in recent studies with success. However, little has been done to explore the robustness of the methods predicting reaction yield when used together with active learning to reduce the amount of experimental data needed for training. This study aims to investigate the influence of machine learning algorithms and the number of initial data points on reaction yield prediction for two public high‐throughput experimentation datasets. Our results show that active learning based on output margin reached a pre‐defined AUROC faster than random sampling on both datasets. Analysis of feature importance of the trained machine learning models suggests active learning had a larger influence on the model accuracy when only a few features were important for the model prediction.
- Is Part Of:
- Molecular informatics. Volume 41:Issue 12(2022)
- Journal:
- Molecular informatics
- Issue:
- Volume 41:Issue 12(2022)
- Issue Display:
- Volume 41, Issue 12 (2022)
- Year:
- 2022
- Volume:
- 41
- Issue:
- 12
- Issue Sort Value:
- 2022-0041-0012-0000
- Page Start:
- n/a
- Page End:
- n/a
- Publication Date:
- 2022-07-14
- Subjects:
- Active Learning -- Reaction Yield Prediction -- Bayesian Matrix Factorization -- Random Forest -- Neural Networks
Cheminformatics -- Periodicals
QSAR (Biochemistry) -- Periodicals
Structure-activity relationships (Biochemistry) -- Periodicals
Drugs -- Structure-activity relationships -- Periodicals
615.19
- Journal URLs:
- http://onlinelibrary.wiley.com/journal/10.1002/(ISSN)1868-1751
http://www3.interscience.wiley.com/journal/123236613/home
http://onlinelibrary.wiley.com/
- DOI:
- 10.1002/minf.202200043
- Languages:
- English
- ISSNs:
- 1868-1743
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 5900.817750
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 24680.xml
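
Note on the abstract: the abstract compares active learning "based on output margin" with random sampling. The record does not say how the margin is computed, so the following is only a minimal sketch of one common reading, assuming a binary classifier (for example, high yield vs. low yield) that outputs a probability per candidate reaction, with the margin taken as the distance of that probability from 0.5. The function names `margin_query` and `random_query`, the batch size, and the example probabilities are all hypothetical and are not taken from the article.

```python
import random


def margin_query(probabilities, batch_size):
    """Pick the pool indices whose predicted probability is closest to 0.5.

    Assumption: a small |p - 0.5| means the model is least certain, so
    labelling those points is expected to help the model most.
    """
    margins = [abs(p - 0.5) for p in probabilities]
    ranked = sorted(range(len(probabilities)), key=lambda i: margins[i])
    return ranked[:batch_size]


def random_query(pool_size, batch_size, seed=0):
    """Baseline: pick pool indices uniformly at random, without replacement."""
    rng = random.Random(seed)
    return rng.sample(range(pool_size), batch_size)


# Hypothetical predicted probabilities for five unlabelled reactions.
pool_probabilities = [0.95, 0.52, 0.10, 0.45, 0.80]

selected = margin_query(pool_probabilities, batch_size=2)
baseline = random_query(len(pool_probabilities), batch_size=2)
```

In an active learning loop, the selected reactions would be run experimentally, added to the training set, and the model retrained before the next query; the random baseline follows the same loop with `random_query` in place of `margin_query`.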