Predicting the reproductive toxicity of chemicals using ensemble learning methods and molecular fingerprints. (1st April 2021)
- Record Type:
- Journal Article
- Title:
- Predicting the reproductive toxicity of chemicals using ensemble learning methods and molecular fingerprints. (1st April 2021)
- Main Title:
- Predicting the reproductive toxicity of chemicals using ensemble learning methods and molecular fingerprints
- Authors:
- Feng, Huawei
Zhang, Li
Li, Shimeng
Liu, Lili
Yang, Tianzhou
Yang, Pengyu
Zhao, Jian
Arkin, Isaiah Tuvia
Liu, Hongsheng
- Abstract:
- Highlights: Ensemble learning models were built for predicting chemical reproductive toxicity. The best model achieved an average accuracy of 86.33 % with 5-fold cross-validation. In 5-fold cross-validation, the area under the curve (AUC) of the best model was 0.937. Some substructures related to reproductive toxicity were identified.
Abstract: Reproductive toxicity endpoints are a significant safety concern in the assessment of the adverse effects of chemicals in drug discovery. Computational models that can accurately predict a chemical's toxic potential are increasingly pursued to replace traditional animal experiments. Thus, ensemble learning models were built to predict the reproductive toxicity of compounds. Our ensemble models were developed using support vector machine, random forest, and extreme gradient boosting methods and 9 molecular fingerprints calculated for a dataset containing 1823 chemicals. The best prediction performance was achieved by the Ensemble-Top12 model, with an accuracy (ACC) of 86.33 %, a sensitivity (SEN) of 82.02 %, a specificity (SPE) of 90.19 %, and an area under the receiver operating characteristic curve (AUC) of 0.937 in 5-fold cross-validation and ACC, SEN, SPE, and AUC values of 84.38 %, 86.90 %, 90.67 %, and 0.920, respectively, in external validation. We also defined the applicability domain (AD) of the ensemble model by calculating the Tanimoto distance of the training set. Compared with models in existing literature, our ensemble model achieves relatively high ACC, SPE and AUC values. We also identified several fingerprint features related to chemical reproductive toxicity. Considering the performance of the model, we recommend using the Ensemble-Top12 model to predict reproductive toxicity in early drug development.
- Is Part Of:
- Toxicology letters. Volume 340(2021)
- Journal:
- Toxicology letters
- Issue:
- Volume 340(2021)
- Issue Display:
- Volume 340, Issue 2021 (2021)
- Year:
- 2021
- Volume:
- 340
- Issue:
- 2021
- Issue Sort Value:
- 2021-0340-2021-0000
- Page Start:
- 4
- Page End:
- 14
- Publication Date:
- 2021-04-01
- Subjects:
- Reproductive toxicity -- Molecular fingerprint -- Machine learning -- Ensemble -- Prediction models
Toxicology -- Periodicals
363.179
- Journal URLs:
- http://www.sciencedirect.com/science/journal/03784274
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.toxlet.2021.01.002
- Languages:
- English
- ISSNs:
- 0378-4274
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 8873.042000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 22444.xml
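
The abstract above reports accuracy (ACC), sensitivity (SEN), and specificity (SPE), and mentions an applicability domain based on Tanimoto distance to the training set. The following is a minimal Python sketch of how such quantities are commonly computed; it is not the authors' implementation. The function names, the representation of a fingerprint as a set of on-bit indices, the nearest-neighbour rule, and the 0.7 threshold are all assumptions made for illustration and do not come from the article.

```python
from typing import Dict, Iterable, Sequence, Set


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Return ACC, SEN, and SPE for binary labels (1 = toxic, 0 = non-toxic)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    nan = float("nan")
    return {
        "ACC": (tp + tn) / len(pairs) if pairs else nan,
        "SEN": tp / (tp + fn) if (tp + fn) else nan,  # recall on toxic compounds
        "SPE": tn / (tn + fp) if (tn + fp) else nan,  # recall on non-toxic compounds
    }


def tanimoto_distance(a: Set[int], b: Set[int]) -> float:
    """Tanimoto distance, 1 - |A & B| / |A | B|, for fingerprints given as on-bit sets."""
    union = len(a | b)
    if union == 0:
        return 0.0  # two empty fingerprints are treated as identical
    return 1.0 - len(a & b) / union


def in_applicability_domain(
    query: Set[int],
    training: Iterable[Set[int]],
    threshold: float = 0.7,  # hypothetical cut-off, not taken from the article
) -> bool:
    """Assumed rule: inside the domain if the nearest training compound is close enough."""
    return min(tanimoto_distance(query, fp) for fp in training) <= threshold


if __name__ == "__main__":
    labels = [1, 1, 1, 0, 0, 0, 0, 1]
    predictions = [1, 1, 0, 0, 0, 1, 0, 1]
    print(classification_metrics(labels, predictions))
    print(in_applicability_domain({1, 2, 3}, [{2, 3, 4}, {7, 8}]))
```

Storing fingerprints as sets keeps the sketch free of dependencies; a cheminformatics toolkit would normally supply both the fingerprints and the similarity calculation.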