In Silico Prediction of Blood–Brain Barrier Permeability of Compounds by Machine Learning and Resampling Methods. (21st September 2018)
- Record Type:
- Journal Article
- Title:
- In Silico Prediction of Blood–Brain Barrier Permeability of Compounds by Machine Learning and Resampling Methods. (21st September 2018)
- Main Title:
- In Silico Prediction of Blood–Brain Barrier Permeability of Compounds by Machine Learning and Resampling Methods
- Authors:
- Wang, Zhuang
Yang, Hongbin
Wu, Zengrui
Wang, Tianduanyi
Li, Weihua
Tang, Yun
Liu, Guixia
- Abstract:
- The blood–brain barrier (BBB) as a part of absorption protects the central nervous system by separating the brain tissue from the bloodstream. In recent years, BBB permeability has become a critical issue in chemical ADMET prediction, but almost all models were built using imbalanced data sets, which caused a high false‐positive rate. Therefore, we tried to solve the problem of biased data sets and built a reliable classification model with 2358 compounds. Machine learning and resampling methods were used simultaneously for the refinement of models with both 2D molecular descriptors and molecular fingerprints to represent the chemicals. Through a series of evaluation, we realized that resampling methods such as Synthetic Minority Oversampling Technique (SMOTE) and SMOTE+edited nearest neighbor could effectively solve the problem of imbalanced data sets and that MACCS fingerprint combined with support vector machine performed the best. After the final construction of a consensus model, the overall accuracy rate was increased to 0.966 for the final external data set. Also, the accuracy rate of the model for the test set was 0.919, with an excellent balanced capacity of 0.925 (sensitivity) to predict BBB‐positive compounds and of 0.899 (specificity) to predict BBB‐negative compounds. Compared with other BBB classification models, our models reduced the rate of false positives and were more robust in prediction of BBB‐positive as well as BBB‐negative compounds, which would be quite helpful in early drug discovery.
Beauty with brain: Data curation, feature selection, machine learning algorithms, resampling methods, consensus modeling techniques, and applicability domain analysis are used in a combinatorial manner to build classification models to estimate the blood–brain barrier permeability of chemical compounds. In addition, small molecules retrieved from DrugBank are assessed to help evaluate the best consensus model with high sensitivity and specificity values.
- Is Part Of:
- ChemMedChem. Volume 13:Number 20(2018)
- Journal:
- ChemMedChem
- Issue:
- Volume 13:Number 20(2018)
- Issue Display:
- Volume 13, Issue 20 (2018)
- Year:
- 2018
- Volume:
- 13
- Issue:
- 20
- Issue Sort Value:
- 2018-0013-0020-0000
- Page Start:
- 2189
- Page End:
- 2201
- Publication Date:
- 2018-09-21
- Subjects:
- blood–brain barrier -- imbalanced data -- machine learning -- QSAR models -- resampling methods
Pharmaceutical chemistry -- Periodicals
615.19005
- Journal URLs:
- http://onlinelibrary.wiley.com/journal/10.1002/(ISSN)1860-7187
http://www3.interscience.wiley.com/cgi-bin/jhome/110485305
http://onlinelibrary.wiley.com/
- DOI:
- 10.1002/cmdc.201800533
- Languages:
- English
- ISSNs:
- 1860-7179
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3172.254000
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 11228.xml