CYPlebrity: Machine learning models for the prediction of inhibitors of cytochrome P450 enzymes. (15th September 2021)
- Record Type:
- Journal Article
- Title:
- CYPlebrity: Machine learning models for the prediction of inhibitors of cytochrome P450 enzymes. (15th September 2021)
- Main Title:
- CYPlebrity: Machine learning models for the prediction of inhibitors of cytochrome P450 enzymes
- Authors:
- Plonka, Wojciech
Stork, Conrad
Šícho, Martin
Kirchmair, Johannes
- Abstract:
- The vast majority of approved drugs are metabolized by the five major cytochrome P450 (CYP) isozymes, 1A2, 2C9, 2C19, 2D6 and 3A4. Inhibition of CYP isozymes can cause drug-drug interactions with severe pharmacological and toxicological consequences. Computational methods for the fast and reliable prediction of the inhibition of CYP isozymes by small molecules are therefore of high interest and relevance to pharmaceutical companies and a host of other industries, including the cosmetics and agrochemical industries. Today, a large number of machine learning models for predicting the inhibition of the major CYP isozymes by small molecules are available. With this work we aim to go beyond the coverage of existing models, by combining data from several major public and proprietary sources. More specifically, we used up to 18815 compounds with measured bioactivities to train random forest classification models for the individual CYP isozymes. A major advantage of the new data collection over existing ones is the better representation of the minority class, the CYP inhibitors. With the new data collection we achieved inhibitor-to-non-inhibitor ratios in the order of 1:1 (CYP1A2) to 1:3 (CYP2D6). We show that our models reach competitive performance on external data, with Matthews correlation coefficients (MCCs) ranging from 0.62 (CYP2C19) to 0.70 (CYP2D6), and areas under the receiver operating characteristic curve (AUCs) between 0.89 (CYP2C19) and 0.92 (CYPs 2D6 and 3A4). Importantly, the models show a high level of robustness, reflected in a good predictivity also for compounds that are structurally dissimilar to the compounds represented in the training data. The best models presented in this work are freely accessible for academic research via a web service.
- Is Part Of:
- Bioorganic & medicinal chemistry. Volume 46(2021)
- Journal:
- Bioorganic & medicinal chemistry
- Issue:
- Volume 46(2021)
- Issue Display:
- Volume 46, Issue 2021 (2021)
- Year:
- 2021
- Volume:
- 46
- Issue:
- 2021
- Issue Sort Value:
- 2021-0046-2021-0000
- Page Start:
- Page End:
- Publication Date:
- 2021-09-15
- Subjects:
- Drug metabolism -- Cytochrome P450 inhibition -- Metabolism prediction -- Machine learning
AID (PubChem) assay identifier -- AUC area under curve -- BA balanced accuracy -- CSV comma separated value (computer file) -- CV cross validation -- CYP cytochrome P450 -- Da Dalton (molecular weight unit) -- ECFP extended connectivity fingerprint -- FCFP functional-class fingerprint -- kNN k-nearest neighbor -- MACCS molecular access system -- MCC Matthews correlation coefficient -- NN neural network -- PPV positive predictive value -- RF random forest -- ROC receiver operating characteristic -- QSAR quantitative structure activity relationship -- SD structure data (computer file) -- SVM support vector machine -- TC Tanimoto coefficient -- TPR true positive rate -- TNR true negative rate
Bioorganic chemistry -- Periodicals
Pharmaceutical chemistry -- Periodicals
Biochemistry -- Periodicals
Chemistry, Clinical -- Periodicals
Chemistry, Organic -- Periodicals
Chimie bio-organique -- Périodiques
Chimie pharmaceutique -- Périodiques
615.19
- Journal URLs:
- http://www.sciencedirect.com/science/journal/09680896
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.bmc.2021.116388
- Languages:
- English
- ISSNs:
- 0968-0896
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 2089.325000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 23857.xml
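
The abstract reports model quality as Matthews correlation coefficients (MCCs) on class-imbalanced data. As a reading aid, the following is a minimal sketch of how an MCC is computed from confusion-matrix counts. The function name and the example counts are hypothetical illustrations chosen to match a 1:3 inhibitor-to-non-inhibitor ratio; they are not taken from the paper, and this is not the authors' code.

```python
import math


def matthews_corrcoef(tp: int, fp: int, tn: int, fn: int) -> float:
    """Return the MCC for binary confusion-matrix counts.

    By a common convention (an assumption here), 0.0 is returned
    when the denominator is zero and the MCC is undefined.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts: 100 inhibitors and 300 non-inhibitors (ratio 1:3).
mcc = matthews_corrcoef(tp=80, fp=30, tn=270, fn=20)
print(f"MCC = {mcc:.2f}")
```

Unlike plain accuracy, the MCC uses all four cells of the confusion matrix, which is why it is often preferred when the inhibitor class is the minority.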