Reliable CA-(Q)SAR generation based on entropy weight optimized by grid search and correction factors. (July 2022)
- Record Type:
- Journal Article
- Title:
- Reliable CA-(Q)SAR generation based on entropy weight optimized by grid search and correction factors. (July 2022)
- Main Title:
- Reliable CA-(Q)SAR generation based on entropy weight optimized by grid search and correction factors
- Authors:
- Yang, Jin-Rong
Chen, Qiang
Wang, Hao
Hu, Xu-Yang
Guo, Ya-Min
Chen, Jian-Zhong
- Abstract:
- Chromosome aberration (CA) is a serious form of genotoxicity that can lead to carcinogenicity and developmental side effects. In the present manuscript, we developed a QSAR model for CA prediction using artificial intelligence methodologies. The reliable QSAR model was constructed on an enlarged data set of 3208 compounds by optimizing machine learning and deep learning algorithms through hyperparametric iterations, using multiple molecular fingerprint descriptors in combination with drug-like molecular properties (MP) screened by the entropy weight methodology on the open-source Python platform. Furthermore, molecular similarity for returning search and the molecular connection index as an additional descriptor were introduced to differentiate compounds with high similarity for correct CA prediction during QSAR model generation. The final generated CA-(Q)SAR model exhibited good prediction accuracy of 80.6%. The bias of the final model is about 0.9793. On the basis of the generated QSAR model, data analyses were further performed to analyze the typical structure features in numerical intervals (MPI) of the molecular properties MW, XlogP, and TPSA, respectively, for potential CA or non-CA toxicity with a normalized occurrence probability (NOP) of more than 70%, which may provide useful clues for drug design of leads or candidates devoid of CA genotoxicity.
Highlights: The CA-QSAR model was constructed based on an enlarged data set using multiple descriptors with algorithm optimization. Tc for returning search and MCI as an additional descriptor were introduced to improve the reliability of the QSAR model. Data analyses identified structure features in molecular property intervals for CA or non-CA activity. The QSAR model may provide useful clues for drug design of leads or candidates devoid of CA genotoxicity.
- Is Part Of:
- Computers in biology and medicine. Volume 146(2022)
- Journal:
- Computers in biology and medicine
- Issue:
- Volume 146(2022)
- Issue Display:
- Volume 146, Issue 2022 (2022)
- Year:
- 2022
- Volume:
- 146
- Issue:
- 2022
- Issue Sort Value:
- 2022-0146-2022-0000
- Page Start:
- Page End:
- Publication Date:
- 2022-07
- Subjects:
- QSAR -- Chromosome aberration -- Machine learning -- Connection index -- Similarity return search
CA Chromosome Aberration -- MP Molecular Property -- NOP Normalized Occurrence Probability -- QSAR Quantitative Structure-Activity Relationship -- LSE Lead Scope Enterprise -- CU Case Ultra -- FP Fingerprint -- MW Molecular Weight -- TPSA Topological Polar Surface Area -- MCI Molecular Connectivity Index -- HBD H-bond Donor -- HBA H-bond Acceptor -- RBC Rotatable Bond -- ACC Prediction Accuracy -- PAH Polycyclic Aromatic Hydrocarbons
Medicine -- Data processing -- Periodicals
Biology -- Data processing -- Periodicals
610.285
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00104825/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.compbiomed.2022.105573
- Languages:
- English
- ISSNs:
- 0010-4825
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.880000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 21661.xml
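
The abstract in this record says that molecular properties were screened by an entropy weight methodology but does not give the formula. As a rough illustration only, the sketch below implements the textbook entropy weight method (min-max scaling, column-wise Shannon entropy, weights proportional to 1 minus entropy). It is an assumption that the article uses this variant; the function name `entropy_weights` and the toy MW, XlogP, and TPSA values are hypothetical and are not taken from the article.

```python
import math


def entropy_weights(rows):
    """Textbook entropy weight method (assumed variant, not the article's code).

    rows: list of samples, each a list of numeric property values.
    Returns one weight per property; the weights sum to 1.
    """
    n = len(rows)
    if n < 2:
        raise ValueError("need at least two samples")
    m = len(rows[0])
    k = 1.0 / math.log(n)  # scales each column entropy into [0, 1]

    divergences = []
    for j in range(m):
        col = [row[j] for row in rows]
        lo, hi = min(col), max(col)
        span = hi - lo
        # Min-max scale the column; a constant column is treated as uniform,
        # which gives maximal entropy and therefore zero weight.
        scaled = [(v - lo) / span for v in col] if span > 0 else [1.0] * n
        total = sum(scaled)
        probs = [s / total for s in scaled]
        # Shannon entropy, using the convention 0 * ln(0) = 0.
        entropy = -k * sum(p * math.log(p) for p in probs if p > 0)
        # Clamp to guard against tiny negative values from rounding.
        divergences.append(max(0.0, 1.0 - entropy))

    d_sum = sum(divergences)
    if d_sum == 0:
        return [1.0 / m] * m  # no column is informative: equal weights
    return [d / d_sum for d in divergences]


# Hypothetical toy data: columns stand in for MW, XlogP, TPSA.
toy = [
    [100.0, 1.0, 5.0],
    [200.0, 1.0, 5.1],
    [300.0, 1.0, 50.0],
]
weights = entropy_weights(toy)
```

In this toy input the second column is constant, so it receives no weight, while the unevenly spread third column receives more weight than the evenly spread first one. A property with a low weight would be a candidate for exclusion in a screening step of this kind.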