EBOLApred: A machine learning-based web application for predicting cell entry inhibitors of the Ebola virus. (December 2022)
- Record Type:
- Journal Article
- Title:
- EBOLApred: A machine learning-based web application for predicting cell entry inhibitors of the Ebola virus. (December 2022)
- Main Title:
- EBOLApred: A machine learning-based web application for predicting cell entry inhibitors of the Ebola virus
- Authors:
- Adams, Joseph
Agyenkwa-Mawuli, Kwasi
Agyapong, Odame
Wilson, Michael D.
Kwofie, Samuel K.
- Abstract:
- Abstract: Ebola virus disease (EVD) is a highly virulent and often lethal illness that affects humans through contact with the body fluids of infected persons. Glycoprotein (GP) and matrix protein VP40 play essential roles in the virus life cycle within the host. Whereas GP mediates the entry and fusion of the virus with the host cell membrane, VP40 is responsible for viral particle assembly and budding. This study aimed to develop machine learning models to predict small molecules as possible anti-Ebola virus compounds capable of inhibiting the activities of GP and VP40, using Ebola virus (EBOV) cell entry inhibitors from the PubChem database as training data. Predictive models were developed using five algorithms: random forest (RF), support vector machine (SVM), naïve Bayes (NB), k-nearest neighbor (kNN), and logistic regression (LR). The models were evaluated using 10-fold cross-validation; the best-performing algorithm was random forest, with an accuracy of 89%, an F1 score of 0.9, and an area under the receiver operating characteristic (ROC) curve (AUC) of 0.95. The LR and SVM models also performed reasonably well, with overall accuracy values of 0.84 and 0.86, respectively. The RF, LR, and SVM models were deployed as a web server known as EBOLApred, accessible via http://197.255.126.13:8000/ .
Highlights: Developed five robust machine learning algorithms to predict Ebola virus cell entry inhibitors. Random forest emerged as the most efficient for predicting anti-Ebola compounds. The novel machine learning models have been deployed as a web server known as EBOLApred. The algorithms developed augment the search for novel therapeutic agents for Ebola virus disease.
- Is Part Of:
- Computational biology and chemistry. Volume 101(2022)
- Journal:
- Computational biology and chemistry
- Issue:
- Volume 101(2022)
- Issue Display:
- Volume 101, Issue 2022 (2022)
- Year:
- 2022
- Volume:
- 101
- Issue:
- 2022
- Issue Sort Value:
- 2022-0101-2022-0000
- Page Start:
- Page End:
- Publication Date:
- 2022-12
- Subjects:
- EVD Ebola virus disease -- EBOV Ebola virus -- GP glycoprotein -- VP40 Ebola virus protein 40 -- AID assay identifier -- ML machine learning -- RF random forest -- SVM support vector machine -- NB naïve Bayes -- kNN k-nearest neighbor -- LR logistic regression -- RFE recursive feature elimination -- TP true positives -- FP false positives -- TN true negatives -- FN false negatives -- ROC receiver operating characteristic -- AUC area under the curve -- AD applicability domain
Ebola virus protein -- Machine learning -- Inhibitors -- Support vector machine -- Random forest -- Logistic regression
Chemistry -- Data processing -- Periodicals
Biology -- Data processing -- Periodicals
Biochemistry -- Data processing
Biology -- Data processing
Molecular biology -- Data processing
Periodicals
Electronic journals
542.85
- Journal URLs:
- http://www.sciencedirect.com/science/journal/14769271
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.compbiolchem.2022.107766
- Languages:
- English
- ISSNs:
- 1476-9271
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3390.576700
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 24382.xml
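
The evaluation protocol named in the abstract (10-fold cross-validation scored by accuracy and F1, computed from TP, FP, TN and FN counts) can be illustrated with a minimal sketch. It is not the authors' code: the data are synthetic two-dimensional points standing in for molecular descriptors, only a simple kNN classifier (one of the five algorithm families listed) is shown, all function names are hypothetical, and pooling predictions across folds before scoring is an assumption, since the record does not say how fold results were combined.

```python
import random
from collections import Counter


def k_fold_indices(n, k=10, seed=0):
    """Shuffle indices 0..n-1 and split them into k near-equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def knn_predict(train_X, train_y, x, k=5):
    """Majority vote among the k nearest training points (squared Euclidean)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def confusion_counts(y_true, y_pred):
    """Return (TP, FP, TN, FN), treating label 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def accuracy(tp, fp, tn, fn):
    return (tp + tn) / (tp + fp + tn + fn)


def f1_score(tp, fp, tn, fn):
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def cross_validate(X, y, k_folds=10, seed=0):
    """Hold out each fold once; pool all predictions, then score them."""
    y_true, y_pred = [], []
    for held_out in k_fold_indices(len(X), k_folds, seed):
        held = set(held_out)
        train_X = [X[i] for i in range(len(X)) if i not in held]
        train_y = [y[i] for i in range(len(X)) if i not in held]
        for i in held_out:
            y_true.append(y[i])
            y_pred.append(knn_predict(train_X, train_y, X[i]))
    counts = confusion_counts(y_true, y_pred)
    return accuracy(*counts), f1_score(*counts)


def make_toy_data(n=200, seed=1):
    """Synthetic stand-in: two Gaussian clusters labelled 0 and 1."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        center = 2.0 if label else -2.0
        X.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    acc, f1 = cross_validate(X, y)
    print(f"accuracy={acc:.2f} F1={f1:.2f}")
```

Every sample is predicted exactly once, by a model that never saw it during training, which is what makes the cross-validated scores an estimate of performance on unseen compounds.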