Cytokine gene variants and socio-demographic characteristics as predictors of cervical cancer: A machine learning approach. (July 2021)
- Record Type:
- Journal Article
- Title:
- Cytokine gene variants and socio-demographic characteristics as predictors of cervical cancer: A machine learning approach. (July 2021)
- Main Title:
- Cytokine gene variants and socio-demographic characteristics as predictors of cervical cancer: A machine learning approach
- Authors:
- Kaushik, Manoj
Chandra Joshi, Rakesh
Kushwah, Atar Singh
Gupta, Maneesh Kumar
Banerjee, Monisha
Burget, Radim
Dutta, Malay Kishore
- Abstract:
- Cervical cancer is still one of the most prevalent cancers in women and a significant cause of mortality. Cytokine gene variants and socio-demographic characteristics have been reported as biomarkers for determining the cervical cancer risk in the Indian population. This study was designed to apply a machine learning-based model using these risk factors for better prognosis and prediction of cervical cancer.
This study includes the dataset of cytokine gene variants, clinical and socio-demographic characteristics of normal healthy control subjects, and cervical cancer cases. Different risk factors, including demographic details and cytokine gene variants, were analysed using different machine learning approaches. Various statistical parameters were used for evaluating the proposed method. After multi-step data processing and random splitting of the dataset, machine learning methods were applied and evaluated with 5-fold cross-validation and also tested on the unseen data records of a collected dataset for proper evaluation and analysis. The proposed approaches were verified after analysing various performance metrics. The logistic regression technique achieved the highest average accuracy of 82.25% and the highest average F1-score of 82.58% among all the methods. Ridge classifiers and the Gaussian Naïve Bayes classifier achieved the highest sensitivity, 85%. The ridge classifier surpasses most of the machine learning classifiers with 84.78% accuracy and 97.83% sensitivity. The risk factors analysed in this study can be taken as biomarkers in developing a cervical cancer diagnosis system. The outcomes demonstrate that the machine learning assisted analysis of cytokine gene variants and socio-demographic characteristics can be utilised effectively for predicting the risk of developing cervical cancer.
Highlights:
A machine learning-based approach is proposed in this paper for the risk prediction of cervical cancer.
Different risk factors adopting demographic details and cytokine genes were analysed on collected dataset.
Various statistical parameters are used for the evaluation on different machine-learning approaches.
The proposed approach performs well on 5-fold cross-validation and testing in unseen data records.
The risk factor analysed in this study can be taken as a biomarker in developing a cervical cancer diagnosis system.
- Is Part Of:
- Computers in biology and medicine. Volume 134(2021)
- Journal:
- Computers in biology and medicine
- Issue:
- Volume 134(2021)
- Issue Display:
- Volume 134, Issue 2021 (2021)
- Year:
- 2021
- Volume:
- 134
- Issue:
- 2021
- Issue Sort Value:
- 2021-0134-2021-0000
- Page Start:
- Page End:
- Publication Date:
- 2021-07
- Subjects:
- Artificial intelligence -- Bioinformatics -- Cervical cancer -- Computational biology -- Cytokine gene polymorphisms -- Machine learning
Medicine -- Data processing -- Periodicals
Biology -- Data processing -- Periodicals
610.285
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00104825/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.compbiomed.2021.104559
- Languages:
- English
- ISSNs:
- 0010-4825
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.880000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 17435.xml