A nine-gene signature identification and prognostic risk prediction for patients with lung adenocarcinoma using novel machine learning approach. (June 2022)
- Record Type:
- Journal Article
- Title:
- A nine-gene signature identification and prognostic risk prediction for patients with lung adenocarcinoma using novel machine learning approach. (June 2022)
- Main Title:
- A nine-gene signature identification and prognostic risk prediction for patients with lung adenocarcinoma using novel machine learning approach
- Authors:
- Dessie, Eskezeia Yihunie
Chang, Jan-Gowth
Chang, Ya-Sian
- Abstract:
- Background: Lung adenocarcinoma (LUAD) is one of the most prevalent cancers, with high mortality, and its risk stratification is limited by a lack of reliable molecular biomarkers. Although several studies have sought to identify gene signatures involved in LUAD progression, most current methods for selecting gene features do not fully account for strong pairwise gene correlations, which leads to inconsistent gene selection. It is therefore crucial to develop a new strategy to identify reliable gene signatures that improve risk prediction.
Methods and results: In this study, a novel feature selection strategy was constructed in four steps: (1) a univariate Cox regression model to select survival-associated genes; (2) ridge Cox regression integrated with an adaptive Lasso model to identify informative survival-associated genes; (3) a stepwise Cox regression model to identify the optimal gene signature; and (4) a prognostic risk predictive model for LUAD (PRPML). The PRPML was developed based on four machine learning (ML) methods: logistic regression (LR), K-nearest neighbors (KNN), support vector machine with a radial kernel (SVMR), and average neural network (Avnet). The PRPML model successfully stratified patients with LUAD into high-risk and low-risk groups in three datasets. The PRPML achieved an area under the curve (AUC) of 0.812 and 0.863 in the validation datasets. Finally, a signature of nine potential genes was found and showed great potential for risk prediction.
Conclusions: Our study demonstrates that the developed strategy identified a signature of nine potential genes with accurate risk prediction performance, and that this signature could provide valuable clues to the molecular mechanism of LUAD.
Highlights: A novel feature selection strategy is used to identify survival-related genes. A machine learning-based risk predictive model is proposed. A signature of nine potential genes is selected by the new feature selection strategy. The proposed risk model, PRPML, shows better risk prediction performance.
- Is Part Of:
- Computers in biology and medicine. Volume 145(2022)
- Journal:
- Computers in biology and medicine
- Issue:
- Volume 145(2022)
- Issue Display:
- Volume 145, Issue 2022 (2022)
- Year:
- 2022
- Volume:
- 145
- Issue:
- 2022
- Issue Sort Value:
- 2022-0145-2022-0000
- Page Start:
- Page End:
- Publication Date:
- 2022-06
- Subjects:
- Integrated analysis -- Feature selection strategy -- Gene expression profile -- Risk stratification method
Medicine -- Data processing -- Periodicals
Biology -- Data processing -- Periodicals
610.285
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00104825/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.compbiomed.2022.105493
- Languages:
- English
- ISSNs:
- 0010-4825
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.880000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 21569.xml
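
The abstract reports that the PRPML combines four classifiers, splits patients into high-risk and low-risk groups, and is evaluated by AUC. The sketch below illustrates those three ideas on made-up numbers. The abstract does not say how the four models are combined or which cut-off is used, so the simple averaging, the 0.5 threshold, the function names, and all probabilities and labels here are assumptions for illustration only, not the authors' implementation or data.

```python
from statistics import mean


def ensemble_risk(prob_by_model):
    """Average each patient's high-risk probability across models.

    Assumption: plain averaging; the abstract does not state the combination rule.
    """
    return [mean(per_patient) for per_patient in zip(*prob_by_model.values())]


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair (Mann-Whitney formulation).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratify(scores, threshold=0.5):
    """Label each patient 'high' or 'low' risk (threshold is hypothetical)."""
    return ["high" if s >= threshold else "low" for s in scores]


# Hypothetical predicted probabilities for six patients from the four model types.
probs = {
    "LR":    [0.9, 0.8, 0.4, 0.3, 0.6, 0.1],
    "KNN":   [0.8, 0.6, 0.6, 0.2, 0.4, 0.2],
    "SVMR":  [0.7, 0.9, 0.5, 0.4, 0.5, 0.3],
    "Avnet": [0.8, 0.7, 0.3, 0.3, 0.7, 0.2],
}
labels = [1, 1, 1, 0, 0, 0]  # 1 = observed high-risk outcome (made up)

scores = ensemble_risk(probs)
groups = stratify(scores)
print(groups)
print(round(auc(scores, labels), 3))
```

The pairwise AUC is quadratic in the number of patients, which is fine for a toy example; a rank-based computation would be the usual choice for real cohorts.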