Lung cancer prediction using multi-gene genetic programming by selecting automatic features from amino acid sequences. (June 2022)
- Record Type:
- Journal Article
- Title:
- Lung cancer prediction using multi-gene genetic programming by selecting automatic features from amino acid sequences. (June 2022)
- Main Title:
- Lung cancer prediction using multi-gene genetic programming by selecting automatic features from amino acid sequences
- Authors:
- Sattar, Mohsin
Majid, Abdul
Kausar, Nabeela
Bilal, Muhammad
Kashif, Muhammad
- Abstract:
- Lung cancer is one of the leading causes of cancer-related deaths. Early diagnosis of lung cancer using automatic feature selection from a large number of features is a challenging task. Conventionally, cancer diagnosis approaches use physical features that appear in later stages, after harmful effects have already occurred due to abnormal somatic mutations. In order to extract useful novel patterns to efficiently predict cancer at early stages, we analyzed lung cancer-related mutated genes that reveal useful information in protein amino acid sequences. For this, we developed a new evolutionary learning technique with a biologically inspired multi-gene genetic programming algorithm using discriminant information of protein amino acids. The proposed model efficiently selects 23 discriminant features out of 1500 features. Then it combines the selected features and related primitive functions optimally for prediction of lung cancer. Hence, an efficient predictive model is constructed that helps in understanding the complex heterogeneous nature of lung cancer. The proposed system achieved area under ROC curve and accuracy values of 98.79% and 95.67%, respectively, outperforming related lung cancer prediction approaches.
Highlights: Application of machine learning in health care for lung cancer prediction. Classification of cancer vs. non-cancer protein amino acid sequences using supervised learning. Transformation of protein amino acid sequences into various feature spaces. Automatic feature selection with bio-inspired multi-gene genetic programming. …
- Is Part Of:
- Computational biology and chemistry. Volume 98(2022)
- Journal:
- Computational biology and chemistry
- Issue:
- Volume 98(2022)
- Issue Display:
- Volume 98, Issue 2022 (2022)
- Year:
- 2022
- Volume:
- 98
- Issue:
- 2022
- Issue Sort Value:
- 2022-0098-2022-0000
- Page Start:
- Page End:
- Publication Date:
- 2022-06
- Subjects:
- Lung cancer prediction -- Multi-gene genetic programming -- Protein amino acid sequence -- Gene mutation -- Somatic mutation
Chemistry -- Data processing -- Periodicals
Biology -- Data processing -- Periodicals
Biochemistry -- Data processing
Biology -- Data processing
Molecular biology -- Data processing
Periodicals
Electronic journals
542.85
- Journal URLs:
- http://www.sciencedirect.com/science/journal/14769271
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.compbiolchem.2022.107638
- Languages:
- English
- ISSNs:
- 1476-9271
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3390.576700
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 21597.xml