PPred-PCKSM: A multi-layer predictor for identifying promoter and its variants using position based features. (April 2022)
- Record Type:
- Journal Article
- Title:
- PPred-PCKSM: A multi-layer predictor for identifying promoter and its variants using position based features. (April 2022)
- Main Title:
- PPred-PCKSM: A multi-layer predictor for identifying promoter and its variants using position based features
- Authors:
- Bhukya, Raju
Kumari, Archana
Amilpur, Santhosh
Dasari, Chandra Mohan
- Abstract:
- A promoter is a small region of DNA where the protein RNA polymerase binds, initiating transcription of a specific gene. In bacteria, which are prokaryotes, the sigma subunit that combines with RNA polymerase helps in identifying promoters. In Escherichia coli (E. coli), promoters are recognized by different sigma factors with different functions. Various methods have been used to predict the different classes of promoters. However, these methods need to be improved for better identification and classification of promoters. In this work, we propose a new multi-layer predictor named PPred-PCKSM that uses a position-correlation based k-mer scoring matrix (PCKSM), a new feature extraction strategy, and an artificial neural network (ANN) to predict promoters and their six types, namely σ70, σ24, σ28, σ32, σ38 and σ54, in E. coli. We employ the PCKSM technique to extract feature sets from different k-mers. The feature sets obtained from trimers and tetramers are concatenated and then passed through the ANN for final prediction. The resulting feature set contained effective features that contributed towards achieving an accuracy of 98.02% and a Matthews correlation coefficient (MCC) of 96.04% for the promoter prediction task. Our model used 5-fold cross-validation on the benchmark dataset and outperformed all the current state-of-the-art methods used for prediction of promoters and their different types in E. coli.
Highlights:
Proposed a two-layer model for prediction of promoters and their variants.
An effective feature extraction technique, PCKSM, is applied to different k-mers.
The concatenation of trimer and tetramer features yields more effective features.
PPred-PCKSM outperforms all the state-of-the-art models trained on the E. coli dataset.
The model achieved an accuracy of 98.02% and an MCC of 96.04% for the promoter prediction task.
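The two figures quoted in the abstract, accuracy and Matthews correlation coefficient, can both be computed from a binary confusion matrix. The Python sketch below is illustrative only and is not taken from the article: the function names and the example counts are hypothetical, chosen so that the output reproduces the reported pair of values. It also shows that when the two error counts are equal and the two correct counts are equal, MCC reduces to 2 × accuracy − 1, which is consistent with 98.02% and 96.04%. Whether the benchmark dataset is actually balanced in this way is not stated in this record.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary confusion matrix.

    Ranges from -1 (total disagreement) to +1 (perfect prediction).
    Returns 0.0 when the denominator is zero, a common convention.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical counts (NOT from the article): a symmetric confusion
    # matrix with 5000 promoters and 5000 non-promoters.
    tp, tn, fp, fn = 4901, 4901, 99, 99
    print(f"accuracy = {accuracy(tp, tn, fp, fn):.4f}")
    print(f"MCC      = {mcc(tp, tn, fp, fn):.4f}")
```

The abstract reports MCC as a percentage, so a value of 96.04% corresponds to 0.9604 on the usual scale from -1 to 1.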
- Is Part Of:
- Computational biology and chemistry. Volume 97(2022)
- Journal:
- Computational biology and chemistry
- Issue:
- Volume 97(2022)
- Issue Display:
- Volume 97, Issue 2022 (2022)
- Year:
- 2022
- Volume:
- 97
- Issue:
- 2022
- Issue Sort Value:
- 2022-0097-2022-0000
- Page Start:
- Page End:
- Publication Date:
- 2022-04
- Subjects:
- Promoters -- Sigma factors -- Artificial neural network -- Position-correlation based k-mer scoring matrix (PCKSM) -- Machine learning -- Gene regulation
Chemistry -- Data processing -- Periodicals
Biology -- Data processing -- Periodicals
Biochemistry -- Data processing
Biology -- Data processing
Molecular biology -- Data processing
Periodicals
Electronic journals
542.85
- Journal URLs:
- http://www.sciencedirect.com/science/journal/14769271
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.compbiolchem.2022.107623
- Languages:
- English
- ISSNs:
- 1476-9271
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3390.576700
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 20979.xml