Prediction of G-protein coupled receptors and their subfamilies by incorporating various sequence features into Chou's general PseAAC. Issue 134 (October 2016)
- Record Type:
- Journal Article
- Title:
- Prediction of G-protein coupled receptors and their subfamilies by incorporating various sequence features into Chou's general PseAAC. Issue 134 (October 2016)
- Main Title:
- Prediction of G-protein coupled receptors and their subfamilies by incorporating various sequence features into Chou's general PseAAC
- Authors:
- Tiwari, Arvind Kumar
- Abstract:
- Highlights: The protein samples are represented by eight sequence-derived properties: amino acid composition, dipeptide composition, correlation features, composition, transition, distribution, sequence order descriptors and pseudo amino acid composition, giving a total of 1497 sequence-derived features. A weighted k-nearest neighbor classifier is proposed for use with the union of the best 50 features selected by Fisher score based feature selection, ReliefF, FCBF, MRMR and SVM-RFE, to exploit the advantages of these feature selection methods.
Abstract: Background and objective: G-protein coupled receptors form the largest superfamily of membrane proteins and are important targets for drug design. They are responsible for many physiochemical processes such as smell, taste, vision, neurotransmission, metabolism, cellular growth and immune response. It is therefore necessary to design a robust and efficient approach for the prediction of G-protein coupled receptors and their subfamilies.
Methods: In this paper, the protein samples are represented by amino acid composition, dipeptide composition, correlation features, composition, transition, distribution, sequence order descriptors and pseudo amino acid composition, giving a total of 1497 sequence-derived features. To classify G-protein coupled receptors and their subfamilies efficiently, we propose a weighted k-nearest neighbor classifier used with the union of the best 50 features selected by Fisher score based feature selection, ReliefF, fast correlation based filter, minimum redundancy maximum relevance, and support vector machine based recursive feature elimination, to exploit the advantages of these feature selection methods.
Results: Using 10-fold cross-validation, the proposed method achieved overall accuracies of 99.9%, 98.3% and 95.4%, MCC values of 1.00, 0.98 and 0.95, ROC area values of 1.00, 0.998 and 0.996, and precisions of 99.9%, 98.3% and 95.5% for predicting G-protein coupled receptors versus non-G-protein coupled receptors, subfamilies of G-protein coupled receptors, and subfamilies of class A G-protein coupled receptors, respectively.
Conclusions: The high accuracy, MCC, ROC area and precision values indicate that the proposed method is well suited to the prediction of G-protein coupled receptor families and their subfamilies.
- Is Part Of:
- Computer methods and programs in biomedicine. Issue 134(2016)
- Journal:
- Computer methods and programs in biomedicine
- Issue:
- Issue 134(2016)
- Issue Display:
- Volume 134, Issue 134 (2016)
- Year:
- 2016
- Volume:
- 134
- Issue:
- 134
- Issue Sort Value:
- 2016-0134-0134-0000
- Page Start:
- 197
- Page End:
- 213
- Publication Date:
- 2016-10
- Subjects:
- G-protein coupled receptors -- Weighted k-nearest neighbor -- Minimum redundancy maximum relevance -- Sequence derived properties -- Matthews correlation coefficient
Medicine -- Computer programs -- Periodicals
Biology -- Computer programs -- Periodicals
Computers -- Periodicals
Medicine -- Periodicals
Médecine -- Logiciels -- Périodiques
Biologie -- Logiciels -- Périodiques
Biology -- Computer programs
Medicine -- Computer programs
Periodicals
Electronic journals
610.28
- Journal URLs:
- http://www.sciencedirect.com/science/journal/01692607
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.cmpb.2016.07.004
- Languages:
- English
- ISSNs:
- 0169-2607
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.095000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 406.xml
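
The pipeline summarised in the abstract of this record (sequence-derived features, a union of top-ranked features, a weighted k-nearest neighbor classifier) can be illustrated with a minimal Python sketch. It is not the author's implementation. It covers only two of the eight feature groups (amino acid and dipeptide composition) and only the Fisher score among the five selectors. It assumes that "union of best 50 features" means taking the top-k features from each selector and merging them, and that neighbors are weighted by inverse distance; the abstract does not state the weighting scheme. All function names are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import product
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(seq):
    """Fraction of each standard residue in seq (20 features)."""
    counts = Counter(seq)
    n = max(len(seq), 1)  # guard against an empty sequence
    return [counts[a] / n for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Fraction of each ordered residue pair in seq (400 features)."""
    pairs = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = max(len(seq) - 1, 1)
    return [pairs[a + b] / total for a, b in product(AMINO_ACIDS, repeat=2)]


def fisher_scores(X, y, eps=1e-12):
    """Fisher score per feature: between-class over within-class scatter."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    scores = []
    for j in range(len(X[0])):
        mean_all = sum(row[j] for row in X) / len(X)
        between = within = 0.0
        for rows in by_class.values():
            col = [r[j] for r in rows]
            mu = sum(col) / len(col)
            between += len(col) * (mu - mean_all) ** 2
            within += sum((v - mu) ** 2 for v in col)
        scores.append(between / (within + eps))
    return scores


def union_of_top_k(score_lists, k):
    """Merge the k best feature indices from each scoring method.

    Assumption: one score list per selector; only the Fisher score is
    implemented here, the other selectors would supply their own lists.
    """
    selected = set()
    for scores in score_lists:
        ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
        selected.update(ranked[:k])
    return sorted(selected)


def weighted_knn_predict(train_X, train_y, x, k=3, eps=1e-9):
    """k-NN where each neighbor votes with weight 1 / (distance + eps).

    Assumption: inverse-distance weighting; the source does not specify it.
    """
    nearest = sorted(
        (math.dist(row, x), label) for row, label in zip(train_X, train_y)
    )[:k]
    votes = defaultdict(float)
    for d, label in nearest:
        votes[label] += 1.0 / (d + eps)
    return max(votes, key=votes.get)
```

In use, each sequence would be turned into a feature vector (for example `amino_acid_composition(s) + dipeptide_composition(s)`), the columns returned by `union_of_top_k` would be kept, and `weighted_knn_predict` would be evaluated under cross-validation. With inverse-distance weighting, a single very close neighbor can outvote two more distant ones, which is the practical difference from plain majority-vote k-NN.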