Computational Technique for an Efficient Classification of Protein Sequences With Distance‐Based Sequence Encoding Algorithm. (21st September 2015)
- Record Type:
- Journal Article
- Main Title:
- Computational Technique for an Efficient Classification of Protein Sequences With Distance‐Based Sequence Encoding Algorithm
- Authors:
- Iqbal, Muhammad Javed
Faye, Ibrahima
Said, Abas MD
Samir, Brahim Belhaouari
- Abstract:
- Machine learning is being applied in bioinformatics and computational biology to solve challenging problems that have emerged in the analysis and modeling of biological data such as DNA, RNA, and protein sequences. The major problems in classifying protein sequences into existing families/superfamilies are the following: the selection of a suitable sequence encoding method, the extraction of an optimized subset of features that possesses significant discriminatory information, and the adaptation of an appropriate learning algorithm that classifies protein sequences with higher classification accuracy. The accurate classification of protein sequences would be helpful in determining the structure and function of novel protein sequences. In this article, we propose a distance‐based sequence encoding algorithm that captures the sequence's statistical characteristics along with amino acid sequence order information. A statistical metric‐based feature selection algorithm is then adopted to identify a reduced set of features that represents the original feature space. The performance of the proposed technique is validated using some of the best performing classifiers previously implemented for protein sequence classification. An average classification accuracy of 92% was achieved on the yeast protein sequence data set downloaded from the benchmark UniProtKB database.
- Is Part Of:
- Computational intelligence. Volume 33:Number 1(2017)
- Journal:
- Computational intelligence
- Issue:
- Volume 33:Number 1(2017)
- Issue Display:
- Volume 33, Issue 1 (2017)
- Year:
- 2017
- Volume:
- 33
- Issue:
- 1
- Issue Sort Value:
- 2017-0033-0001-0000
- Page Start:
- 32
- Page End:
- 55
- Publication Date:
- 2015-09-21
- Subjects:
- sequence encoding, biological data mining, feature selection, superfamily, protein classification algorithm
Artificial intelligence -- Periodicals
Computational linguistics -- Periodicals
006.3
- Journal URLs:
- http://www.blackwellpublishing.com/journal.asp?ref=0824-7935&site=1
http://onlinelibrary.wiley.com/
- DOI:
- 10.1111/coin.12069
- Languages:
- English
- ISSNs:
- 0824-7935
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3390.595000
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 51.xml