COMSAT: Residue contact prediction of transmembrane proteins based on support vector machines and mixed integer linear programming. Issue 3 (20th January 2016)
- Record Type:
- Journal Article
- Title:
- COMSAT: Residue contact prediction of transmembrane proteins based on support vector machines and mixed integer linear programming. Issue 3 (20th January 2016)
- Main Title:
- COMSAT: Residue contact prediction of transmembrane proteins based on support vector machines and mixed integer linear programming
- Authors:
- Zhang, Huiling
Huang, Qingsheng
Bei, Zhendong
Wei, Yanjie
Floudas, Christodoulos A.
- Abstract:
- In this article, we present COMSAT, a hybrid framework for residue contact prediction of transmembrane (TM) proteins, integrating a support vector machine (SVM) method and a mixed integer linear programming (MILP) method. COMSAT consists of two modules: COMSAT_SVM which is trained mainly on position–specific scoring matrix features, and COMSAT_MILP which is an ab initio method based on optimization models. Contacts predicted by the SVM model are ranked by SVM confidence scores, and a threshold is trained to improve the reliability of the predicted contacts. For TM proteins with no contacts above the threshold, COMSAT_MILP is used. The proposed hybrid contact prediction scheme was tested on two independent TM protein sets based on the contact definition of 14 Å between Cα‐Cα atoms. First, using a rigorous leave‐one‐protein‐out cross validation on the training set of 90 TM proteins, an accuracy of 66.8%, a coverage of 12.3%, a specificity of 99.3% and a Matthews' correlation coefficient (MCC) of 0.184 were obtained for residue pairs that are at least six amino acids apart. Second, when tested on a test set of 87 TM proteins, the proposed method showed a prediction accuracy of 64.5%, a coverage of 5.3%, a specificity of 99.4% and a MCC of 0.106. COMSAT shows satisfactory results when compared with 12 other state‐of‐the‐art predictors, and is more robust in terms of prediction accuracy as the length and complexity of TM protein increase. COMSAT is freely accessible at http://hpcc.siat.ac.cn/COMSAT/. Proteins 2016; 84:332–348. © 2016 Wiley Periodicals, Inc.
- Is Part Of:
- Proteins. Volume 84:Issue 3(2016)
- Journal:
- Proteins
- Issue:
- Volume 84:Issue 3(2016)
- Issue Display:
- Volume 84, Issue 3 (2016)
- Year:
- 2016
- Volume:
- 84
- Issue:
- 3
- Issue Sort Value:
- 2016-0084-0003-0000
- Page Start:
- 332
- Page End:
- 348
- Publication Date:
- 2016-01-20
- Subjects:
- protein structure prediction -- hybrid framework -- machine learning -- ab initio prediction -- MILP
Proteins -- Periodicals
572.6
- Journal URLs:
- http://onlinelibrary.wiley.com/
- DOI:
- 10.1002/prot.24979
- Languages:
- English
- ISSNs:
- 0887-3585
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 6936.164000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 1113.xml
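
Note on the evaluation measures named in the abstract: the sketch below shows one common way to derive a contact map from Cα coordinates and to score a prediction against it. It is not taken from the article. The function names, the strict "less than 14 Å" comparison, and the metric formulas (accuracy as TP/(TP+FP), coverage as TP/(TP+FN)) are assumptions based on usual practice in contact prediction; only the 14 Å Cα‐Cα cutoff and the minimum separation of six residues come from the abstract.

```python
import math

def native_contacts(ca_coords, cutoff=14.0, min_sep=6):
    """Residue pairs (i, j) whose C-alpha atoms are closer than `cutoff` angstroms.

    Only pairs at least `min_sep` positions apart in sequence are considered.
    Using a strict '<' comparison is an assumption, not taken from the article.
    """
    contacts = set()
    n = len(ca_coords)
    for i in range(n):
        for j in range(i + min_sep, n):
            if math.dist(ca_coords[i], ca_coords[j]) < cutoff:
                contacts.add((i, j))
    return contacts

def score_prediction(predicted, native, n_residues, min_sep=6):
    """Accuracy, coverage, specificity and MCC over all eligible residue pairs.

    The formulas are the conventional ones (assumed here):
    accuracy = TP/(TP+FP), coverage = TP/(TP+FN), specificity = TN/(TN+FP).
    """
    # Number of pairs (i, j) with j - i >= min_sep.
    total = sum(max(0, n_residues - min_sep - i) for i in range(n_residues))
    tp = len(predicted & native)
    fp = len(predicted - native)
    fn = len(native - predicted)
    tn = total - tp - fp - fn
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": tp / (tp + fp) if tp + fp else 0.0,
        "coverage": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }

# Toy example: eight residues on a straight line, 2.1 angstroms apart (not real data).
coords = [(2.1 * i, 0.0, 0.0) for i in range(8)]
native = native_contacts(coords)
metrics = score_prediction({(0, 6), (0, 7)}, native, n_residues=8)
```

In the toy example only three residue pairs are eligible, so the values carry no meaning beyond showing how the four quantities relate to the counts of true and false positives and negatives.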