SE-BLTCNN: A channel attention adapted deep learning model based on PSSM for membrane protein classification. (June 2022)
- Record Type:
- Journal Article
- Title:
- SE-BLTCNN: A channel attention adapted deep learning model based on PSSM for membrane protein classification. (June 2022)
- Main Title:
- SE-BLTCNN: A channel attention adapted deep learning model based on PSSM for membrane protein classification
- Authors:
- He, Yu
Wang, Shunfang
- Abstract:
- Membrane protein classification is key to inferring the function of uncharacterized membrane proteins. To get around time-consuming and expensive wet-lab biochemical experiments, much research has focused on developing fast and reliable bioinformatics or computational modeling methods for membrane protein prediction. However, most research incorporates as many types of protein data as possible, yet in many cases the number of accessible protein data types is quite limited. To address this challenge, a channel attention adapted deep learning model that takes the position-specific scoring matrix (PSSM) as its only input, along with a simplified version without channel attention, have been developed. They are named SE-BLTCNN and BLTCNN, respectively (abbreviations for "SE embedded BiLSTM-TextCNN" and "plain BiLSTM-TextCNN"). The basic idea is to embed the Squeeze-and-Excitation (SE) block into the architecture of the text convolutional neural network (TextCNN) and combine the resulting architecture with a bidirectional long short-term memory (BiLSTM) layer. An ablation experiment is also conducted to verify the effectiveness of using BiLSTM to extract high-level features from PSSM. On the benchmark sample set, the BLTCNN achieves an average precision as high as 96.2% and, to the best of our knowledge, is the state-of-the-art membrane protein type predictor based solely on PSSM data; the SE-BLTCNN is second-best among all comparison methods, at a slightly lower average precision of 95.7%. In addition, through empirical research, excessive zero-padding in the training examples has been pinpointed as the major cause of the SE-BLTCNN's performance loss; and by using the adjusted sample set, it has been confirmed that the SE-BLTCNN model outperforms the BLTCNN once this major cause is suppressed. The core code and dataset are available at https://github.com/Raymond-2017/membrane-protein-classifiction .
- Graphical Abstract:
- Highlights:
- Only take PSSM as the input data but achieve a high success rate.
Use the attention mechanism to model the interdependencies between the channels.
Subtly integrate the BiLSTM layer and the attention adjusted TextCNN.
Excessive zero-padding in the training examples is pinpointed as the obstacle to the SE block optimization.
Verify the effectiveness of the features extracted from PSSM using BiLSTM.
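The architecture described in the abstract (a BiLSTM over the PSSM rows, TextCNN-style convolutions, and an SE block reweighting the convolutional channels) can be sketched roughly as below. This is a minimal illustrative sketch in PyTorch, not the authors' implementation: the hidden sizes, filter counts, kernel sizes, and the number of output classes are all assumptions, and the only detail taken from the source is the overall SE + BiLSTM + TextCNN composition with a 20-column PSSM input.

```python
# Hypothetical sketch of the SE-BLTCNN idea: BiLSTM over PSSM rows,
# multi-kernel 1-D convolutions (TextCNN), and a Squeeze-and-Excitation
# block that learns per-channel attention weights. All sizes are guesses.
import torch
import torch.nn as nn


class SEBlock(nn.Module):
    """Squeeze-and-Excitation: global-average-pool each channel, then
    learn a per-channel weight in (0, 1) to rescale the feature maps."""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):               # x: (batch, channels, length)
        s = x.mean(dim=2)               # squeeze -> (batch, channels)
        w = self.fc(s).unsqueeze(2)     # excitation -> (batch, channels, 1)
        return x * w                    # channel-wise reweighting


class SEBLTCNN(nn.Module):
    def __init__(self, n_classes=8, hidden=64, n_filters=100,
                 kernel_sizes=(3, 5, 7)):
        super().__init__()
        # Each PSSM row has 20 columns (one score per amino acid type).
        self.bilstm = nn.LSTM(20, hidden, batch_first=True,
                              bidirectional=True)
        self.convs = nn.ModuleList(
            nn.Conv1d(2 * hidden, n_filters, k) for k in kernel_sizes
        )
        self.se = nn.ModuleList(SEBlock(n_filters) for _ in kernel_sizes)
        self.out = nn.Linear(n_filters * len(kernel_sizes), n_classes)

    def forward(self, pssm):            # pssm: (batch, seq_len, 20)
        h, _ = self.bilstm(pssm)        # (batch, seq_len, 2*hidden)
        h = h.transpose(1, 2)           # (batch, 2*hidden, seq_len)
        pooled = []
        for conv, se in zip(self.convs, self.se):
            c = torch.relu(conv(h))     # (batch, n_filters, L')
            c = se(c)                   # apply channel attention
            pooled.append(c.max(dim=2).values)  # global max pool
        return self.out(torch.cat(pooled, dim=1))
```

Dropping the `SEBlock` calls from the forward pass yields the plain BLTCNN variant that the abstract compares against.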
- Is Part Of:
- Computational biology and chemistry. Volume 98(2022)
- Journal:
- Computational biology and chemistry
- Issue:
- Volume 98(2022)
- Issue Display:
- Volume 98, Issue 2022 (2022)
- Year:
- 2022
- Volume:
- 98
- Issue:
- 2022
- Issue Sort Value:
- 2022-0098-2022-0000
- Page Start:
- Page End:
- Publication Date:
- 2022-06
- Subjects:
- Membrane protein classification -- Channel attention -- PSSM -- TextCNN -- BiLSTM
Chemistry -- Data processing -- Periodicals
Biology -- Data processing -- Periodicals
Biochemistry -- Data processing
Biology -- Data processing
Molecular biology -- Data processing
Periodicals
Electronic journals
542.85
- Journal URLs:
- http://www.sciencedirect.com/science/journal/14769271
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.compbiolchem.2022.107680
- Languages:
- English
- ISSNs:
- 1476-9271
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3390.576700
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 21569.xml