How much metagenome data is needed for protein structure prediction: The advantages of targeted approach from the ecological and evolutionary perspectives. Issue 1 (6th March 2022)
- Record Type:
- Journal Article
- Title:
- How much metagenome data is needed for protein structure prediction: The advantages of targeted approach from the ecological and evolutionary perspectives. Issue 1 (6th March 2022)
- Main Title:
- How much metagenome data is needed for protein structure prediction: The advantages of targeted approach from the ecological and evolutionary perspectives
- Authors:
- Yang, Pengshuo
Ning, Kang
- Abstract:
- Abstract: It has been proven that three‐dimensional protein structures could be modeled by supplementing homologous sequences with metagenome sequences. Even though a large volume of metagenome data is utilized for such purposes, a significant proportion of proteins remain unsolved. In this review, we focus on identifying ecological and evolutionary patterns in metagenome data, decoding the complicated relationships of these patterns with protein structures, and investigating how these patterns can be effectively used to improve protein structure prediction. First, we proposed the metagenome utilization efficiency and marginal effect model to quantify the divergent distribution of homologous sequences for the protein family. Second, we proposed that the targeted approach effectively identifies homologous sequences from specified biomes compared with the untargeted approach's blind search. Finally, we determined the lower bound for metagenome data required for predicting all the protein structures in the Pfam database and showed that the present metagenome data is insufficient for this purpose. In summary, we discovered ecological and evolutionary patterns in the metagenome data that may be used to predict protein structures effectively. The targeted approach is promising in terms of effectively extracting homologous sequences and predicting protein structures using these patterns.
Abstract: For protein 3D structure prediction, we mine the data‐dependent ecological and evolutionary trends hidden in metagenome data. Based on this pattern, the targeted approach was presented to predict the protein 3D structure more effectively and accurately than the untargeted approach's blind search.
Highlights: Metagenome benefits for homologous sequence supplement for protein three‐dimensional (3D) structure prediction. Metagenome utilization efficiency shows a divergent distribution of proteins. Marginal effect model also quantifies this divergent distribution of proteins. For mining homologous sequences, the targeted approach outperforms the untargeted approach. Current metagenome data is not enough for modeling 3D structures for all proteins.
- Is Part Of:
- IMeta. Volume 1:Issue 1(2022)
- Journal:
- IMeta
- Issue:
- Volume 1:Issue 1(2022)
- Issue Display:
- Volume 1, Issue 1 (2022)
- Year:
- 2022
- Volume:
- 1
- Issue:
- 1
- Issue Sort Value:
- 2022-0001-0001-0000
- Page Start:
- n/a
- Page End:
- n/a
- Publication Date:
- 2022-03-06
- Subjects:
- ecology -- evolution -- metagenome data -- protein 3D structure modeling -- targeted approach
Metagenomics -- Periodicals
Bioinformatics -- Periodicals
Bioinformatics
Metagenomics
Metagenome
Computational Biology
Periodicals
Periodical
576.5
- Journal URLs:
- https://onlinelibrary.wiley.com/loi/2770596x
http://onlinelibrary.wiley.com/
- DOI:
- 10.1002/imt2.9
- Languages:
- English
- ISSNs:
- 2770-596X
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 21706.xml