Facilitating Machine Learning‐Guided Protein Engineering with Smart Library Design and Massively Parallel Assays. Issue 4 (7th December 2021)
- Record Type:
- Journal Article
- Title:
- Facilitating Machine Learning‐Guided Protein Engineering with Smart Library Design and Massively Parallel Assays. Issue 4 (7th December 2021)
- Main Title:
- Facilitating Machine Learning‐Guided Protein Engineering with Smart Library Design and Massively Parallel Assays
- Authors:
- Chu, Hoi Yee
Wong, Alan S. L.
- Abstract:
- Protein design plays an important role in recent medical advances from antibody therapy to vaccine design. Typically, exhaustive mutational screens or directed evolution experiments are used for the identification of the best design or for improvements to the wild‐type variant. Even with a high‐throughput screening on pooled libraries and Next‐Generation Sequencing to boost the scale of read‐outs, surveying all the variants with combinatorial mutations for their empirical fitness scores is still of magnitudes beyond the capacity of existing experimental settings. To tackle this challenge, in‐silico approaches using machine learning to predict the fitness of novel variants based on a subset of empirical measurements are now employed. These machine learning models turn out to be useful in many cases, with the premise that the experimentally determined fitness scores and the amino‐acid descriptors of the models are informative. The machine learning models can guide the search for the highest fitness variants, resolve complex epistatic relationships, and highlight bio‐physical rules for protein folding. Using machine learning‐guided approaches, researchers can build more focused libraries, thus relieving themselves from labor‐intensive screens and fast‐tracking the optimization process. Here, we describe the current advances in massive‐scale variant screens, and how machine learning and mutagenesis strategies can be integrated to accelerate protein engineering. More specifically, we examine strategies to make screens more economical, informative, and effective in discovery of useful variants.
Integrating protein design and variant screening with machine learning facilitates the discovery of useful variants for applications including antibody therapy and vaccine generation. This perspective describes how the prioritization of variant testing maximizes the power of machine learning for protein engineering, strategies for designing a smart screening library, and considerations for integrating massively parallel assays with machine learning.
- Is Part Of:
- Advanced genetics. Volume 2:Issue 4(2021)
- Journal:
- Advanced genetics
- Issue:
- Volume 2:Issue 4(2021)
- Issue Display:
- Volume 2, Issue 4 (2021)
- Year:
- 2021
- Volume:
- 2
- Issue:
- 4
- Issue Sort Value:
- 2021-0002-0004-0000
- Page Start:
- n/a
- Page End:
- n/a
- Publication Date:
- 2021-12-07
- Subjects:
- combinatorial mutagenesis -- machine learning -- protein engineering -- infologs -- functionality prediction
Genetics -- Periodicals
Genomics -- Periodicals
Genomics
Genetics
Electronic journals
Periodicals
576.5
- Journal URLs:
- https://onlinelibrary.wiley.com/toc/26416573/2020/1/1
http://onlinelibrary.wiley.com/
- DOI:
- 10.1002/ggn2.202100038
- Languages:
- English
- ISSNs:
- 2641-6573
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 20313.xml