Discovering protein-binding RNA motifs with a generative model of RNA sequences. (February 2020)
- Record Type:
- Journal Article
- Title:
- Discovering protein-binding RNA motifs with a generative model of RNA sequences. (February 2020)
- Main Title:
- Discovering protein-binding RNA motifs with a generative model of RNA sequences
- Authors:
- Park, Byungkyu
Han, Kyungsook
- Abstract:
- Graphical abstract: A generative model using an LSTM neural network for constructing protein-binding RNA sequences and for finding protein-binding motifs.
Highlights: LSTM-based model for constructing protein-binding RNA sequences. Computationally deriving protein-binding motifs from generated RNA sequences. The derived motifs are comparable to experimentally validated motifs.
Abstract: Recent advances in high-throughput experimental technologies have generated a huge amount of data on interactions between proteins and nucleic acids. Motivated by this large body of experimental data, several computational methods have been developed either to predict binding sites in a sequence or to determine whether an interaction exists between protein and nucleic acid sequences. However, most of these methods cannot be used to discover new nucleic acid sequences that bind to a target protein because they are classifiers rather than generators. In this paper, we propose a generative model for constructing protein-binding RNA sequences and motifs using a long short-term memory (LSTM) neural network. Testing the model on several target proteins showed that RNA sequences generated by the model have high binding affinity and specificity for their target proteins and that the protein-binding motifs derived from the generated RNA sequences are comparable to the motifs from experimentally validated protein-binding RNA sequences. The results are promising, and we believe this approach will help design more efficient in vitro or in vivo experiments by suggesting potential RNA aptamers for a target protein.
- Is Part Of:
- Computational biology and chemistry. Volume 84(2020)
- Journal:
- Computational biology and chemistry
- Issue:
- Volume 84(2020)
- Issue Display:
- Volume 84, Issue 2020 (2020)
- Year:
- 2020
- Volume:
- 84
- Issue:
- 2020
- Issue Sort Value:
- 2020-0084-2020-0000
- Page Start:
- Page End:
- Publication Date:
- 2020-02
- Subjects:
- Protein-RNA interaction -- Binding motif -- Generator -- Long short-term memory network
Chemistry -- Data processing -- Periodicals
Biology -- Data processing -- Periodicals
Biochemistry -- Data processing
Biology -- Data processing
Molecular biology -- Data processing
Periodicals
Electronic journals
542.85
- Journal URLs:
- http://www.sciencedirect.com/science/journal/14769271
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.compbiolchem.2019.107171
- Languages:
- English
- ISSNs:
- 1476-9271
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3390.576700
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 12624.xml