Collation and data-mining of literature bioactivity data for drug discovery. (21st September 2011)
- Record Type:
- Journal Article
- Title:
- Collation and data-mining of literature bioactivity data for drug discovery. (21st September 2011)
- Main Title:
- Collation and data-mining of literature bioactivity data for drug discovery
- Authors:
- Bellis, Louisa J.
Akhtar, Ruth
Al-Lazikani, Bissan
Atkinson, Francis
Bento, A. Patricia
Chambers, Jon
Davies, Mark
Gaulton, Anna
Hersey, Anne
Ikeda, Kazuyoshi
Krüger, Felix A.
Light, Yvonne
McGlinchey, Shaun
Santos, Rita
Stauch, Benjamin
Overington, John P.
- Abstract:
- The challenge of translating the huge amount of genomic and biochemical data into new drugs is a costly and challenging task. Historically, there has been comparatively little focus on linking the biochemical and chemical worlds. To address this need, we have developed ChEMBL, an online resource of small-molecule SAR (structure–activity relationship) data, which can be used to support chemical biology, lead discovery and target selection in drug discovery. The database contains the abstracted structures, properties and biological activities for over 700000 distinct compounds and more than 3 million bioactivity records abstracted from over 40000 publications. Additional public domain resources can be readily integrated into the same data model (e.g. PubChem BioAssay data). The compounds in ChEMBL are largely extracted from the primary medicinal chemistry literature, and are therefore usually 'drug-like' or 'lead-like' small molecules with full experimental context. The data cover a significant fraction of the discovery of modern drugs, and are useful in a wide range of drug design and discovery tasks. In addition to the compound data, ChEMBL also contains information for over 8000 protein, cell line and whole-organism 'targets', with over 4000 of those being proteins linked to their underlying genes. The database is searchable both chemically, using an interactive compound sketch tool, protein sequences, family hierarchies, SMILES strings, compound research codes and key words, and biologically, using a variety of gene identifiers, protein sequence similarity and protein families. The information retrieved can then be readily filtered and downloaded into various formats. ChEMBL can be accessed online at https://www.ebi.ac.uk/chembldb .
- Is Part Of:
- Biochemical Society transactions. Volume 39:Number 5(2011)
- Journal:
- Biochemical Society transactions
- Issue:
- Volume 39:Number 5(2011)
- Issue Display:
- Volume 39, Issue 5 (2011)
- Year:
- 2011
- Volume:
- 39
- Issue:
- 5
- Issue Sort Value:
- 2011-0039-0005-0000
- Page Start:
- 1365
- Page End:
- 1370
- Publication Date:
- 2011-09-21
- Subjects:
- bioactivity data -- ChEMBL -- data-mining -- drug discovery -- structure–activity relationship (SAR)
Biochemistry -- Congresses
572
- Journal URLs:
- https://portlandpress.com/biochemsoctrans
- DOI:
- 10.1042/BST0391365
- Languages:
- English
- ISSNs:
- 0300-5127
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library HMNTS - ELD Digital store
- Ingest File:
- 15910.xml