GeneMark-EP+: eukaryotic gene prediction with self-training in the space of genes and proteins. Issue 2 (13th May 2020)
- Record Type:
- Journal Article
- Title:
- GeneMark-EP+: eukaryotic gene prediction with self-training in the space of genes and proteins. Issue 2 (13th May 2020)
- Main Title:
- GeneMark-EP+: eukaryotic gene prediction with self-training in the space of genes and proteins
- Authors:
- Brůna, Tomáš
Lomsadze, Alexandre
Borodovsky, Mark
- Abstract:
- We have made several steps toward creating a fast and accurate algorithm for gene prediction in eukaryotic genomes. First, we introduced an automated method for efficient ab initio gene finding, GeneMark-ES, with parameters trained in iterative unsupervised mode. Next, in GeneMark-ET we proposed a method of integration of unsupervised training with information on intron positions revealed by mapping short RNA reads. Now we describe GeneMark-EP, a tool that utilizes another source of external information, a protein database, readily available prior to the start of a sequencing project. A new specialized pipeline, ProtHint, initiates massive protein mapping to genome and extracts hints to splice sites and translation start and stop sites of potential genes. GeneMark-EP uses the hints to improve estimation of model parameters as well as to adjust coordinates of predicted genes if they disagree with the most reliable hints (the -EP+ mode). Tests of GeneMark-EP and -EP+ demonstrated improvements in gene prediction accuracy in comparison with GeneMark-ES, while the GeneMark-EP+ showed higher accuracy than GeneMark-ET. We have observed that the most pronounced improvements in gene prediction accuracy happened in large eukaryotic genomes.
- Is Part Of:
- NAR genomics and bioinformatics. Volume 2:Issue 2(2020)
- Journal:
- NAR genomics and bioinformatics
- Issue:
- Volume 2:Issue 2(2020)
- Issue Display:
- Volume 2, Issue 2 (2020)
- Year:
- 2020
- Volume:
- 2
- Issue:
- 2
- Issue Sort Value:
- 2020-0002-0002-0000
- Page Start:
- Page End:
- Publication Date:
- 2020-05-13
- Subjects:
- Genomics -- Periodicals
Bioinformatics -- Periodicals
572.8
- Journal URLs:
- http://www.oxfordjournals.org/
https://academic.oup.com/nargab
- DOI:
- 10.1093/nargab/lqaa026
- Languages:
- English
- ISSNs:
- 2631-9268
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 15067.xml