Kmerator Suite: design of specific k-mer signatures and automatic metadata discovery in large RNA-seq datasets. (23rd June 2021)
- Record Type:
- Journal Article
- Title:
- Kmerator Suite: design of specific k-mer signatures and automatic metadata discovery in large RNA-seq datasets. (23rd June 2021)
- Main Title:
- Kmerator Suite: design of specific k-mer signatures and automatic metadata discovery in large RNA-seq datasets
- Authors:
- Riquier, Sébastien
Bessiere, Chloé
Guibert, Benoit
Bouge, Anne-Laure
Boureux, Anthony
Ruffle, Florence
Audoux, Jérôme
Gilbert, Nicolas
Xue, Haoliang
Gautheret, Daniel
Commes, Thérèse
- Abstract:
- The huge body of publicly available RNA-sequencing (RNA-seq) libraries is a treasure trove of functional information, making it possible to quantify the expression of known or novel transcripts in tissues. However, transcript quantification commonly relies on alignment methods that require substantial computational resources and processing time and therefore do not scale easily to large datasets. K-mer decomposition constitutes a new way to process RNA-seq data for the identification of transcriptional signatures, as k-mers can be used to accurately quantify gene expression in a less resource-consuming way. We present the Kmerator Suite, a set of three tools designed to extract specific k-mer signatures, quantify these k-mers in RNA-seq datasets and quickly visualize large dataset characteristics. The core tool, Kmerator, produces specific k-mers for 97% of human genes, enabling the measurement of gene expression with high accuracy in simulated datasets. KmerExploR, a direct application of Kmerator, uses a set of predictor gene-specific k-mers to infer metadata including library protocol, sample features or contaminations from RNA-seq datasets. KmerExploR results are visualized through a user-friendly interface. Moreover, we demonstrate that the Kmerator Suite can be used for advanced queries targeting known or new biomarkers such as mutations, gene fusions or long non-coding RNAs for human health applications.
- Is Part Of:
- NAR genomics and bioinformatics. Volume 3:issue 3(2021)
- Journal:
- NAR genomics and bioinformatics
- Issue:
- Volume 3:issue 3(2021)
- Issue Display:
- Volume 3, Issue 3 (2021)
- Year:
- 2021
- Volume:
- 3
- Issue:
- 3
- Issue Sort Value:
- 2021-0003-0003-0000
- Page Start:
- Page End:
- Publication Date:
- 2021-06-23
- Subjects:
- Genomics -- Periodicals
Bioinformatics -- Periodicals
572.8
- Journal URLs:
- http://www.oxfordjournals.org/
https://academic.oup.com/nargab
- DOI:
- 10.1093/nargab/lqab058
- Languages:
- English
- ISSNs:
- 2631-9268
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 17401.xml