Fusion transcripts and their genomic breakpoints in polyadenylated and ribosomal RNA–minus RNA sequencing data. Issue 12 (9th December 2021)
- Record Type:
- Journal Article
- Title:
- Fusion transcripts and their genomic breakpoints in polyadenylated and ribosomal RNA–minus RNA sequencing data. Issue 12 (9th December 2021)
- Main Title:
- Fusion transcripts and their genomic breakpoints in polyadenylated and ribosomal RNA–minus RNA sequencing data
- Authors:
- Hoogstrate, Youri
Komor, Malgorzata A
Böttcher, René
van Riet, Job
van de Werken, Harmen J G
van Lieshout, Stef
Hoffmann, Ralf
van den Broek, Evert
Bolijn, Anne S
Dits, Natasja
Sie, Daoud
van der Meer, David
Pepers, Floor
Bangma, Chris H
van Leenders, Geert J L H
Smid, Marcel
French, Pim J
Martens, John W M
van Workum, Wilbert
van der Spek, Peter J
Janssen, Bart
Caldenhoven, Eric
Rausch, Christian
de Jong, Mark
Stubbs, Andrew P
Meijer, Gerrit A
Fijneman, Remond J A
Jenster, Guido W
- Abstract:
- Background: Fusion genes are typically identified by RNA sequencing (RNA-seq) without elucidating the causal genomic breakpoints. However, non–poly(A)-enriched RNA-seq contains large proportions of intronic reads that also span genomic breakpoints. Results: We have developed an algorithm, Dr. Disco, that searches for fusion transcripts by taking an entire reference genome into account as search space. This includes exons but also introns, intergenic regions, and sequences that do not meet splice junction motifs. Using 1,275 RNA-seq samples, we investigated to what extent genomic breakpoints can be extracted from RNA-seq data and their implications regarding poly(A)-enriched and ribosomal RNA–minus RNA-seq data. Comparison with whole-genome sequencing data revealed that most genomic breakpoints are not, or minimally, transcribed while, in contrast, the genomic breakpoints of all 32 TMPRSS2-ERG–positive tumours were present at RNA level. We also revealed tumours in which the ERG breakpoint was located before ERG, which co-existed with additional deletions and messenger RNA that incorporated intergenic cryptic exons. In breast cancer we identified rearrangement hot spots near CCND1 and in glioma near CDK4 and MDM2 and could directly associate this with increased expression. Furthermore, in all datasets we find fusions to intergenic regions, often spanning multiple cryptic exons that potentially encode neo-antigens. Thus, fusion transcripts other than classical gene-to-gene fusions are prominently present and can be identified using RNA-seq. Conclusion: By using the full potential of non–poly(A)-enriched RNA-seq data, sophisticated analysis can reliably identify expressed genomic breakpoints and their transcriptional effects.
- Is Part Of:
- GigaScience. Volume 10:Issue 12(2021)
- Journal:
- GigaScience
- Issue:
- Volume 10:Issue 12(2021)
- Issue Display:
- Volume 10, Issue 12 (2021)
- Year:
- 2021
- Volume:
- 10
- Issue:
- 12
- Issue Sort Value:
- 2021-0010-0012-0000
- Page Start:
- Page End:
- Publication Date:
- 2021-12-09
- Subjects:
- gene fusion -- RNA precursors -- RNA-seq -- chromosome breakage -- genomic structural variation -- cryptic exons -- TMPRSS2-ERG
Information storage and retrieval systems -- Research -- Periodicals
Biology -- Research -- Periodicals
Medical sciences -- Research -- Periodicals
Database management -- Periodicals
570.285
- Journal URLs:
- http://www.gigasciencejournal.com/
http://www.oxfordjournals.org/
- DOI:
- 10.1093/gigascience/giab080
- Languages:
- English
- ISSNs:
- 2047-217X
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 20234.xml