Discrepancy in read counts of XBP1 (ENSG00000100219) transcript isoforms
kvoutet • 0

I downloaded rsetxlung.Rdata from recount2 and analyzed it in R. I noticed that there are no read counts for the unspliced XBP1 transcript isoform (ENST00000216037), which is one of the dominant isoforms of this gene. This was the first XBP1 mRNA product to be studied thoroughly, and it is called XBP1u.

For background: upon accumulation of unfolded proteins in the endoplasmic reticulum (ER), the mRNA of this gene is processed to an active form by an unconventional splicing mechanism mediated by the endonuclease inositol-requiring enzyme 1 (IRE1). The resulting loss of 26 nt from the spliced mRNA causes a frame-shift and produces the isoform XBP1(S) (ENST00000611155), which is the functionally active transcription factor. The isoform encoded by the unspliced mRNA, XBP1(U) (ENST00000216037), is constitutively expressed and is thought to function as a negative feedback regulator of XBP1(S).

However, in rsetxlung.Rdata there are read counts for five of the seven XBP1 transcript isoforms, but not for the unspliced isoform ENST00000216037. Could this be a wrong assignment by recountNNLS for the XBP1 gene? I am interested in estimating the ratio between the XBP1s and XBP1u isoforms, which I cannot do while the unspliced isoform has 0 read counts. Any help or useful information would be appreciated.

recountNNLS bug non-canonical splicing in RNAseq data • 653 views

Reply by Jack Fu

When the possible transcripts for a particular gene are very similar in structure, so that it is mathematically impossible to assign reads accurately to some of them, those transcripts are omitted from our linear modeling. I would suggest looking at the jxbed and jxcov files for the TCGA data to determine whether any splice junctions are present only in XBP1u and not in any other isoform. If such unique splice junctions are expressed, you could attempt a simple comparison of splice junction coverage between the different isoforms of the XBP1 gene.
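The check for isoform-specific junctions suggested above can be sketched in a few lines of Python. This is only an illustration: the transcript names and exon coordinates are made-up toy values (not the real XBP1 annotation), and `introns` and `unique_junctions` are hypothetical helper names, not part of recount or Snaptron.

```python
def introns(exons):
    """Return the set of introns (start, end), 1-based inclusive,
    implied by a list of (start, end) exons on one chromosome."""
    ordered = sorted(exons)
    return {
        (left_end + 1, right_start - 1)
        for (_, left_end), (right_start, _) in zip(ordered, ordered[1:])
    }


def unique_junctions(transcripts, target):
    """Return the introns of `target` that no other transcript shares."""
    shared = set()
    for name, exons in transcripts.items():
        if name != target:
            shared |= introns(exons)
    return introns(transcripts[target]) - shared


# Toy annotation (hypothetical coordinates): the "spliced" transcript
# differs from the "unspliced" one only by a 26 nt intron (401..426).
toy = {
    "tx_unspliced": [(100, 200), (301, 500)],
    "tx_spliced": [(100, 200), (301, 400), (427, 500)],
    "tx_other": [(100, 180), (301, 500)],
}

for name in toy:
    print(name, sorted(unique_junctions(toy, name)))
```

In this toy layout the unspliced transcript has no junction of its own, because every one of its introns is also present in the spliced transcript, while the spliced transcript has exactly one unique junction. An isoform in that situation cannot be quantified from junction coverage alone.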


Thank you very much for your reply. I have already used snaptron_query() to look for the specific exon-exon junction of the spliced XBP1s isoform (that is, the 26 nt intron still present in the unspliced XBP1u isoform), giving its start and end coordinates, but without success: it returns no matches. Surprisingly, recountNNLS has assigned reads to the spliced XBP1s isoform, which has this intron between exons 4 and 5, so I do not know what is going on. I have also downloaded the jxbed and jxcov files for the TCGA data. Do you think this junction could be present in these files even though snaptron_query() did not find it?


Reply by Leonardo Collado Torres

If you are looking for an exact exon-exon junction, recount::snaptron_query() will retrieve it from the jx files. Christopher Wilks (cc'ed) has been working on another interface to these files through R, which he can tell you more about.


Unfortunately, as I mentioned above, recount::snaptron_query() returns no matches for this specific junction. How could I measure this 26 nt XBP1 intron, with the non-canonical splice sites that IRE1 recognizes for cleavage, in TCGA RNA-seq data? Any help would be appreciated.


Reply by Chris Wilks

The splice junctions in recount2 and in Snaptron for TCGA should be the same.


Reply by Chris Wilks

I looked for that intron in Snaptron and you're right: there are no junctions that match those exact start/end coordinates in any of the ~70K sequencing runs we have represented there.

The coordinates you sent me would produce splice sites with a CA-GC non-canonical motif (a fact you mentioned earlier in the thread).

I can tell you for certain that this intron would not be in either recount2 or Snaptron: the spliced aligner used to produce the splice junction calls for both was Rail-RNA, which only looks for GT-AG, GC-AG, and AT-AC motifs (and their reverse complements).
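To make the motif restriction concrete, here is a minimal Python sketch that tests whether an intron's terminal dinucleotides fall in the set of motifs listed above. It only mirrors the rule as stated in this thread and is not Rail-RNA's actual code; `SEARCHED_MOTIFS`, `revcomp` and `is_searched_motif` are hypothetical names chosen for the illustration.

```python
# (donor, acceptor) dinucleotide pairs named in this thread.
SEARCHED_MOTIFS = {("GT", "AG"), ("GC", "AG"), ("AT", "AC")}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_searched_motif(first2, last2):
    """True if an intron whose first and last two bases (read on the
    forward strand) are `first2` and `last2` matches one of the motifs,
    either directly or as a reverse complement (gene on the minus strand)."""
    first2, last2 = first2.upper(), last2.upper()
    if (first2, last2) in SEARCHED_MOTIFS:
        return True
    # For a minus-strand gene the true donor is the reverse complement of
    # the last two bases and the true acceptor that of the first two.
    return (revcomp(last2), revcomp(first2)) in SEARCHED_MOTIFS


for pair in [("GT", "AG"), ("CT", "AC"), ("CA", "GC")]:
    print(pair, is_searched_motif(*pair))
```

Under this rule CT-AC is accepted because it is the reverse complement of GT-AG, while the CA-GC motif discussed here is rejected in both orientations, so an aligner restricted to these motifs would never report such a junction.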

This is a known limitation and we're planning on addressing it in the near future with recount3/Snaptron2.

However, allowing for some "fuzz" around the splice site coordinates, I see a fairly well supported, unannotated intron shifted 1 bp to the right of the coordinates you sent, and in the correct orientation for XBP1 (reverse strand):

chr22:28796124-28796150

This junction has broad sample support in both GTEx and our general SRAv2 compilation, and also some in TCGA. While it's certainly possible that there are two unannotated introns right next to each other here, I thought it was still worth mentioning.
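The "fuzz" idea can be sketched as a tolerance on both junction ends. The Python below is only an illustration: `parse_region` and `fuzzy_matches` are hypothetical helpers, and the query coordinates are an assumption obtained by shifting the junction reported above 1 bp to the left, since the coordinates originally sent are not shown in this thread.

```python
import re


def parse_region(text):
    """Parse 'chr:start-end' into (chrom, start, end)."""
    match = re.fullmatch(r"(\w+):(\d+)-(\d+)", text)
    if match is None:
        raise ValueError(f"not a chr:start-end region: {text!r}")
    return match.group(1), int(match.group(2)), int(match.group(3))


def fuzzy_matches(query, candidates, fuzz=2):
    """Return (candidate, start_offset, end_offset) for every candidate
    junction whose two ends are each within `fuzz` bp of the query."""
    chrom, start, end = parse_region(query)
    hits = []
    for cand in candidates:
        c_chrom, c_start, c_end = parse_region(cand)
        if (c_chrom == chrom
                and abs(c_start - start) <= fuzz
                and abs(c_end - end) <= fuzz):
            hits.append((cand, c_start - start, c_end - end))
    return hits


observed = "chr22:28796124-28796150"   # junction reported in this thread
# ASSUMPTION: query built by shifting the observed junction 1 bp left.
query = "chr22:28796123-28796149"

print(fuzzy_matches(query, [observed], fuzz=0))
print(fuzzy_matches(query, [observed], fuzz=1))
```

An exact search (fuzz of 0) finds nothing, whereas allowing 1 bp of slack recovers the shifted junction together with its offsets. Whether a shifted hit really corresponds to the same splicing event still has to be judged from the alignments.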
