I used DEXSeq to analyze differential exon usage from RNA-seq data, but found that its output included exons that do not exist.
For example, one gene has 6 exons in total, but DEXSeq reported exon-15 as significantly differentially used.
I checked the genomic position of this exon-15, as given in the DEXSeq output, and found that it is only 2 nucleotides long and lies inside the 5' UTR of that gene. I went back to the GTF file, and this 2-nt exon is not there.
Has anyone encountered the same problem? Does anyone know how to fix it?
Thanks very much!
I updated R, reinstalled DEXSeq, and tried the Ensembl GTF (recommended by the author), but the problem is still there: many nonexistent exons are reported as exons.
Hello. Are you aware of the preprocessing of the transcript annotations that is done for DEXSeq? See for example the publication:
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3460195/figure/F1/
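The preprocessing splits exons that overlap across transcripts into disjoint counting bins, so it could be producing a 2-nucleotide-long bin from your annotation model, which is then used as input. Note that you can process the annotation however you want and provide your own counting bins to the software.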
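To illustrate how such a short bin can arise, here is a minimal sketch in Python of splitting overlapping exons into disjoint bins. It is not the DEXSeq preprocessing code; the function name `disjoint_bins` and the coordinates are made up for illustration, and intervals are assumed to be 1-based and inclusive.

```python
def disjoint_bins(exons):
    """Split possibly overlapping (start, end) exon intervals into disjoint bins.

    Illustrative only: coordinates are assumed 1-based and inclusive,
    and all exons are assumed to belong to one gene on one strand.
    """
    # Every exon start, and every position just past an exon end, is a breakpoint.
    points = sorted({s for s, e in exons} | {e + 1 for s, e in exons})
    bins = []
    for left, right in zip(points, points[1:]):
        # Keep the segment only if at least one exon covers it completely.
        if any(s <= left and right - 1 <= e for s, e in exons):
            bins.append((left, right - 1))
    return bins


# Two transcripts whose first exons start 2 nt apart, plus a shared second exon.
exons = [(100, 200), (102, 200), (300, 400)]
bins = disjoint_bins(exons)
print(bins)
```

With these hypothetical coordinates the first bin is (100, 101), i.e. 2 nt long, even though no annotated exon has that length.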
Thank you very much, Alejandro.
I read this and agreed with the strategy at first, but then I found that some exons are fragmented too much.
I just found another single exon that was split into 50+ small pieces.
I am planning to follow your GFF format and write a script to regenerate the counting bins for annotated exons only. I hope it will work.
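A rough sketch of what such a script could emit for one gene is below. The line layout (an `aggregate_gene` line followed by `exonic_part` lines carrying `transcripts`, `exonic_part_number` and `gene_id` attributes) is my reading of the flattened GFF and should be checked against a file actually produced by the DEXSeq annotation script. The function name `exons_to_gff_lines`, the source label `custom_bins`, and the example IDs are hypothetical. Note also that annotated exons may overlap each other, so these bins are no longer guaranteed to be disjoint; how the counting step treats that needs to be verified.

```python
def exons_to_gff_lines(gene_id, chrom, strand, exons, source="custom_bins"):
    """Format one bin per annotated exon as flattened-GFF-style lines.

    exons: dict mapping (start, end) -> list of transcript IDs using that exon.
    The attribute layout is an assumption modeled on the DEXSeq flattened GFF.
    """
    gene_start = min(start for start, _ in exons)
    gene_end = max(end for _, end in exons)
    lines = ["\t".join([chrom, source, "aggregate_gene", str(gene_start),
                        str(gene_end), ".", strand, ".",
                        f'gene_id "{gene_id}"'])]
    # Number the exons in coordinate order, one bin per annotated exon.
    for number, (start, end) in enumerate(sorted(exons), start=1):
        transcripts = "+".join(sorted(exons[(start, end)]))
        attributes = (f'transcripts "{transcripts}"; '
                      f'exonic_part_number "{number:03d}"; '
                      f'gene_id "{gene_id}"')
        lines.append("\t".join([chrom, source, "exonic_part", str(start),
                                str(end), ".", strand, ".", attributes]))
    return lines


# Hypothetical gene with two annotated exons.
gff_lines = exons_to_gff_lines(
    "geneA", "chr1", "+",
    {(100, 200): ["tx1", "tx2"], (300, 400): ["tx1"]},
)
print("\n".join(gff_lines))
```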
Thanks very much again!!
Hi, Alejandro,
Just one comment to add: the DEXSeq output uses the word Exon for each bin, but a bin can be (1) a complete exon or (2) part of an exon split by the DEXSeq preprocessing. Using the word Exon can therefore be confusing. In my case, exon-15 was reported as significant in a gene with 6 exons.
If the code is updated, might it be worth adjusting this?
Thanks!
Thanks for your feedback!