gDNAx diagnostic output interpretation
rina • 0

Hello all,

I am trying to use gDNAx to detect gDNA contamination in my RNA-Seq data and filter it out. I have run the package following the vignette, and I have some questions about the diagnostic output.

  1. SCJ is described as spliced alignments overlapping a transcript in a "splice compatible" way, and SCE as alignments without a splice site that also overlap a transcript in a "splice compatible" way. What does "splice compatible" mean, and how is it determined from the data?
  2. Aside from IGC (intergenic), INT (intronic), SCJ and SCE, what are the other alignment origins: SCC, IGCFLM, SCJFLM, SCEFLM and INTFLM?

gDNAx looks like a very promising tool, and I would appreciate any details you can share.

Many thanks.

gDNAx • 156 views

Hi! I am glad you are finding gDNAx useful! Regarding your questions:

  1. We use the term splice-compatible to refer to alignments that match the splicing of exons according to the annotations (i.e. alignments overlapping only exonic regions, without including any intronic base pair). Within this category, we distinguish between SCJ (splice-compatible junction alignments) and SCE (splice-compatible exonic alignments). The key difference is that an SCJ alignment contains a splice site present in the annotations, while an SCE alignment does not contain any splice site. However, in paired-end data, the two mates of an SCE alignment can flank a splice site without overlapping it. I hope this image helps clarify the concept: [image: SCEandSCJ]
  2. SCC alignments are those that contain a junction, where a junction is defined as an N operation in the CIGAR string of the alignment (as in njunc() from the GenomicAlignments package). Therefore, unlike SCJ, the junctions present in these alignments do not necessarily match the splicing of exons according to the annotations. It is important to note that some alignments can be classified as both SCC and SCJ. As for IGCFLM, SCJFLM, SCEFLM and INTFLM, they apply only to paired-end data and represent the average fragment length (in bp) of alignments classified as intergenic, SCJ, SCE and intronic, respectively.
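
To make the definitions above more concrete, here is a toy Python sketch of how an alignment could be labelled from its start position and CIGAR string. It is only an illustration under simplifying assumptions, not the gDNAx implementation: the function names `aligned_blocks` and `classify` are hypothetical, a single annotated transcript is given as a list of exon intervals, strand and mate pairing are ignored, and coordinates are 1-based and inclusive.

```python
import re

# Matches every "<length><operation>" pair in a CIGAR string.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")

def aligned_blocks(start, cigar):
    """Return (blocks, junctions) for an alignment starting at `start`.

    blocks:    reference intervals covered by the read, split at N operations.
    junctions: reference intervals skipped by each N operation.
    """
    blocks, junctions = [], []
    pos, block_start = start, None
    for length, op in CIGAR_RE.findall(cigar):
        n = int(length)
        if op in "M=XD":          # operations that consume the reference
            if block_start is None:
                block_start = pos
            pos += n
        elif op == "N":           # skipped region, i.e. a junction
            if block_start is not None:
                blocks.append((block_start, pos - 1))
                block_start = None
            junctions.append((pos, pos + n - 1))
            pos += n
        # I, S, H and P do not consume the reference, so pos is unchanged
    if block_start is not None:
        blocks.append((block_start, pos - 1))
    return blocks, junctions

def classify(start, cigar, exons):
    """Label one alignment against one transcript (toy rules, see lead-in).

    exons: sorted, non-overlapping (start, end) intervals of the transcript.
    """
    blocks, junctions = aligned_blocks(start, cigar)
    # Annotated introns are the gaps between consecutive exons.
    introns = {(a_end + 1, b_start - 1)
               for (_, a_end), (b_start, _) in zip(exons, exons[1:])}
    # Every aligned block must lie entirely inside some exon.
    in_exons = all(any(es <= bs and be <= ee for es, ee in exons)
                   for bs, be in blocks)
    labels = set()
    if junctions:                 # any N operation in the CIGAR
        labels.add("SCC")
    if in_exons and junctions and all(j in introns for j in junctions):
        labels.add("SCJ")         # junctions match the annotated splicing
    if in_exons and not junctions:
        labels.add("SCE")         # purely exonic, no splice site
    return labels

# Hypothetical transcript with two exons and one intron at 200-299.
exons = [(100, 199), (300, 399)]
print(sorted(classify(150, "50M100N50M", exons)))  # ['SCC', 'SCJ']
print(sorted(classify(120, "50M", exons)))         # ['SCE']
print(sorted(classify(150, "50M50N50M", exons)))   # ['SCC']
```

The first alignment skips exactly the annotated intron, so it gets both labels; the third one has a junction that does not match the annotation and lands in the intron, so it is only SCC. Here `len(junctions)` plays the role that njunc() plays in GenomicAlignments.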

I hope these clarifications are useful. We will also work on improving the vignettes to provide clearer explanations, as these concepts might not be very intuitive.

