gDNAx diagnostic output interpretation
rina • 0
@09528fe9
Last seen 5 months ago
Brunei

Hello all,

I am trying to use gDNAx to detect gDNA contamination in my RNA-Seq data and filter it out. I ran the package following the vignette, and I have some questions about the diagnostic output.

  1. SCJ is described as spliced alignments that overlap a transcript in a "splice compatible" way, and SCE as alignments without a splice site that overlap a transcript in a "splice compatible" way. What does "splice compatible" mean, and how is it determined from the data?
  2. Aside from IGC (intergenic), INT (intronic), SCJ and SCE, what are the other alignment origins: SCC, IGCFLM, SCJFLM, SCEFLM and INTFLM?

gDNAx looks like a very promising tool, and I would appreciate any details you can share.

Many thanks.

gDNAx • 380 views
@25bc498b
Last seen 6 months ago
Spain

Hi! I am glad you are finding gDNAx useful! Regarding your questions:

  1. We use the term splice-compatible for alignments that match the splicing of exons according to the annotations (i.e. alignments that overlap only exonic regions, without including any intronic base pair). Within this category, we distinguish between SCJ (splice-compatible junction alignments) and SCE (splice-compatible exonic alignments). The key difference is that an SCJ alignment contains a splice site present in the annotations, while an SCE alignment contains no splice site. However, in paired-end data, the two mates of an SCE alignment can flank a splice site without overlapping it. I hope this image helps clarify the concept: [image: SCEandSCJ]
  2. SCC alignments are those that contain a junction, where a junction is an N operation in the CIGAR string of the alignment (as counted by njunc() from the GenomicAlignments package). The junctions in these alignments therefore do not necessarily match the splicing of exons according to the annotations, as opposed to SCJ. Note that some alignments can be classified as both SCC and SCJ. IGCFLM, SCJFLM, SCEFLM and INTFLM apply only to paired-end data; they are the average fragment length (in bp) of alignments classified as intergenic, SCJ, SCE and intronic, respectively.
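To make the distinction between these categories more concrete, here is a minimal Python sketch of the idea. It is not gDNAx code and does not reproduce how the package works internally: the function names (`count_junctions`, `aligned_blocks`, `classify`), the single-transcript exon list, and the "other" label are all assumptions made for illustration. It only shows how an N operation in a CIGAR string defines a junction, and how an alignment can be checked against annotated exons.

```python
import re

# One CIGAR operation: a length followed by an operation code (SAM format).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def count_junctions(cigar):
    """Count N operations in a CIGAR string (the same notion of junction as njunc())."""
    return sum(1 for _, op in CIGAR_OP.findall(cigar) if op == "N")

def aligned_blocks(start, cigar):
    """Split an alignment into reference blocks and junctions.

    Returns two lists of 1-based inclusive (start, end) intervals:
    the aligned blocks and the gaps skipped by N operations.
    """
    blocks, junctions = [], []
    pos, block_start = start, None
    for length, op in CIGAR_OP.findall(cigar):
        n = int(length)
        if op in "M=XD":            # these operations consume the reference
            if block_start is None:
                block_start = pos
            pos += n
        elif op == "N":             # skipped region, i.e. a junction
            if block_start is not None:
                blocks.append((block_start, pos - 1))
                block_start = None
            junctions.append((pos, pos + n - 1))
            pos += n
        # I, S, H and P do not consume the reference, so pos is unchanged
    if block_start is not None:
        blocks.append((block_start, pos - 1))
    return blocks, junctions

def classify(start, cigar, exons):
    """Hypothetical classifier against ONE transcript's sorted exon list.

    Returns "SCJ", "SCE" or "other" (illustrative labels only).
    """
    blocks, junctions = aligned_blocks(start, cigar)
    # Annotated introns are the gaps between consecutive exons.
    introns = [(exons[i][0 + 1] + 1, exons[i + 1][0] - 1)
               for i in range(len(exons) - 1)]
    # Every aligned block must lie entirely inside an annotated exon.
    exonic = all(any(es <= bs and be <= ee for es, ee in exons)
                 for bs, be in blocks)
    if not exonic:
        return "other"
    if not junctions:
        return "SCE"
    # Every junction must coincide exactly with an annotated intron.
    if all(j in introns for j in junctions):
        return "SCJ"
    return "other"

# Toy transcript with two exons; the annotated intron is 201-300.
exons = [(100, 200), (301, 400)]
print(classify(181, "20M100N30M", exons))  # junction matches the annotation
print(classify(120, "50M", exons))         # unspliced, fully inside one exon
print(classify(181, "20M50N30M", exons))   # has a junction, but not an annotated one
```

In this toy example, the third alignment has one N operation, so it contains a junction in the CIGAR sense, yet that junction does not match the annotated intron. That is the situation in which an alignment with a junction is not splice-compatible.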

I hope these clarifications are useful. We'll also work on improving the vignettes to provide clearer explanations, as these concepts might not be very intuitive.
