qCount with a GRangesList is very slow


I'm using the QuasR package to process some RNA-seq data and count reads in the exonic and intronic regions of all genes in the human genome. To this end, I have aligned the reads and am using qCount to do the counting, with a GRangesList as the query. The list has one entry per gene, and each entry contains either the exonic or the intronic ranges for that gene. My code snippet is below:

clusters = makeForkCluster(nnodes = 8)
eCount = qCount(proj,exons,clObj = clusters)
stopCluster(cl = clusters)

However, this is taking abnormally long to run. I think this is because qCount uses a for loop to iterate over all elements of the list and remove redundancies using setdiff(). Is there a way to speed up this redundancy removal step? I have ~20000 genes (elements in the list), and the redundancy removal step hasn't completed even after ~40 hours.


I'll be grateful for any pointers.

Thank you.



Hi Vakul,

You should probably use a GRanges query instead of a GRangesList.

The GRangesList query is meant for a special analysis (see ?qCount) which partitions the genome into domains.

If you want one count per exon, use a GRanges of exons that either has no names or has a unique name per exon. If you want one count per gene (taking the union of all its exons), use a GRanges of exons named by gene, so that all exons from the same gene share the same name.

This should be much faster.
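
As a rough sketch of what that could look like, assuming the existing query is a GRangesList with one named element per gene: the list can be flattened into a single GRanges whose names are the gene names. The toy object exons_by_gene, its gene names and its coordinates are made up for illustration, and the qCount calls are left commented out because they need the real proj and cluster objects from the question.

```r
library(GenomicRanges)

# Toy stand-in for the per-gene exon list (hypothetical genes and coordinates)
exons_by_gene <- GRangesList(
  geneA = GRanges("chr1", IRanges(start = c(100, 300), end = c(200, 400)), strand = "+"),
  geneB = GRanges("chr1", IRanges(start = c(1000, 1500), end = c(1200, 1600)), strand = "-")
)

# Flatten to one GRanges and name every exon after its gene
exons_flat <- unlist(exons_by_gene, use.names = FALSE)
names(exons_flat) <- rep(names(exons_by_gene), elementNROWS(exons_by_gene))

# One count per gene: exons sharing a name are counted together
# gene_counts <- qCount(proj, exons_flat, clObj = clusters)

# One count per exon: same ranges, names removed
exons_unnamed <- exons_flat
names(exons_unnamed) <- NULL
# exon_counts <- qCount(proj, exons_unnamed, clObj = clusters)
```

The same flattening can be applied to the intronic ranges, as long as each range ends up named after its gene.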


