DESeq2 performance for count tables with a very large number of rows
@atisou-7468

Hello,

I am testing different approaches for analysing differential read counts *per position* between 2 conditions with 2 replicates each. I was thinking of using DESeq2, but modified so that the read count table contains positions instead of genes.

I would therefore end up with a count matrix of, say, 1 million rows (instead of the thousands of rows in the count table usually produced by htseq-count).

What would the performance/CPU time be in this case?

Any information on parallelized computation is welcome. I know there is the option to set parallel = TRUE after calling library(BiocParallel) and register(MulticoreParam(x)).

For instance, would that still be worth it with, say, x=16 or x=32?
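For reference, the parallel setup mentioned above might look like the sketch below. The simulated data set from makeExampleDESeqDataSet, the 10000 rows, and the 4 workers are placeholders chosen for illustration, not values from an actual analysis. MulticoreParam relies on forking, so this assumes Linux or macOS.

```r
library(DESeq2)
library(BiocParallel)

# Register a forked multicore backend; 4 workers is an arbitrary placeholder.
register(MulticoreParam(workers = 4))

# Simulated stand-in for a per-position count table:
# m = 4 samples gives 2 conditions with 2 replicates each.
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 10000, m = 4)

# parallel = TRUE makes DESeq() and results() use the registered backend.
dds <- DESeq(dds, parallel = TRUE)
res <- results(dds, parallel = TRUE)
```

Changing the workers value and comparing wall-clock times on a subset of the real table is probably the most direct way to see whether 16 or 32 cores pay off.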

Thanks

Hatice

@mikelove

Firstly, you should also check out the derfinder package which already implements per-position differential testing.

http://biostatistics.oxfordjournals.org/content/15/3/413.long

http://www.biorxiv.org/content/early/2015/02/19/015370.abstract

DESeq2 will scale linearly with the number of rows. You can expect the typical reductions in running time from parallelization (that is, expect some reduction, but not one that scales with the number of cores, because of parallelization overhead).

Note that this is definitely *not* the recommended usage of DESeq2, which is designed for counts of fragments aligned to unique features. You will have strong correlations across positions within a feature, and this may violate the assumptions of the procedures which share information across genes for dispersion and fold change estimation. Basically, you would be going off course as far as DESeq2 usage is concerned, and it's up to you to show that your implementation is reasonable. In addition, you would probably want to try something faster, like limma on per-position coverage (but again, this is already implemented in derfinder).
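Since the running time should grow linearly with the number of rows, one way to get a rough figure for 1 million rows is to time a few smaller tables and extrapolate. The following is only a sketch under that assumption: the helper time_deseq and the row counts are made up for illustration, and makeExampleDESeqDataSet simulates independent rows, so real per-position data may behave differently.

```r
library(DESeq2)

# Hypothetical helper: time a full DESeq() run on simulated data.
# m = 4 samples gives 2 conditions with 2 replicates each.
time_deseq <- function(n_rows) {
  dds <- makeExampleDESeqDataSet(n = n_rows, m = 4)
  system.time(DESeq(dds, quiet = TRUE))[["elapsed"]]
}

set.seed(1)
sizes <- c(2000, 4000, 8000)
elapsed <- vapply(sizes, time_deseq, numeric(1))

# Seconds per row at the largest size, extrapolated to 1 million rows
# under the linear-scaling assumption.
per_row <- elapsed[length(elapsed)] / sizes[length(sizes)]
estimate_1e6 <- per_row * 1e6

print(data.frame(rows = sizes, seconds = elapsed))
print(estimate_1e6)
```

If the seconds column roughly doubles as the rows double, the extrapolated value gives an order-of-magnitude estimate for the serial run on your machine.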

@atisou-7468

Of course you are right, and we are aware of these issues (we are not going to make DESeq2 do more than it is meant for :) we are only running some preliminary method trials).

"DESeq2 will scale linearly with number of rows." That's what I wanted to know.

Thanks for the insight.

H.
