Question

DESeq2 performance for count tables with a very large number of rows

atisou • 0

@atisou-7468

Last seen 8.7 years ago

Switzerland

Hello,

I am testing different approaches to analyse differential read counts *per position* between 2 conditions with 2 replicates each. I was thinking of using DESeq2, modified so that the read count table contains positions instead of genes.

I would therefore end up with a count matrix of, say, 1 million rows (instead of the several thousand rows in the count table usually produced by htseq-count).

What performance/CPU time should I expect in this case?

Any information on parallelized computation is welcome. I know there is the option to set parallel = TRUE after calling library(BiocParallel) and register(MulticoreParam(x)).

For instance, would it still be worth it if I used, say, x=16 or x=32?
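For reference, a minimal sketch of the parallel setup described above. It assumes a simulated dataset from `makeExampleDESeqDataSet` standing in for the real per-position count table, and the choice of 4 workers is arbitrary:

```r
library(DESeq2)
library(BiocParallel)

# Simulated stand-in for a per-position count table:
# n rows, m = 4 samples (2 conditions x 2 replicates).
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 2000, m = 4)

# Register a multicore back-end. The worker count (4) is an assumption;
# MulticoreParam relies on forking, so on Windows SnowParam would be needed.
register(MulticoreParam(workers = 4))

# parallel = TRUE lets DESeq() and results() distribute rows
# across the registered workers.
dds <- DESeq(dds, parallel = TRUE)
res <- results(dds, parallel = TRUE)
```

Changing `workers` to 16 or 32 and timing the `DESeq()` call would show directly whether the extra cores still pay off on a given machine.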

Thanks

Hatice

deseq2 performance • 1.9k views

score 1 · Answer 1 · 2016-04-11

Firstly, you should also check out the derfinder package which already implements per-position differential testing.

http://biostatistics.oxfordjournals.org/content/15/3/413.long

http://www.biorxiv.org/content/early/2015/02/19/015370.abstract

DESeq2 will scale linearly with number of rows. You can expect the typical reduction in running time from parallelization: some speedup, but not proportional to the number of cores, because of parallelization overhead.

Note that this is definitely *not* the recommended usage of DESeq2, which is designed for counts of fragments aligned to unique features. You will have strong correlations across positions within a feature, and this may violate the assumptions of the procedures that share information across genes for dispersion and fold change estimation. Basically, you would be going off course as far as DESeq2 usage is concerned, and it's up to you to show that your implementation is reasonable. In addition, you would probably want to try something faster, like limma on per-position coverage (but again, this is already implemented in derfinder).
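One rough way to check the linear scaling on your own hardware is to time `DESeq()` on a few row counts and extrapolate. The sketch below assumes simulated data from `makeExampleDESeqDataSet`; the row counts are arbitrary, and simulated rows lack the position-to-position correlation of real data, so the extrapolated figure is only a ballpark:

```r
library(DESeq2)

set.seed(1)
sizes <- c(2000, 4000, 8000)  # arbitrary row counts for the timing runs

# Time a serial DESeq() run for each table size (2 conditions x 2 replicates).
elapsed <- sapply(sizes, function(n) {
  dds <- makeExampleDESeqDataSet(n = n, m = 4)
  system.time(DESeq(dds, quiet = TRUE))[["elapsed"]]
})
timings <- data.frame(rows = sizes, seconds = elapsed)
print(timings)

# Fit seconds ~ rows and extrapolate to 1 million rows.
fit <- lm(seconds ~ rows, data = timings)
predict(fit, newdata = data.frame(rows = 1e6))
```

If the timings grow roughly in proportion to `rows`, the prediction gives an order-of-magnitude estimate for the full table before any parallelization.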

score 0 · Answer 2 · 2016-04-12

atisou • 0

Of course you are right, and we are aware of these issues (we are not going to make DESeq2 say more than it is meant to :) we are only running some preliminary method trials).

"DESeq2 will scale linearly with number of rows." That's what I wanted to know.

Thanks for the insights.

H.
