Question: lfcshrink and rlog take forever to compute with a DESeqDataSet with many genotypes and conditions
2.1 years ago by
pecot0
pecot0 wrote:

Hi,

I'm using DESeq2 for an RNA-seq experiment with 2 genotypes and 5 conditions (2 replicates for each combination of genotype and condition). When I run DESeq to perform a comparison, I get different results when all samples are in the DESeqDataSet than when only the 2 compared groups are in it, which is expected because the estimation steps for the model parameters use all the samples in the DESeqDataSet. However, when I apply lfcShrink to the DESeqDataSet with all samples, it takes forever: I waited 24 hours and still got no results. As I have many comparisons, this is not an option. The same thing happens with rlog. Is this normal? How should I proceed?

Thanks,

Thierry
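For readers who want to reproduce the "all samples" versus "only the two compared groups" setup described in the question, here is a minimal sketch on simulated counts. Everything about the data is an assumption made for illustration: the group labels (c1.Wt, c1.RbKO, ...), the number of genes, and the negative binomial parameters are hypothetical and are not the poster's data. On a working installation both fits should finish quickly at this size.

```r
library(DESeq2)

set.seed(1)

# Hypothetical layout: 2 genotypes x 5 conditions x 2 replicates = 20 samples,
# encoded as a single factor so that contrasts can name two groups directly.
genotype  <- c("Wt", "RbKO")
treatment <- paste0("c", 1:5)
groups    <- as.vector(outer(treatment, genotype, paste, sep = "."))
coldata   <- data.frame(condition = factor(rep(groups, each = 2)))

# Simulated counts (assumption): one mean per gene, the same dispersion for all.
n_genes <- 500
mu      <- 2^runif(n_genes, 4, 12)
counts  <- matrix(rnbinom(n_genes * nrow(coldata), mu = mu, size = 10),
                  nrow = n_genes)
storage.mode(counts) <- "integer"
rownames(counts)  <- paste0("gene", seq_len(n_genes))
colnames(counts)  <- paste0("sample", seq_len(ncol(counts)))
rownames(coldata) <- colnames(counts)

pair <- c("c1.RbKO", "c1.Wt")

# Fit 1: all 20 samples; dispersions and size factors use every sample.
dds <- DESeqDataSetFromMatrix(counts, coldata, design = ~ condition)
dds <- DESeq(dds, quiet = TRUE)
res_full <- results(dds, contrast = c("condition", pair))

# Fit 2: only the 4 samples of the two compared groups, built from raw counts.
keep    <- coldata$condition %in% pair
dds_sub <- DESeqDataSetFromMatrix(counts[, keep],
                                  droplevels(coldata[keep, , drop = FALSE]),
                                  design = ~ condition)
dds_sub <- DESeq(dds_sub, quiet = TRUE)
res_sub <- results(dds_sub, contrast = c("condition", pair))

# The two fits generally give different p-values for the same contrast.
print(summary(res_full$pvalue - res_sub$pvalue))
```

Which of the two fits is preferable depends on the experiment; the sketch only shows how each one is set up.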


I haven't tried on another machine yet, but if it works on yours, it means that mine has a problem. I fully reinstalled R, removed all libraries and reinstalled DESeq2, but it's still the same... I suspect something more insidious, maybe related to the GCC compiler or the C++ libraries. I'll have a look at it next week; everything here is closing with Irma coming. I'll let you know when I find a solution, if I find one (hope so :).

Sorry, I didn't have the opportunity to work on it. Everything shut down here because of Irma.

I worked on a machine with Ubuntu 14.04. I tried reinstalling all packages and R, but nothing changed. I have the feeling that it's related to GCC: when I install DESeq2, it says that OpenMP is not working, although I have gcc-4.8 and gcc-4.9 (default) installed on my computer, both of which support OpenMP. I tried to find a way to tell R which gcc to use, but it's really not clear how that works and I couldn't find a way to do it properly. I have a VirtualBox VM with Windows 10, so I installed R on it and I'm now able to run lfcShrink normally. It's great in the sense that I'm not stuck and I can analyze the data I have, but it's frustrating not to understand why it's not working with Ubuntu. Do you have any idea?
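For anyone else trying to find out which compiler their R installation is set up to use, a minimal sketch with base R only could look like the following. It just reports the configuration through `R CMD config` and does not change anything. Per-user overrides of CC and CXX normally go in ~/.R/Makevars (for example a line such as `CXX = g++-4.9`, given here purely as a hypothetical value); whether that would have fixed the OpenMP message on this machine is not known from the thread.

```r
# Path to the R executable of the currently running installation.
r_bin <- file.path(R.home("bin"), "R")

# Equivalent to typing `R CMD config <var>` in a terminal.
# Any error text is captured too, so the result is always a single string.
r_config <- function(var) {
  out <- suppressWarnings(
    system2(r_bin, c("CMD", "config", var), stdout = TRUE, stderr = TRUE)
  )
  paste(out, collapse = " ")
}

# C compiler, C++ compiler, and the OpenMP flags used for C++ package code.
vars <- c("CC", "CXX", "SHLIB_OPENMP_CXXFLAGS")
build_config <- vapply(vars, r_config, character(1))
print(build_config)

# Platform, R version and attached packages, handy when comparing two machines.
print(sessionInfo())
```

Comparing this output between a machine where lfcShrink finishes and one where it hangs may help narrow down an installation problem.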

Sorry, no ideas...

No worries, I found a way to make it work. I'll try again when I upgrade to Ubuntu 16, you never know.

I upgraded to Ubuntu 16 and ... it works!!!!!!!!!

I can't tell why it works now, but it might be helpful for others. Michael, thanks a lot for your comments!

Answer: lfcshrink and rlog take forever to compute with a DESeqDataSet with many genotyp
2.1 years ago by
Michael Love (25k)
United States
Michael Love (25k) wrote:

2 * 5 * 2 = 20 samples should take at most a few minutes. Something is wrong. Can you send me the 'dds' to the email address found at maintainer("DESeq2")?

I'd recommend using vst() for variance stabilization and then making PCA plots or heatmaps. I'm not sure why lfcShrink is taking you more than a minute. Here are my timings:

> system.time({ vsd <- vst(dds, blind=FALSE) })
user  system elapsed
2.942   0.116   3.108

> system.time({
+ res <- results(dds, contrast=c("condition","US.RbKO","US.Wt"))
+ res <- lfcShrink(dds, contrast=c("condition","US.RbKO","US.Wt"), res=res)
+ })
user  system elapsed
51.153   1.309  53.800

I'm using DESeq2 v1.17 which is similar to the current release v1.16.

With just 'results', I get this:

> system.time({
+ res <- results(dds, contrast=c("condition","US.RbKO","US.Wt"))
+ })
user  system elapsed
10.996  43.532   6.956

I tried 'lfcShrink':

> system.time({
+ res <- lfcShrink(dds, contrast=c("condition","US.RbKO","US.Wt"), res=res)
+ })

It's been running for 15 minutes and still no results. With a similar comparison, I let 'lfcShrink' run overnight and got no results. I guess it's more of a system problem. Do you have any idea?

That's definitely strange. Can you try on another machine? And/or try reinstalling DESeq2?

If it takes more than 5 minutes, don't bother waiting longer; I would say it's a problem with the installation.
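One way to apply a cutoff like this automatically is base R's setTimeLimit. The sketch below is only an illustration: `run_with_limit` is a hypothetical helper name, and the limit is only checked at points where R can be interrupted, so a call stuck inside compiled code may not stop on time. Matching on the error message also assumes an English locale.

```r
# Evaluate `expr`, giving up after `seconds` of elapsed time.
# Returns the value of `expr`, or NULL if the time limit was reached.
run_with_limit <- function(expr, seconds) {
  setTimeLimit(elapsed = seconds, transient = TRUE)
  # Always clear the limit again, whatever happens below.
  on.exit(setTimeLimit(elapsed = Inf, transient = FALSE), add = TRUE)
  tryCatch(
    expr,  # evaluated lazily here, inside the timed region
    error = function(e) {
      if (grepl("time limit", conditionMessage(e), fixed = TRUE)) {
        message("Stopped after ", seconds, " seconds")
        return(NULL)
      }
      stop(e)  # any other error is passed on unchanged
    }
  )
}

# Usage with the call from this thread (requires an existing 'dds' and 'res'):
# res <- run_with_limit(
#   lfcShrink(dds, contrast = c("condition", "US.RbKO", "US.Wt"), res = res),
#   seconds = 300
# )
```

A NULL result then signals that the call hit the cutoff and that the installation is worth checking before waiting any longer.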

The timings above are from a laptop using 1 core.