Question: DESeq2 with many samples
1
gravatar for igor
3.8 years ago by
igor20
United States
igor20 wrote:

I am trying to run DESeq2 with many samples (single-cell experiment). However, it seems to be choking on the amount of data. When I run DESeq() with parallel=TRUE, I quickly get a memory error (this does not happen for me with smaller experiments). If I run it with parallel=F, I think it just stalls, since it does not finish for over a day. Is there any way around that? Are there some other parameters I can adjust?

deseq2 • 3.6k views
ADD COMMENTlink modified 3.8 years ago by Michael Love24k • written 3.8 years ago by igor20
Answer: DESeq2 with many samples
3
gravatar for Michael Love
3.8 years ago by
Michael Love24k
United States
Michael Love24k wrote:

My typical recommendation when users have 100s of biological replicates within each condition is to use limma + voom, which runs very quickly. In our testing, the differences between our method and linear methods like limma or non-parametric methods like SAMseq disappeared as the number of biological replicates within each condition grew large (as you would expect).

In regards to the memory problem: I did learn in an earlier support thread that, if you run BiocParallel (parallel=TRUE), it can benefit to first clear the R environment of any large data objects which are not needed. R's internal memory garbage collector, gc(), may be making copies of the large objects in the environment in all of the workers.

C: Memory usage for bplapply -- is the entire environment copied over to each worke

ADD COMMENTlink written 3.8 years ago by Michael Love24k

I will look into limma + voom. Is there a way to do the normalization with limma + voom and then bring the results back into DESeq2? I still like the rest of the DESeq functions and how they behave.

ADD REPLYlink written 3.8 years ago by igor20

No, you can't split an analysis to be done by a mix of voom and DESeq2. Which functions from DESeq2 are you referring to when you say "I still like the rest of the DESeq functions and how they behave"?

We could show you what the equivalent moves would be on the voom/limma side given the DESeq2 mojo you want to replicate.

ADD REPLYlink written 3.8 years ago by Steve Lianoglou12k

I have not used voom, so I am not sure how many issues I will run into.

One example would be varianceStabilizingTransformation, which I believe is DESeq2-specific.

ADD REPLYlink written 3.8 years ago by igor20

Well you could try my suggestion regarding cleaning up the environment before running DESeq().

Note that single cell RNA-seq is expected to have peculiar distributional properties unlike the bulk RNA-seq experiments which DESeq2 was designed for. I'm not an expert on which packages are best for single cell analysis. At the least, you should be looking at the data a lot by eye, e.g. with plotCounts().

"Is there a way to do the normalization with limma + voom and then bring the results back into DESeq2"

Not really. The data objects are structured differently. Some functions, like plotMA() in DESeq2, are just directly calling a function plotMA() defined in another Bioconductor package, geneplotter. This function takes a data.frame as input, see the help page for ?plotMA in geneplotter. Other of our plotting functions are quite simple and you can tweak the code yourself if you are familiar with R.

https://github.com/Bioconductor-mirror/DESeq2/blob/master/R/plots.R

ADD REPLYlink written 3.8 years ago by Michael Love24k

I am also using a single-cell specific package, but it requires normalized values as input, which is why I still need to use a more traditional package first.

ADD REPLYlink written 3.8 years ago by igor20
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 120 users visited in the last hour