DESeq2 stuck overnight: should I restart?
Raymond ▴ 20
@raymond-14020

Hi,

My DESeq2 run has been stuck overnight. Is this normal? My dataset contains 635 human samples, which is unusually large:

head(ddsTxi)
class: DESeqDataSet
dim: 6 635
metadata(1): version
assays(2): counts avgTxLength


dds <- DESeq(ddsTxi)
estimating size factors
  Note: levels of factors in the design contain characters other than
  letters, numbers, '_' and '.'. It is recommended (but not required) to use
  only letters, numbers, and delimiters '_' or '.', as these are safe characters
  for column names in R. [This is a message, not an warning or error]
using 'avgTxLength' from assays(dds), correcting for library size
estimating dispersions
gene-wise dispersion estimates
mean-dispersion relationship
  Note: levels of factors in the design contain characters other than
  letters, numbers, '_' and '.'. It is recommended (but not required) to use
  only letters, numbers, and delimiters '_' or '.', as these are safe characters
  for column names in R. [This is a message, not an warning or error]
final dispersion estimates
  Note: levels of factors in the design contain characters other than
  letters, numbers, '_' and '.'. It is recommended (but not required) to use
  only letters, numbers, and delimiters '_' or '.', as these are safe characters
  for column names in R. [This is a message, not an warning or error]

 

Then it got stuck. Should I wait, or should I run it again step by step? Or can I just stop it and run the final nbinomWaldTest step on the current dds?

dds <- estimateSizeFactors(dds)
dds <- estimateDispersions(dds)
dds <- nbinomWaldTest(dds)

Regards,
Raymond
 

deseq2 rnaseq • 1.5k views
@mikelove

What is your design? How many levels per variable?

Usually DESeq2 doesn't take so much time unless there are dozens of variables and hundreds of samples.

> dds <- makeExampleDESeqDataSet(n=100, m=600)
> system.time({ dds <- DESeq(dds, quiet=TRUE) })
   user  system elapsed
 37.552   4.407  42.106

Run time should scale linearly with the number of genes, so for 10,000 genes (100 times the example above) you'd expect about 70 minutes using a single core.
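The 70-minute figure is just proportional scaling of the timing above. A quick sketch of the arithmetic, assuming run time grows linearly with the number of genes at a fixed sample count (the variable names are only illustrative):

```r
# Benchmark from the timing above: 100 genes x 600 samples.
elapsed_sec <- 42.106
bench_genes <- 100

# Assumption: run time is proportional to the number of genes.
target_genes <- 10000
est_min <- elapsed_sec * (target_genes / bench_genes) / 60

# Estimated single-core run time in minutes.
round(est_min)
```

The real number will depend on the machine and on how complex the design is, so treat this as an order-of-magnitude estimate.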

If you use parallel=TRUE with 10 cores, this would probably take ~10 minutes.
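A minimal sketch of what a parallel run might look like, using BiocParallel to register the workers. The simulated dataset and the worker count of 2 are placeholders so the example runs quickly; on a real machine you would use your own object and as many workers as you have cores available:

```r
library(DESeq2)
library(BiocParallel)

# Small simulated dataset as a stand-in; substitute your own DESeqDataSet.
dds <- makeExampleDESeqDataSet(n = 200, m = 12)

# Register a forked backend. The worker count here is illustrative.
# MulticoreParam relies on forking, so on Windows SnowParam is the analogue.
register(MulticoreParam(workers = 2))

# Genes are split across the registered workers.
dds <- DESeq(dds, parallel = TRUE)
res <- results(dds, parallel = TRUE)
```

The speed-up is usually less than the number of workers because of the overhead of splitting and recombining the data.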

You can filter out lowly expressed genes to save time, or switch to using limma-voom.
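One common way to filter is to keep only genes with a minimum count in a minimum number of samples. A sketch on simulated data; the thresholds (10 reads, 3 samples) are illustrative assumptions, not recommendations for this particular dataset:

```r
library(DESeq2)

# Simulated stand-in; substitute your own DESeqDataSet.
dds <- makeExampleDESeqDataSet(n = 1000, m = 12)

# Keep genes with at least 10 reads in at least `min_samples` samples.
# Both thresholds are illustrative; the smallest group size is a typical
# choice for `min_samples`.
min_samples <- 3
keep <- rowSums(counts(dds) >= 10) >= min_samples
dds_filtered <- dds[keep, ]

# Number of genes before and after filtering.
c(before = nrow(dds), after = nrow(dds_filtered))
```

Fewer genes means proportionally less time in the dispersion and Wald test steps.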

In my lab we use limma-voom whenever we have hundreds of samples.
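For reference, a bare-bones limma-voom run might look like the sketch below. The count matrix, sample table and two-factor design are simulated placeholders, not the data from the question. For counts imported with tximport, the tximport vignette describes how to prepare the input for limma-voom.

```r
library(limma)
library(edgeR)

# Simulated stand-ins for a real count matrix and sample table.
set.seed(1)
counts <- matrix(rnbinom(1000 * 12, mu = 100, size = 5), nrow = 1000,
                 dimnames = list(paste0("gene", 1:1000), paste0("s", 1:12)))
coldata <- data.frame(
  batch = factor(rep(c("b1", "b2"), times = 6)),
  condition = factor(rep(c("ctrl", "trt"), each = 6))
)

# Placeholder design; extend the formula with your own covariates.
design <- model.matrix(~ batch + condition, data = coldata)

# Filter low-count genes and compute normalization factors.
dge <- DGEList(counts = counts)
keep <- filterByExpr(dge, design)
dge <- dge[keep, , keep.lib.sizes = FALSE]
dge <- calcNormFactors(dge)

# voom transforms to log-CPM with precision weights, then a linear model is fit.
v <- voom(dge, design)
fit <- eBayes(lmFit(v, design))
top <- topTable(fit, coef = "conditiontrt", number = Inf)
```

Because voom fits a weighted linear model per gene rather than an iterative GLM, it tends to stay fast with hundreds of samples.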


My design is

design = ~ batch+genotype+sex+condition

where batch has 9 levels, genotype has 6 levels, sex has 2 levels, and condition has 4 levels. I did not include PMI here, which is a continuous variable.

I will try limma-voom then. Thanks, Michael!
