Question

DESeq2 has been stuck overnight: should I restart?

Raymond ▴ 20

@raymond-14020

Hi,

My DESeq2 run has been stuck overnight. Is that normal? My dataset contains 635 human samples, which is unusually large:

head(ddsTxi)
class: DESeqDataSet
dim: 6 635
metadata(1): version
assays(2): counts avgTxLength

dds <- DESeq(ddsTxi)
estimating size factors
Note: levels of factors in the design contain characters other than
letters, numbers, '_' and '.'. It is recommended (but not required) to use
only letters, numbers, and delimiters '_' or '.', as these are safe characters
for column names in R. [This is a message, not an warning or error]
using 'avgTxLength' from assays(dds), correcting for library size
estimating dispersions
gene-wise dispersion estimates
mean-dispersion relationship
Note: levels of factors in the design contain characters other than
letters, numbers, '_' and '.'. It is recommended (but not required) to use
only letters, numbers, and delimiters '_' or '.', as these are safe characters
for column names in R. [This is a message, not an warning or error]
final dispersion estimates
Note: levels of factors in the design contain characters other than
letters, numbers, '_' and '.'. It is recommended (but not required) to use
only letters, numbers, and delimiters '_' or '.', as these are safe characters
for column names in R. [This is a message, not an warning or error]

Then it got stuck. Should I wait, or should I run it again step by step? Or can I just stop it and run only the last step, nbinomWaldTest, on the current dds?

dds <- estimateSizeFactors(dds)
dds <- estimateDispersions(dds)
dds <- nbinomWaldTest(dds)

Regards,
Raymond

deseq2 rnaseq • 1.6k views

ADD COMMENT • link updated 6.8 years ago by Michael Love 43k • written 6.8 years ago by Raymond ▴ 20

score 0 · Answer 1 · 2018-10-04

Michael Love 43k

@mikelove

United States

What is your design? How many levels per variable?

Usually DESeq2 doesn't take so much time unless there are dozens of variables and hundreds of samples.

> dds <- makeExampleDESeqDataSet(n=100, m=600)
> system.time({ dds <- DESeq(dds, quiet=TRUE) })
   user  system elapsed
 37.552   4.407  42.106

The example above has 100 genes and 600 samples. Run time should scale linearly with the number of genes, so for 10,000 genes you'd expect about 70 minutes (100 × 42 s) using a single core.
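
As a rough check of that extrapolation, the arithmetic can be written out in R. This is only a sketch: it assumes run time is proportional to the number of genes, takes the timing above (n=100 genes, m=600 samples) as the baseline, and treats the 10-core figure as an ideal speed-up with no overhead.

```r
# Baseline from the timing above: n = 100 genes, m = 600 samples
baseline_genes   <- 100
baseline_elapsed <- 42.106   # elapsed seconds on a single core

# Assumption: run time grows linearly with the number of genes
target_genes <- 10000
est_seconds  <- baseline_elapsed * target_genes / baseline_genes
est_minutes  <- est_seconds / 60

# Ideal speed-up on 10 cores; real runs add scheduling and memory overhead
est_minutes_10 <- est_minutes / 10

round(c(single_core = est_minutes, ten_cores_ideal = est_minutes_10), 1)
```

The single-core estimate comes out at roughly 70 minutes and the ideal 10-core estimate at roughly 7 minutes, so a practical figure of about 10 minutes leaves room for parallel overhead.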

With parallel=TRUE and 10 cores, this would probably take about 10 minutes.
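
A minimal sketch of the parallel run, assuming DESeq2 and BiocParallel are installed and 10 cores are available. `run_deseq_parallel` is a hypothetical wrapper name used only for illustration. `MulticoreParam` relies on forking, which is unavailable on Windows; `SnowParam` is the usual substitute there.

```r
# Hypothetical helper: run DESeq() across several workers.
# Assumes DESeq2 and BiocParallel are installed and `dds` is a DESeqDataSet.
run_deseq_parallel <- function(dds, workers = 10) {
  stopifnot(requireNamespace("DESeq2", quietly = TRUE),
            requireNamespace("BiocParallel", quietly = TRUE))
  # MulticoreParam uses forked processes (Linux/macOS)
  param <- BiocParallel::MulticoreParam(workers = workers)
  # parallel = TRUE splits the per-gene work across the workers in BPPARAM
  DESeq2::DESeq(dds, parallel = TRUE, BPPARAM = param)
}

# Usage (not run here):
#   dds <- run_deseq_parallel(ddsTxi, workers = 10)
```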

You can filter out lowly expressed genes to save time, or switch to using limma-voom.
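
One common way to filter is to keep only genes that reach a minimum count in a minimum number of samples. The sketch below shows the idea on a simulated matrix in base R; the thresholds (`min_count`, `min_samples`) and the toy data are assumptions chosen for illustration, not recommendations for this dataset.

```r
set.seed(1)

# Toy count matrix standing in for counts(dds): 1000 genes x 20 samples.
# Simulated data: the first 500 genes are lowly expressed, the rest are not.
mu  <- rep(c(0.5, 50), each = 500)
cts <- matrix(rnbinom(1000 * 20, mu = mu, size = 2), nrow = 1000)

min_count   <- 10   # assumed threshold on the raw count
min_samples <- 5    # assumed: roughly the size of the smallest group

# Keep genes with at least `min_count` reads in at least `min_samples` samples
keep     <- rowSums(cts >= min_count) >= min_samples
filtered <- cts[keep, , drop = FALSE]

c(before = nrow(cts), after = nrow(filtered))

# With a DESeqDataSet the same idea would read:
#   keep <- rowSums(counts(dds) >= min_count) >= min_samples
#   dds  <- dds[keep, ]
```

Because run time scales with the number of genes, dropping rows before calling DESeq() reduces the work proportionally.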

In my lab we use limma-voom whenever we have hundreds of samples.
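
For reference, a bare-bones limma-voom workflow might look like the sketch below. It assumes the limma and edgeR packages are installed, that `counts` is a genes x samples matrix of counts, and that `design` is a model matrix built from the sample table. `run_limma_voom` is a hypothetical name, and testing the last coefficient is an arbitrary choice for illustration.

```r
# Hypothetical helper: filter, normalize, voom-transform and fit with limma.
# Assumes limma and edgeR are installed.
run_limma_voom <- function(counts, design) {
  stopifnot(requireNamespace("limma", quietly = TRUE),
            requireNamespace("edgeR", quietly = TRUE))
  dge  <- edgeR::DGEList(counts = counts)
  keep <- edgeR::filterByExpr(dge, design)      # drop lowly expressed genes
  dge  <- dge[keep, , keep.lib.sizes = FALSE]
  dge  <- edgeR::calcNormFactors(dge)           # TMM normalization factors
  v    <- limma::voom(dge, design)              # precision weights on log-CPM
  fit  <- limma::eBayes(limma::lmFit(v, design))
  # Illustrative choice: report the last coefficient in the design
  limma::topTable(fit, coef = ncol(design), number = Inf)
}

# Usage (not run here):
#   design <- model.matrix(~ batch + genotype + sex + condition, data = coldata)
#   res    <- run_limma_voom(counts, design)
```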

ADD COMMENT • link 6.8 years ago Michael Love 43k

My design is

design = ~ batch+genotype+sex+condition

where batch has 9 levels, genotype has 6 levels, sex has 2 levels, and condition has 4 levels. I did not include PMI here, which is a continuous variable.
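
To get a feel for the size of this model, the number of coefficients implied by the design can be counted with base R's `model.matrix`. This is an illustrative sketch: the sample table is made up with random assignments for 635 samples, and only the level counts come from the description above.

```r
set.seed(42)
n <- 635

# Hypothetical helper: draw n random labels from a fixed set of levels
random_factor <- function(levels, n) {
  factor(sample(levels, n, replace = TRUE), levels = levels)
}

# Made-up sample table; only the number of levels per variable is real
coldata <- data.frame(
  batch     = random_factor(paste0("b", 1:9), n),
  genotype  = random_factor(paste0("g", 1:6), n),
  sex       = random_factor(c("F", "M"), n),
  condition = random_factor(paste0("c", 1:4), n)
)

mm <- model.matrix(~ batch + genotype + sex + condition, data = coldata)

# Intercept + (9 - 1) + (6 - 1) + (2 - 1) + (4 - 1) = 18 coefficients per gene
ncol(mm)
```

Eighteen coefficients is fewer than the "dozens of variables" mentioned in the answer.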

I will try limma-voom then. Thanks, Michael!

ADD REPLY • link 6.8 years ago Raymond ▴ 20