Is the running time of the DESeq function much longer for a design with two variables than for a design with only one variable?
Sep • 0
@06de5a1f
Last seen 14 months ago
Germany

Hi,

I have a question regarding the running time of the DESeq function in DESeq2.

Is the running time of the DESeq function much longer for a design containing two variables than for a design containing only one variable?

For example, is the running time for condition 1 much longer than for condition 2?

Condition 1:

dds_1 <- DESeqDataSetFromMatrix(countData = bigdf_t, colData = sample_info, design = ~ subject + condition)

dds_1 <- DESeq(dds_1, parallel = TRUE)

Condition 2:

dds_1 <- DESeqDataSetFromMatrix(countData = bigdf_t, colData = sample_info, design = ~ condition)

dds_1 <- DESeq(dds_1, parallel = TRUE)

I would like to add that the number of samples and genes is the same in both conditions.

Thanks a lot in advance for the answer.

DESeq2 • 909 views

I cannot come up with a more precise answer than 'not much'. For a normal-sized analysis with tens to hundreds of samples, it will take a few seconds unless you add many covariates. Is it really an issue? Are you experiencing any problems?
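To make the point about covariates concrete, here is a minimal sketch of how the two designs from the question differ in the number of coefficients fitted per feature. It assumes a hypothetical paired layout of 50 subjects with one pre and one post sample each (the thread only states 100 samples), and the `sample_info` below is made-up stand-in data, not the poster's table.

```r
# Hypothetical stand-in for sample_info: 50 subjects x 2 conditions = 100 samples.
n_subjects <- 50
sample_info <- data.frame(
  subject   = factor(rep(paste0("S", seq_len(n_subjects)), each = 2)),
  condition = factor(rep(c("pre", "post"), times = n_subjects),
                     levels = c("pre", "post"))
)

# Design matrices corresponding to the two formulas in the question.
design_two_vars <- model.matrix(~ subject + condition, data = sample_info)
design_one_var  <- model.matrix(~ condition, data = sample_info)

# Number of coefficients estimated for every feature under each design:
# intercept + (n_subjects - 1) subject terms + 1 condition term, versus
# intercept + 1 condition term.
n_coef <- c(two_vars = ncol(design_two_vars), one_var = ncol(design_one_var))
print(n_coef)
```

Under this assumed layout the first design fits far more coefficients per feature than the second, so each per-feature fit costs more, and that cost is repeated for every feature in the data set.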


The data comprises 100 samples and around 3 million covariates.

When I ran the code for condition 2 (DESeqDataSetFromMatrix(countData = bigdf_t, colData = sample_info, design = ~ condition)), it took around 5 to 6 hours to give me the result. However, the code for condition 1 (DESeqDataSetFromMatrix(countData = bigdf_t, colData = sample_info, design = ~ subject + condition)) has now been running for around 23 hours and I still do not have a result. It is at the gene-wise dispersion estimates step and has not reported any error.

Do you think this is OK, or is something wrong?

@mikelove
Last seen 6 hours ago
United States

Do you mean 3 million features? Can you explain what type of data you are using with DESeq2?


Yes. The data is peptide profiles from pre and post SARS-CoV-2 infection samples.


I'm not sure this is appropriate for DESeq2; I don't know anything about its distribution.

How large are the counts for these 3 million features?

I'd recommend limma-voom with filtering on the minimal count, without knowing if this type of data is appropriate to model with NB. It's faster and more robust to non-Negative-Binomial data.
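As a rough illustration of filtering on a minimal count, the base R sketch below keeps only features that reach a threshold in enough samples. The simulated matrix and the values of `min_count` and `min_samples` are hypothetical placeholders chosen for the example, not recommendations for this data set.

```r
set.seed(1)

# Made-up count matrix: 1000 features (rows) x 100 samples (columns).
counts <- matrix(rnbinom(1000 * 100, mu = 5, size = 0.5), nrow = 1000)
rownames(counts) <- paste0("feature", seq_len(nrow(counts)))

# Hypothetical thresholds: a feature is kept if it has at least
# `min_count` counts in at least `min_samples` samples.
min_count   <- 10
min_samples <- 10

keep <- rowSums(counts >= min_count) >= min_samples
counts_filtered <- counts[keep, , drop = FALSE]

# How many features survive the filter.
print(c(before = nrow(counts), after = nrow(counts_filtered)))
```

With millions of features, removing those that are almost always near zero before fitting reduces the number of per-feature models, which is where most of the running time goes.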


The distribution is Poisson-like, the variance of each feature is larger than its mean, and the measurements are counts per million. Do you believe that DESeq2 won't work for this type of data?


For CPM you should definitely use limma-voom.
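A minimal sketch of what a limma-voom run with the paired design could look like is given below. It assumes the edgeR and limma packages are installed and uses simulated counts with a hypothetical layout of 10 subjects sampled pre and post; none of the object names or sizes come from the poster's data, and how the actual CPM values should be prepared for input is not settled in this thread.

```r
library(edgeR)  # DGEList, filterByExpr, calcNormFactors
library(limma)  # voom, lmFit, eBayes, topTable

set.seed(1)

# Hypothetical paired layout: 10 subjects, each with a pre and a post sample.
n_subjects <- 10
sample_info <- data.frame(
  subject   = factor(rep(paste0("S", seq_len(n_subjects)), each = 2)),
  condition = factor(rep(c("pre", "post"), times = n_subjects),
                     levels = c("pre", "post"))
)

# Simulated counts: 500 features x 20 samples (stand-in data only).
counts <- matrix(rnbinom(500 * nrow(sample_info), mu = 50, size = 2),
                 nrow = 500,
                 dimnames = list(paste0("feature", 1:500), NULL))

# Same design as in the question: subject as blocking factor plus condition.
design <- model.matrix(~ subject + condition, data = sample_info)

# Filter low-count features, then normalize library sizes.
dge  <- DGEList(counts = counts)
keep <- filterByExpr(dge, design = design)
dge  <- dge[keep, , keep.lib.sizes = FALSE]
dge  <- calcNormFactors(dge)

# voom estimates precision weights; lmFit/eBayes fit the linear models.
v   <- voom(dge, design)
fit <- eBayes(lmFit(v, design))

# Top features for the post vs pre effect.
top <- topTable(fit, coef = "conditionpost", number = 10)
print(top)
```

Because limma fits weighted linear models rather than iterating a negative binomial GLM per feature, this kind of pipeline generally scales better to very large feature counts.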
