DESeq function fast on laptop but very slow on server
Saroop • 0
@saroop-24211

I am running DESeq2 version 1.30.0 and doing an LRT on a dataset with 16K genes and 300 samples.

On my Mac laptop the processing is fast. But the same code, same dataset, and same DESeq2 version run slowly on a Linux server with 4 cores. The "mean-dispersion relationship" step is ~20x slower.

I also tried parallel=TRUE. It does make things faster, but the server is still 10x slower. (I did note that the default number of worker threads on the server was 135, which seemed rather high.)

Any advice on how I can debug this issue and improve performance on the server?
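One thing that can be checked on the server is the worker count mentioned above. The following is a minimal sketch, assuming the job really has only 4 cores allotted (the figure quoted in the question) and that the BiocParallel backend is what `parallel=TRUE` picks up. The variable names `allotted_cores` and `n_workers` are illustrative, not part of any package.

```r
library(BiocParallel)

# Cores actually allotted to this job; 4 is the figure quoted for the server above.
# detectCores() reports what the machine has, which can be far more on a shared node.
allotted_cores <- 4L
detected <- parallel::detectCores()
if (is.na(detected)) detected <- 1L
n_workers <- min(allotted_cores, detected)

# Register a backend with an explicit worker count instead of relying on the default.
register(MulticoreParam(workers = n_workers))
print(bpnworkers(bpparam()))  # confirm how many workers will be used

# The registered backend would then be used by a call such as:
# dds <- DESeq(dds, test = "LRT", reduced = ~ Age + PMI + pH, parallel = TRUE)
```

Far more workers than available cores can slow things down rather than speed them up, so capping the count is a cheap first experiment. Whether it explains the gap here is not established by anything in this thread.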

Also, maybe the anomaly is that my laptop is not generating correct results. It is the same code and the same dataset, but for 16K genes my laptop produces results in a few minutes, which may indicate that it is not really doing the full processing?

The same data and the same version of the software should produce identical results. I'd look for a bug or discrepancy.

I reviewed the code and couldn't find any differences that would account for the laptop vs. server gap. Today I did a diff of the output results from the laptop and the server, and they are identical. So I think it is purely a performance issue.

I contacted my IT department, and they asked me to check back with you about the optimal Rcpp configuration. Is there anything you can recommend? I looked at the documentation but didn't find anything on this topic.

(They did mention that our servers are using the new AMD processors, not sure if this makes any difference)

Rcpp and DESeq2 are the same package versions on the server and the laptop. However, I noticed that RcppArmadillo is ‘0.10.1.2.0’ on the server but ‘0.10.1.0.0’ on my laptop.

Does DESeq2 rely on RcppArmadillo? And could that be a potential reason for performance drops?

Thanks
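On the RcppArmadillo question: to my knowledge, DESeq2 lists both Rcpp and RcppArmadillo in the `LinkingTo` field of its DESCRIPTION, so its C++ code is compiled against the Armadillo headers. This can be confirmed on each machine rather than taken on trust. The sketch below also prints which BLAS/LAPACK libraries the R session is linked to, since Armadillo typically hands linear algebra off to them. That the BLAS is relevant here is an assumption, not something the thread confirms.

```r
# Packages whose C++ headers DESeq2 is compiled against
print(packageDescription("DESeq2")$LinkingTo)

# Installed versions, to compare between laptop and server
pkgs <- c("DESeq2", "Rcpp", "RcppArmadillo")
versions <- vapply(pkgs, function(p) as.character(packageVersion(p)), character(1))
print(versions)

# BLAS/LAPACK libraries this R session is linked to (recorded by recent R versions)
si <- sessionInfo()
print(si$BLAS)
print(si$LAPACK)
```

Running the same lines on both machines and comparing the output side by side shows whether anything beyond the RcppArmadillo patch level differs.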

@mikelove

If you're sure it's the same version, it could be that Rcpp is not set up optimally on the cluster. You might ask the cluster IT for help configuring compilation of C++ code on the cluster. I've had ~100x performance differences on my local cluster, which went away after working out the optimal configuration (such that I could obtain performance as fast as on my laptop).
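As a starting point for that conversation with IT, here is a small diagnostic sketch, not a recommended configuration. It prints the C++ compiler flags that R uses when building packages from source, then times a simple linear-algebra loop that can be run unchanged on both machines. The matrix dimensions and iteration count are arbitrary choices for illustration.

```r
# Compiler flags R uses when building C++ code from source packages
r_bin <- file.path(R.home("bin"), "R")
cxx_flags <- system2(r_bin, c("CMD", "config", "CXXFLAGS"), stdout = TRUE)
print(cxx_flags)

# Rough linear-algebra benchmark: run the same lines on laptop and server
set.seed(1)
x <- matrix(rnorm(2000 * 300), nrow = 2000)  # arbitrary 2000 x 300 matrix
y <- rnorm(2000)
timing <- system.time(
  for (i in 1:50) beta <- solve(crossprod(x), crossprod(x, y))
)
print(timing)
```

If the flags show no optimization level, or the benchmark alone is much slower on the server, that points at the build or the linear-algebra library rather than at DESeq2 itself. If both look similar across machines, the cause lies elsewhere.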

Thanks.

Can you give me a rough sense of how long it should take to run an LRT on 16,000 genes, comparing a disorder group against a control group?

full = Age + PMI + pH + Disorder
reduced = Age + PMI + pH

I'm not sure whether you can answer this question, but I will definitely send a message to my IT contacts.

Thanks again.
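One way to get a rough number without waiting for the full run is to time the same LRT on a small dataset on each machine. The sketch below uses simulated counts from `makeExampleDESeqDataSet()` and made-up covariates that only borrow the names from the design above. The sizes (500 genes, 24 samples) and the covariate distributions are assumptions for illustration, so the timing says nothing exact about the real 16K x 300 dataset.

```r
library(DESeq2)

set.seed(1)
# Simulated counts: 500 genes x 24 samples (far smaller than the data in the question)
dds <- makeExampleDESeqDataSet(n = 500, m = 24)

# Hypothetical covariates named after the design above;
# the continuous ones are centered and scaled
dds$Age <- as.numeric(scale(rnorm(24, mean = 50, sd = 10)))
dds$PMI <- as.numeric(scale(rnorm(24, mean = 20, sd = 5)))
dds$pH  <- as.numeric(scale(rnorm(24, mean = 6.5, sd = 0.3)))
dds$Disorder <- factor(rep(c("control", "disorder"), each = 12))

design(dds) <- ~ Age + PMI + pH + Disorder

# Time the likelihood ratio test of the full design against the reduced one
timing <- system.time(
  dds <- DESeq(dds, test = "LRT", reduced = ~ Age + PMI + pH)
)
print(timing)

res <- results(dds)
print(head(res))
```

Comparing `timing` between laptop and server on identical input separates a machine-level slowdown from anything specific to the real data.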
