Question: DESeq2 for differential gene expression on GTEx dataset
22 days ago by
vikram0 wrote:


I'm new to the field, and I'm trying to run a differential gene expression analysis on the GTEx dataset. My aim is to identify sets of genes that (with some confidence) identify each of the 50-odd tissue types in the dataset. The dataset is bulk RNA-seq with ~50k genes and ~12k samples. The resources I have at hand are ~50 CPUs, each with 12 cores, and plenty of RAM.

I have

1) Browsed through the DESeq2 vignettes, and I feel it may be a good fit.

2) Removed housekeeping genes, in the hope that it makes the task of the software a little easier.

3) Set the code running.

I was wondering if

1) My choice of algorithm is advisable, and

2) anyone has an estimate of how much time it may take the code to run

I'd be glad to give more details, if you need them.

Thanks for reading through. :-)

22 days ago by
Michael Love15k wrote:


With hundreds of samples per condition/group and thousands of samples overall, I personally tend to switch to faster linear models, like limma-voom, for differential expression. The GLM that DESeq2 fits has to do a lot of computation to iteratively find the solution (a beta coefficient for each tissue) for every gene. This is much faster with the linear model, where the coefficients come from a closed-form least squares solution.
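
To make the speed argument concrete, here is a minimal Python sketch for a single gene. It is not how DESeq2 or limma-voom are implemented: a Poisson GLM stands in for DESeq2's negative binomial GLM, a plain log2(count + 0.5) transform stands in for voom's log-CPM values and precision weights, and the function names and toy counts are made up for illustration. With one indicator column per tissue, the least squares fit reduces to a per-tissue mean computed in one pass, while the GLM coefficients are reached by repeated Newton updates.

```python
import math


def fit_linear(log_expr, tissues):
    """Least squares with one indicator column per tissue (no intercept).

    For this design the closed-form solution is the per-tissue mean,
    so a single pass over the samples is enough.
    """
    sums, sizes = {}, {}
    for value, tissue in zip(log_expr, tissues):
        sums[tissue] = sums.get(tissue, 0.0) + value
        sizes[tissue] = sizes.get(tissue, 0) + 1
    return {t: sums[t] / sizes[t] for t in sums}


def fit_poisson_glm(counts, tissues, tol=1e-8, max_iter=100):
    """Log-link Poisson GLM with the same design, fitted by Newton steps.

    Returns (betas, iterations). Every tissue starts at the log of the
    overall mean count, which is assumed to be positive.
    """
    totals, sizes = {}, {}
    for y, tissue in zip(counts, tissues):
        totals[tissue] = totals.get(tissue, 0) + y
        sizes[tissue] = sizes.get(tissue, 0) + 1
    start = math.log(sum(counts) / len(counts))
    betas = {t: start for t in totals}
    for iteration in range(1, max_iter + 1):
        largest_step = 0.0
        for t in betas:
            mu = math.exp(betas[t])
            # Newton step: score (sum(y) - n*mu) divided by information (n*mu).
            step = (totals[t] - sizes[t] * mu) / (sizes[t] * mu)
            betas[t] += step
            largest_step = max(largest_step, abs(step))
        if largest_step < tol:
            return betas, iteration
    return betas, max_iter


# Toy counts for one gene in two tissues (invented numbers).
tissues = ["liver", "liver", "liver", "brain", "brain", "brain"]
counts = [90, 110, 100, 8, 12, 10]

linear_betas = fit_linear([math.log2(c + 0.5) for c in counts], tissues)
glm_betas, n_iter = fit_poisson_glm(counts, tissues)

print("linear model (one pass):", linear_betas)
print("Poisson GLM:", glm_betas, "after", n_iter, "Newton iterations")
```

In a real analysis this iteration is repeated for every one of the ~50k genes with ~50 coefficients each, and a negative binomial fit also has to estimate dispersions, which is where the extra cost comes from.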


Thanks a lot for the reply. I'll try limma-voom.

written 21 days ago by vikram0