Question: DESeq2: any problem with an unbalanced number of samples in a normal/tumor study?
bharata1803 wrote:

Hello,

I have downloaded TCGA datasets (HTSeq count files) for several cancer types. I noticed that each dataset has a large number of tumor samples but far fewer normal samples: for example, only 60 normal samples against ~500 or more tumor samples. Will this imbalance cause any problem if I use DESeq2 to find differentially expressed genes? Thank you very much.

modified 4 weeks ago by Michael Love • written 4 weeks ago by bharata1803

I don't believe there will be any major problem due to imbalance; I'd be more worried about lack of matched tumour:normal samples (seems unlikely that they've taken 9 tumour samples from each patient providing a normal), but that's the nature of public clinical data.

written 4 weeks ago by Gavin Kelly
Michael Love wrote:

It's not a problem for DESeq2 to have unbalanced sample sizes.
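As an illustration, here is a minimal sketch of a DESeq2 run with unequal group sizes. It uses simulated counts from `makeExampleDESeqDataSet` rather than TCGA data; the 4 vs 24 split, the `normal`/`tumor` labels, and the object names are assumptions chosen to keep the example small.

```r
library(DESeq2)

set.seed(1)

# Simulated counts: 1000 genes x 28 samples (a stand-in for real HTSeq counts)
dds <- makeExampleDESeqDataSet(n = 1000, m = 28)

# Relabel the samples so the groups are unbalanced: 4 normal vs 24 tumor
# (hypothetical labels; with real data these come from the sample sheet)
dds$condition <- factor(rep(c("normal", "tumor"), times = c(4, 24)),
                        levels = c("normal", "tumor"))
design(dds) <- ~ condition

# Standard workflow; nothing special is needed for unequal group sizes
dds <- DESeq(dds)
res <- results(dds, contrast = c("condition", "tumor", "normal"))
summary(res)
```

Putting `normal` first in `levels` makes it the reference level, so log2 fold changes are reported as tumor over normal.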

Note that with more than 100 samples per group, there is a substantial speed-up from using a linear model, such as limma-voom, instead of a generalized linear model. I tend to use limma when I have hundreds of samples per group.
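The limma-voom route mentioned above could look roughly like the following sketch. The counts are simulated from a negative binomial distribution, and the group sizes (60 normal, 500 tumor), the `mu` and `size` values, and all object names are assumptions for illustration only.

```r
library(limma)
library(edgeR)

set.seed(1)

# Hypothetical group labels mirroring the sizes in the question
group <- factor(rep(c("normal", "tumor"), times = c(60, 500)),
                levels = c("normal", "tumor"))

# Simulated count matrix: 1000 genes x 560 samples
n_genes <- 1000
counts <- matrix(rnbinom(n_genes * length(group), mu = 100, size = 5),
                 nrow = n_genes,
                 dimnames = list(paste0("gene", seq_len(n_genes)),
                                 paste0("sample", seq_along(group))))

# Build a DGEList, drop low-count genes, and compute TMM normalization factors
dge <- DGEList(counts = counts, group = group)
keep <- filterByExpr(dge, group = group)
dge <- dge[keep, , keep.lib.sizes = FALSE]
dge <- calcNormFactors(dge)

# voom estimates the mean-variance trend and returns precision weights,
# after which an ordinary weighted linear model is fitted per gene
design <- model.matrix(~ group)
v <- voom(dge, design)
fit <- eBayes(lmFit(v, design))

# Full table of tumor vs normal results
top <- topTable(fit, coef = "grouptumor", number = Inf)
head(top)
```

Because the per-gene fit is a weighted least-squares problem rather than an iterative GLM fit, this scales well to hundreds of samples per group.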

written 4 weeks ago by Michael Love

I am not in a hurry and my computer is quite good. For almost 600 samples it took around 1 hour, so I don't think that is a problem. Getting the log-transformed read counts as expression levels for the samples may take a really long time, though. In this post: "DESeq2 rlog function takes too long", I asked about this problem and you suggested a tweak. I tried that code a long time ago and saw some increase in speed. I will try it again now. Thank you.

written 4 weeks ago by bharata1803

That tweak is now a fully supported function (I'll make a note on that post):

vsd <- vst(dds, blind=FALSE)
written 4 weeks ago by Michael Love
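For context, a minimal sketch of how the `vst` call fits into a script and how the transformed values can be extracted. The dataset is simulated with `makeExampleDESeqDataSet`; its dimensions and the name `mat` are assumptions, not the poster's data.

```r
library(DESeq2)

set.seed(1)

# Simulated dataset: 2000 genes x 20 samples (stand-in for a real DESeqDataSet)
dds <- makeExampleDESeqDataSet(n = 2000, m = 20)

# Variance stabilizing transformation; blind = FALSE lets the dispersion
# trend be estimated using the design formula stored in dds
vsd <- vst(dds, blind = FALSE)

# Matrix of transformed values on a log2-like scale, genes x samples
mat <- assay(vsd)
dim(mat)
```

`vst` fits the dispersion trend on a subset of genes (controlled by its `nsub` argument), which is why it tends to be much faster than `rlog` on large sample sets.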