DESeq2: working with an unbalanced number of samples in a tumor study?
Cihat • 0
@Cihat-24724

I'd like to apply DESeq2 to breast cancer RNA-seq expression data to compare metastasis and non-metastasis patient groups. I have 880 samples in the non-metastasis group and only 20 samples in the metastasis group. I searched for whether it makes sense to use DESeq2 (or any other differential expression method) with such a difference in group sizes, but could not find many resources to justify my study design.

I only came across a few Biostars and Bioconductor posts asking about, for example, 15 vs 3 samples. As far as I understand, DESeq2 generally works fine with unbalanced sample sizes, but does that also hold for a 20 vs 880 comparison?

I also did a PubMed search, but as far as I can see, no studies address this problem.

Thank you in advance.

Deseq2 RNASeqRData unbalancedsamplesize
@mikelove

There is no problem with the balance, but I would tend to use limma-voom for analyses with hundreds of bulk RNA-seq samples, as it is much faster. I like to use DESeq2 for its Bayesian moderation of fold changes in particular, but that is not relevant with a sample size this large.

Michael, thank you so much for your quick reply.

Hi, I have a similar problem in my analysis: 35 vs 800 samples. I found several posts where you commented that "There is no problem with the balance" for DESeq2. I also found this figure comparing 3 vs 3 and 2 vs 3 samples in one of your replies, but do you have a literature reference supporting your statement for highly imbalanced datasets? Thank you in advance.

It's just that there is no breakdown point for linear models with imbalanced data. The estimates are not biased, although you lose efficiency (power).
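
A small simulation can illustrate the point about bias and efficiency. This is only a sketch under simplifying assumptions that are not from the thread: one gene, Gaussian noise, and an ordinary two-group linear model (where the fitted group coefficient equals the difference in group means), rather than the negative binomial model DESeq2 fits. The function names, effect size, and noise level are hypothetical.

```python
import numpy as np


def simulate_difference(n_a, n_b, true_diff=1.0, sigma=1.0, n_sims=2000, seed=0):
    """Simulate n_sims two-group experiments and return the estimated group
    difference from each one.

    In a linear model with a single two-level group factor, the least-squares
    estimate of the group effect is the difference in group means, so that
    difference is computed directly here.
    """
    rng = np.random.default_rng(seed)
    group_a = rng.normal(0.0, sigma, size=(n_sims, n_a))
    group_b = rng.normal(true_diff, sigma, size=(n_sims, n_b))
    return group_b.mean(axis=1) - group_a.mean(axis=1)


def theoretical_se(n_a, n_b, sigma=1.0):
    """Standard error of the difference in means: sigma * sqrt(1/n_a + 1/n_b)."""
    return sigma * np.sqrt(1.0 / n_a + 1.0 / n_b)


if __name__ == "__main__":
    # Same total of 900 samples, split unevenly and evenly.
    for n_a, n_b in [(880, 20), (450, 450)]:
        estimates = simulate_difference(n_a, n_b)
        print(
            f"{n_a} vs {n_b}: mean estimate = {estimates.mean():.3f}, "
            f"empirical SE = {estimates.std():.3f}, "
            f"theoretical SE = {theoretical_se(n_a, n_b):.3f}"
        )
```

Under these assumptions the average estimate stays close to the true difference of 1.0 for both designs, so the imbalance introduces no bias. The cost shows up in the standard error: the formula gives about 0.226 for 880 vs 20 and about 0.067 for 450 vs 450, because the term for the small group (1/20) dominates. In other words, precision in the 880 vs 20 design is limited almost entirely by the 20 samples.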
