Question

DESeq2 with imbalanced design

Alex • 0

@5cf42eef

I know that a number of questions have already been asked on this topic, but they seem to ask what happens in the case of 5 samples vs. 3 samples, or even 150 vs. 3. My question concerns an experimental design comparing two experiments, each of which has timepoints unique to it. That is, experiment 1 has a 12-hr timepoint that does not exist in experiment 2, and experiment 2 has a 3-day timepoint that does not exist in experiment 1. Aside from these, I have 3 overlapping timepoints: baseline, d1, and d2.

My understanding of DESeq2 is that as long as the biology is maintained (that is, the samples were treated the same way), the more information I feed into it, the more accurate a model it can generate. So I am inclined to feed DESeq2 all of the timepoints, but limit the differential expression analysis to the overlapping timepoints.

My questions are:

Is this valid? Will I produce a more accurate model if I feed DESeq2 all of the RNA-seq samples I have?
Will I be able to perform differential expression analysis on non-overlapping timepoints if they are normalized to a shared control timepoint (baseline/untreated)? Will integrating the 2 experiments still eliminate batch effects, even among samples that are non-overlapping?
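
The layout described above can be tabulated to make the imbalance explicit. The following is a minimal Python sketch, not DESeq2 code: the labels `exp1`, `exp2`, `d3`, and the 3 replicates per timepoint are assumptions made for illustration, since the post does not give replicate counts.

```python
from collections import Counter

# Hypothetical sample sheet: one (experiment, timepoint) pair per sample.
# Three replicates per timepoint is an assumption, not stated in the post.
samples = (
    [("exp1", tp) for tp in ["baseline", "12h", "d1", "d2"] for _ in range(3)]
    + [("exp2", tp) for tp in ["baseline", "d1", "d2", "d3"] for _ in range(3)]
)

counts = Counter(samples)
experiments = sorted({exp for exp, _ in samples})
timepoints = ["baseline", "12h", "d1", "d2", "d3"]

# A timepoint is "shared" only if every experiment has at least one sample there.
shared = [tp for tp in timepoints if all(counts[(e, tp)] > 0 for e in experiments)]
unique = [tp for tp in timepoints if tp not in shared]

# Print the experiment-by-timepoint table; zeros mark the empty cells.
for tp in timepoints:
    print(tp, [counts[(e, tp)] for e in experiments])
print("shared:", shared)
print("unique:", unique)
```

The empty cells (12h in the second experiment, d3 in the first) are the timepoints whose effect cannot be separated from the experiment effect using this table alone.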

thanks!

Deseq • 984 views

ADD COMMENT • link 2.4 years ago Alex • 0

score 1 · Accepted Answer · 2022-03-16

Michael Love 42k

@mikelove

This has been asked, but it's probably impossible to find the answer.

I don't see great value in adding a sample that also gets its own coefficient. It doesn't inform the dispersion estimate, which would be the benefit of an added sample. You can do it, but it doesn't help, and I don't recommend it.
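
One way to see why such a sample adds nothing is to count residual degrees of freedom (samples minus independent coefficients). The sketch below is a plain Python illustration of that counting for an ordinary indicator-coded linear model; it is not DESeq2's dispersion estimation, and the helper names and the 3-replicate layout are assumptions made for the example.

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank of a small matrix via exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        # Find a row at or below the current rank with a nonzero entry in this column.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate this column from every other row.
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

def design_matrix(groups):
    """Intercept column plus one indicator column per non-reference group."""
    levels = sorted(set(groups))
    return [[1] + [int(g == lvl) for lvl in levels[1:]] for g in groups]

def residual_df(groups):
    """Number of samples minus the number of independent coefficients."""
    X = design_matrix(groups)
    return len(X) - matrix_rank(X)

# Hypothetical layout: 3 replicates at each of the shared timepoints.
base = ["baseline"] * 3 + ["d1"] * 3 + ["d2"] * 3
df_base = residual_df(base)

# Add one sample at a new timepoint, which therefore needs its own coefficient.
df_with_extra = residual_df(base + ["12h"])

print(df_base, df_with_extra)
```

In this toy setup the extra sample and its extra coefficient cancel out, so the residual degrees of freedom are the same before and after.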

For further examination of how to set up the design with time points and batches, I'd recommend working with a statistician. This is an important part of the process, and I just don't have sufficient time to provide statistical consulting on the support site; I have to restrict myself to software-related questions about the packages I support.

ADD COMMENT • link 2.4 years ago Michael Love 42k

Thanks for taking the time to reply! I read a few posts on it, but I wasn't sure, so I thought I would ask.

ADD REPLY • link 2.4 years ago Alex • 0