Hi everyone,
We are comparing data (HiSeq3000) we collected a few years back with newer data (NovaSeq6000) and are using DESeq2 for pairwise comparisons.
While we expected some variation due to the experimental setup (years apart, different operators, etc.), we are seeing very large differences between biological replicates depending on which sequencer was used. Perhaps the differences are solely the result of biological/technical variation in sample preparation, but we can't exclude differences in library preparation and sequencing.
- Has anyone else observed major differences between sequencers in this regard?
- Are there methods we can use to try to correct for any such differences?
Thanks!
JP
Hi Michael,
The goal is indeed to test across condition. I had tried the
~ type + condition
but it did not appear to make any obvious difference downstream (at least not in the sample and gene clustering plots). Below is a simplified sampleTable to show the main features (shown as n=2, but in fact we have n=4):
Our design setup is:
ddsHTSeq <- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable, directory = directory, design = ~ type + condition)
In our sample clustering, we see our condition samples cluster (as expected) only if they are from the same sequencer. The plot is split cleanly by sequencer type, and the samples then cluster by condition within each of these groups.
I've run the design with and without "type" and see no difference. I appreciate any insight, and I apologize if I'm missing or misunderstanding anything obvious.
Thank you for your initial response as well :)
JP
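Transformations like vst(), and plots built on them like plotPCA(), do not remove variance associated with variables in the design. So you won't see a difference in the clustering or PCA plots using the default steps. Nevertheless, using ~type + condition will make the tests come out right, because the type effect is accounted for when testing condition.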
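To illustrate the testing side, here is a minimal sketch. It uses simulated counts from makeExampleDESeqDataSet() as a stand-in for the real HTSeq files, and the sequencer labels and their balanced layout across conditions are assumptions made for illustration only:

```r
library(DESeq2)

set.seed(1)
# Simulated stand-in for the real data: 8 samples, conditions "A" and "B"
dds <- makeExampleDESeqDataSet(n = 1000, m = 8)

# Hypothetical sequencer labels, assigned so that each condition
# has samples from both sequencers (2 + 2)
dds$type <- factor(rep(c("HiSeq3000", "NovaSeq6000"), times = 4))

# Sequencer is a blocking factor; the variable of interest goes last
design(dds) <- ~ type + condition
dds <- DESeq(dds)

# Condition effect (B vs A), adjusted for sequencer type
res <- results(dds, contrast = c("condition", "B", "A"))
summary(res)
```

The PCA and clustering plots will look the same with or without type in the design; it is the p-values and fold changes in res that take the sequencer into account.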
If you want to get some idea of a PCA plot with the type effect removed, you can use removeBatchEffect from the limma package to remove shifts that can be associated with the batch (or type) variable. I've posted code like this before.
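A minimal sketch of that approach, not necessarily the exact code from the earlier post. The dataset is simulated with makeExampleDESeqDataSet(), and the sequencer labels are hypothetical and balanced across conditions; with real data you would start from your own dds:

```r
library(DESeq2)
library(limma)

set.seed(1)
# Simulated stand-in for the real data (it contains no true sequencer effect)
dds <- makeExampleDESeqDataSet(n = 3000, m = 8)
dds$type <- factor(rep(c("HiSeq3000", "NovaSeq6000"), times = 4))
design(dds) <- ~ type + condition

# Variance-stabilized data; the type effect is still present here
vsd <- vst(dds, blind = FALSE)
plotPCA(vsd, intgroup = c("type", "condition"))

# Remove shifts associated with 'type' while protecting 'condition'
mat <- assay(vsd)
keep <- model.matrix(~ condition, data = as.data.frame(colData(vsd)))
assay(vsd) <- removeBatchEffect(mat, batch = vsd$type, design = keep)

# Same plot, now with the type shifts removed
plotPCA(vsd, intgroup = c("type", "condition"))
```

The adjusted matrix is meant for visualization only. For testing, keep using the original counts with type in the design.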
Dear Michael,
I have a similar problem. With the experimental design above, I want to find DE genes between WT-male and WT-female, but these two groups were run on two different sequencers.
Will type (sequencer) mask the DE genes between the conditions (WT-male and WT-female)? How can I remove the effect of the sequencer (type) when testing for DE genes between conditions?
Many Thanks
You can't remove the difference if it is confounded with the biological variable of interest (it sounds like that's what you're saying).
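One way to check for this kind of confounding before fitting anything is to look at the rank of the model matrix. The sketch below uses base R only; both sample tables are hypothetical and just illustrate the two situations:

```r
# Hypothetical layout: each condition was run on only one sequencer
confounded <- data.frame(
  condition = factor(c("WT_male", "WT_male", "WT_female", "WT_female")),
  type      = factor(c("HiSeq3000", "HiSeq3000", "NovaSeq6000", "NovaSeq6000"))
)

# Hypothetical layout: each condition was run on both sequencers
balanced <- data.frame(
  condition = factor(c("WT_male", "WT_male", "WT_female", "WT_female")),
  type      = factor(c("HiSeq3000", "NovaSeq6000", "HiSeq3000", "NovaSeq6000"))
)

# TRUE if every coefficient in ~ type + condition can be estimated
is_full_rank <- function(samples) {
  mm <- model.matrix(~ type + condition, data = samples)
  qr(mm)$rank == ncol(mm)
}

is_full_rank(confounded)  # FALSE: type and condition carry the same information
is_full_rank(balanced)    # TRUE: the two effects can be separated
```

In the confounded layout the type column and the condition column of the model matrix are exact complements, so no model can tell a sequencer effect from a condition effect.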