Comparing RNA-Seq expression data from different sequencers (HiSeq3000 and NovaSeq6000) - are there methods for batch effect corrections?
JP Carter ▴ 40
@jp-carter-15371
Nashville, TN

Hi everyone,

We are comparing data (HiSeq3000) we collected a few years back with newer data (NovaSeq6000) and are using DESeq2 for pairwise comparisons.

While we expected some variation due to the experimental setup (years apart, different operators, etc.), we are seeing very large differences between biological replicates depending on which sequencer was used. Perhaps the differences are solely the result of biological/technical variation in sample preparation, but we can't exclude differences in library preparation and sequencing.

  1. Has anyone else observed major differences between sequencers in this regard?
  2. Are there methods we can use to try to correct for any such differences?

Thanks!

JP

deseq2 rna-seq illumina batch effect
@mikelove
United States

What's your ultimate goal? To test across condition? In the vignette, we use ~type + condition to demonstrate how to control for different batches or sequencing runs while looking for consistent differences.


Hi Michael,

The goal is indeed to test across condition. I had tried ~ type + condition, but it did not appear to make any obvious difference downstream (at least not in the sample and gene clustering plots).

Below is a simplified sampleTable to show the main features (shown as n=2, but in fact we have n=4):

sampleName  type         condition
1           Hiseq3000    WT-males
2           Hiseq3000    WT-males
3           NovaSeq6000  WT-females
4           NovaSeq6000  WT-females
5           Hiseq3000    WT-Treatment-males
6           NovaSeq6000  WT-Treatment-males
5           Hiseq3000    WT-Treatment-females
6           NovaSeq6000  WT-Treatment-females
7           NovaSeq6000  KO-males
8           Hiseq3000    KO-males
9           NovaSeq6000  KO-females
10          Hiseq3000    KO-females
11          NovaSeq6000  KO-Treatment-males
11          NovaSeq6000  KO-Treatment-males
12          NovaSeq6000  KO-Treatment-males
13          NovaSeq6000  KO-Treatment-females

Our design setup is:

ddsHTSeq <- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable, directory = directory, design = ~ type + condition)

In our sample clustering, samples from the same condition cluster together (as expected) only if they come from the same sequencer. The plot splits cleanly by sequencer type, and the samples then cluster by condition within each of these groups.

I've run the design with and without "type" and see no difference. I appreciate any insight, and I apologize if I'm missing or misunderstanding anything obvious.

Thank you for your initial response as well :)

JP


Transformations like vst(), and plots built on them such as plotPCA(), do not remove variation associated with the variables in the design, so you won't see a difference using the default steps. Nevertheless, using ~type + condition will make the tests come out right.

If you want to get some idea of a PCA plot with the type effect removed, I've posted code like this before:

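library(limma)  # provides removeBatchEffect
mat <- assay(vsd)  # transformed values, genes x samples
assay(vsd) <- removeBatchEffect(mat, batch = vsd$batch)  # subtract per-batch shifts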

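Here, removeBatchEffect from the limma package removes shifts that can be associated with the batch variable (type in your design, so you would pass vsd$type). This is only for visualization; for testing, keep the counts as they are and use the ~type + condition design.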
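
To make the idea concrete, here is a minimal base-R sketch on simulated data. It is not limma's implementation, just the same principle: fit a linear model per gene with condition and type, then subtract only the fitted type term. All names and numbers (n_genes, the effect sizes, the WT/KO labels) are made up for illustration, and the design is assumed to be balanced so that every condition appears on both sequencers.

```r
# Base-R sketch on simulated log-scale data; not limma's code.
set.seed(1)
n_genes   <- 200
type      <- factor(rep(c("HiSeq3000", "NovaSeq6000"), times = 4))
condition <- factor(rep(c("WT", "KO"), each = 4), levels = c("WT", "KO"))

# Simulated expression: noise + a per-gene sequencer shift + a condition effect
batch_shift <- rnorm(n_genes, sd = 2)
cond_effect <- c(rnorm(40, sd = 1), rep(0, n_genes - 40))
mat <- matrix(rnorm(n_genes * 8, sd = 0.2), nrow = n_genes)
mat <- mat +
  outer(batch_shift, as.numeric(type == "NovaSeq6000")) +
  outer(cond_effect, as.numeric(condition == "KO"))

# Least-squares fit of gene ~ condition + type for all genes at once
X    <- model.matrix(~ condition + type)
beta <- solve(crossprod(X), crossprod(X, t(mat)))  # coefficients x genes

# Subtract only the fitted type term; the condition term is left in place
type_col  <- which(colnames(X) == "typeNovaSeq6000")
corrected <- mat - t(X[, type_col, drop = FALSE] %*% beta[type_col, , drop = FALSE])

# First principal component of the samples, before and after correction
pc1_before <- prcomp(t(mat))$x[, 1]
pc1_after  <- prcomp(t(corrected))$x[, 1]

# Which variable does PC1 track in each case?
abs(cor(pc1_before, as.numeric(type)))
abs(cor(pc1_after,  as.numeric(condition)))
```

In this simulation PC1 follows the sequencer before the correction and the condition after it, which is the pattern you would hope to see in the corrected PCA plot. With real data, the condition term can be protected in the same way by passing a design matrix to removeBatchEffect through its design argument.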


Dear Michael,

I have a similar problem. In the experimental design above, I want to find DE genes between WT-males and WT-females, but these two groups were sequenced on two different sequencers.

Will type (sequencer) mask the DE genes between the conditions (WT-males and WT-females)? How can I remove the effect of the sequencer (type) on the DE genes for condition?

Many thanks


You can't remove the difference if it is confounded with the biological variable of interest (it sounds like that's what you're saying).
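
One way to check for this kind of confounding before fitting anything is to cross-tabulate the two variables and look at the rank of the model matrix. The sketch below uses base R only; the two small sample tables and the helper is_full_rank are hypothetical and are not taken from the data in this thread.

```r
# Hypothetical sample tables, for illustration only.
# Confounded: each condition was run on exactly one sequencer.
confounded <- data.frame(
  type      = factor(c("HiSeq3000", "HiSeq3000", "NovaSeq6000", "NovaSeq6000")),
  condition = factor(c("WT_males", "WT_males", "WT_females", "WT_females"))
)
# Crossed: each condition was run on both sequencers.
crossed <- data.frame(
  type      = factor(c("HiSeq3000", "NovaSeq6000", "HiSeq3000", "NovaSeq6000")),
  condition = factor(c("WT_males", "WT_males", "WT_females", "WT_females"))
)

# The type and condition effects can be separated only if the model matrix
# for ~ type + condition has full column rank.
is_full_rank <- function(df) {
  X <- model.matrix(~ type + condition, data = df)
  qr(X)$rank == ncol(X)
}

table(confounded$type, confounded$condition)  # off-diagonal cells are empty
is_full_rank(confounded)                      # type and condition carry the same split
is_full_rank(crossed)                         # both effects are estimable
```

When the check fails, the sequencer difference and the biological difference are the same comparison, so no design formula or correction step can tell them apart. Only conditions that appear on both sequencers can be compared with the sequencer effect accounted for.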
