Question: RNA-seq batch correction using technical replicates profiled across batches
enricoferrero (Switzerland) wrote, 3 months ago:

Hello,

I have the following (certainly not ideal) RNA-seq experimental design:

  • Batch 1 contains 40 samples with condition A + 12 samples with condition B
  • Batch 2 contains 40 samples with condition C + the same 12 samples with condition B

So, the 12 samples with condition B are technical replicates that have been profiled across the two batches.

I'm actually not interested in condition B; I need to compare the 40 samples with condition A from batch 1 with the 40 samples with condition C from batch 2.

Are there Bioconductor packages (or other methods/approaches) that will allow me to use the 12 technical replicates profiled across batches to correct for batch effects before performing a differential expression analysis?

I already came across RUVSeq (see this question) and I'm looking for alternative approaches.

Thank you!

Tags: limma, edger, sva, deseq2, ruvseq
Answer: RNA-seq batch correction using technical replicates profiled across batches
Gordon Smyth (Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia) wrote, 3 months ago:

I would use limma for an experiment like this. I would analyse all the samples together, including a batch effect term in the linear model and using duplicateCorrelation() to link the technical replicates of the same samples. (The duplicateCorrelation block variable is the same as the sample ID.) Differential expression can then be done normally between A and C and the batch correction will happen automatically.

The above model gets around the fact that conditions A and C do not occur together in the same batch. Condition A is compared with condition B within batch 1 and condition C is compared with B within batch 2. The difference between A and C, which is what you eventually want, is inferred from A - B as compared to C - B.
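
For readers who want to see the shape of such an analysis, here is a minimal sketch in R. It is an illustration under stated assumptions, not code from the original answer: the counts are simulated, the names `targets`, `sample_id`, `counts` and `AvsC` are hypothetical, and the use of `voom()` to bring RNA-seq counts into limma is an assumed (though common) choice.

```r
library(limma)
library(edgeR)

set.seed(1)

# Hypothetical sample sheet mirroring the design in the question:
# batch 1 = 40 A + 12 B, batch 2 = 40 C + the same 12 B samples again.
targets <- data.frame(
  sample_id = c(paste0("A", 1:40), paste0("B", 1:12),
                paste0("C", 1:40), paste0("B", 1:12)),
  condition = factor(c(rep("A", 40), rep("B", 12), rep("C", 40), rep("B", 12))),
  batch     = factor(c(rep(1, 52), rep(2, 52)))
)

# Simulated counts, purely for illustration: each biological sample gets its
# own expected expression, so the two libraries of a B sample are correlated.
n_genes <- 500
bio <- unique(as.character(targets$sample_id))
mu <- matrix(rgamma(n_genes * length(bio), shape = 5, rate = 0.05),
             nrow = n_genes, dimnames = list(NULL, bio))
counts <- matrix(rpois(n_genes * nrow(targets),
                       lambda = mu[, as.character(targets$sample_id)]),
                 nrow = n_genes,
                 dimnames = list(paste0("gene", seq_len(n_genes)),
                                 paste0("lib", seq_len(nrow(targets)))))

dge <- DGEList(counts = counts)
dge <- calcNormFactors(dge)

# One coefficient per condition plus a batch term.
design <- model.matrix(~ 0 + condition + batch, data = targets)

# Log-CPM values with precision weights (assumed preprocessing step).
v <- voom(dge, design)

# Link the technical replicates: the block variable is the sample ID.
corfit <- duplicateCorrelation(v, design, block = targets$sample_id)

fit <- lmFit(v, design, block = targets$sample_id,
             correlation = corfit$consensus.correlation)

# A versus C, adjusted for batch via the shared B samples.
contr <- makeContrasts(AvsC = conditionA - conditionC, levels = design)
fit2 <- eBayes(contrasts.fit(fit, contr))
topTable(fit2, coef = "AvsC")
```

In practice `voom()` and `duplicateCorrelation()` are often run a second time, passing the estimated correlation back into `voom()`, before calling `lmFit()`; the single pass above just keeps the sketch short.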


Thanks Gordon.

Please correct me if I'm wrong, but I think a potential problem with this approach is that batch is perfectly confounded with condition so it would not be possible to fit a model of the kind ~ batch + condition.

This is why I would like to use the technical replicates profiled across batches to correct for the batch effect so that then I'd be able to fit a model of the kind ~ condition.

Would using duplicateCorrelation() and fitting a ~condition model be appropriate here?

Reply written 3 months ago by enricoferrero

No, batch is not perfectly confounded with condition, because condition B is in both batches and, more than that, the exact same samples are in both batches. Presumably, the whole purpose of repeating the condition B samples was to deconfound the batches.
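
One quick way to check this point is to look at the rank of the design matrix. The sketch below uses base R only and rebuilds the sample layout from the question; the variable names (`full`, `noB`, `keep`) are hypothetical. A design is estimable when its rank equals its number of columns.

```r
# Sample layout from the question: 40 A + 12 B in batch 1, 40 C + 12 B in batch 2.
condition <- factor(c(rep("A", 40), rep("B", 12), rep("C", 40), rep("B", 12)))
batch     <- factor(c(rep(1, 52), rep(2, 52)))

# With the B samples present, ~ batch + condition has full column rank.
full <- model.matrix(~ batch + condition)
full_ok <- qr(full)$rank == ncol(full)

# Dropping the B samples makes batch and condition identical columns,
# so the same model becomes rank deficient.
keep <- condition != "B"
noB <- model.matrix(~ batch + condition,
                    data = data.frame(batch = batch[keep],
                                      condition = droplevels(condition[keep])))
noB_deficient <- qr(noB)$rank < ncol(noB)

full_ok        # TRUE: batch and condition can be separated
noB_deficient  # TRUE: without B the two effects cannot be separated
```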

What I have suggested to you does exactly what you say you want to do: it uses the technical replicates to do the batch correction, but in an organic way rather than ad hoc.

Reply written 3 months ago by Gordon Smyth

Awesome, thanks Gordon. I will do as you suggest and report back if I run into problems.

Reply written 3 months ago by enricoferrero