Question: RNA-seq batch correction using technical replicates profiled across batches
4 weeks ago by enricoferrero (570), Switzerland:

Hello,

I have the following (certainly not ideal) RNA-seq experimental design:

• Batch 1 contains 40 samples with condition A + 12 samples with condition B
• Batch 2 contains 40 samples with condition C + the same 12 samples with condition B

So, the 12 samples with condition B are technical replicates that have been profiled across the two batches.

I'm actually not interested in condition B; I need to compare the 40 samples with condition A from batch 1 with the 40 samples with condition C from batch 2.

Are there Bioconductor packages (or other methods/approaches) that will allow me to use the 12 technical replicates profiled across batches to correct for batch effects before performing a differential expression analysis?

I already came across RUVSeq (see this question) and I'm looking for alternative approaches.

Thank you!

Tags: limma, edger, sva, deseq2, ruvseq
Answer: RNA-seq batch correction using technical replicates profiled across batches
4 weeks ago by Gordon Smyth (36k), Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia:

I would use limma for an experiment like this. I would analyse all the samples together, including a batch effect term in the linear model and using duplicateCorrelation() to link the technical replicates of the same samples. (The duplicateCorrelation block variable is the same as the sample ID.) Differential expression can then be done normally between A and C and the batch correction will happen automatically.

The above model gets around the fact that conditions A and C do not occur together in the same batch. Condition A is compared with condition B within batch 1 and condition C is compared with B within batch 2. The difference between A and C, which is what you eventually want, is inferred from A - B as compared to C - B.
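
A minimal sketch of the analysis described above might look like the following. Several things here are assumptions that do not come from the thread: the counts are simulated, the `targets` column names are hypothetical, and voom is used to bring the counts into limma (the answer only says to use limma, so voom is one possible route, not necessarily the one intended).

```r
library(limma)
library(edgeR)

set.seed(1)

# Hypothetical sample sheet matching the design in the question:
# batch 1 = 40 A + 12 B, batch 2 = 40 C + the same 12 B samples.
targets <- data.frame(
  sample_id = c(paste0("A", 1:40), paste0("B", 1:12),
                paste0("C", 1:40), paste0("B", 1:12)),
  condition = rep(c("A", "B", "C", "B"), times = c(40, 12, 40, 12)),
  batch     = rep(c("1", "2"), each = 52)
)

# Simulated counts, for illustration only (no real signal).
n_genes <- 500
counts <- matrix(rnbinom(n_genes * nrow(targets), mu = 100, size = 10),
                 nrow = n_genes,
                 dimnames = list(paste0("gene", seq_len(n_genes)), NULL))

# Linear model with a batch term, as suggested in the answer.
condition <- factor(targets$condition, levels = c("A", "B", "C"))
batch <- factor(targets$batch)
design <- model.matrix(~ 0 + condition + batch)
colnames(design) <- c("A", "B", "C", "batch2")

dge <- DGEList(counts = counts)
keep <- filterByExpr(dge, design)
dge <- dge[keep, , keep.lib.sizes = FALSE]
dge <- calcNormFactors(dge)

# The block variable is the sample ID, so the two libraries of each
# condition B sample are treated as correlated.
v <- voom(dge, design)
corfit <- duplicateCorrelation(v, design, block = targets$sample_id)
fit <- lmFit(v, design, block = targets$sample_id,
             correlation = corfit$consensus.correlation)

# A versus C, adjusted for batch through the shared B samples.
contrast <- makeContrasts(AvsC = A - C, levels = design)
fit2 <- eBayes(contrasts.fit(fit, contrast))
top <- topTable(fit2, coef = "AvsC", number = 10)
```

With real data, `counts` and `targets` would be replaced by the actual count matrix and sample sheet. Some workflows repeat the voom and duplicateCorrelation() steps a second time so that the weights take the estimated correlation into account.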

Thanks Gordon.

Please correct me if I'm wrong, but I think a potential problem with this approach is that batch is perfectly confounded with condition so it would not be possible to fit a model of the kind ~ batch + condition.

This is why I would like to use the technical replicates profiled across batches to correct for the batch effect so that then I'd be able to fit a model of the kind ~ condition.

Would using duplicateCorrelation() and fitting a ~condition model be appropriate here?


No, batch is not perfectly confounded with condition, because condition B is in both batches and, more than that, the exact same samples are in both batches. Presumably, the whole purpose of repeating the condition B samples was to deconfound the batches.
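
The point about confounding can be checked with base R alone. The sketch below rebuilds the design from the question (the variable names are illustrative) and compares the rank of the design matrix with and without the condition B samples; it is a quick sanity check under those assumptions, not part of the original answer.

```r
# Design from the question: batch 1 = 40 A + 12 B, batch 2 = 40 C + 12 B.
condition <- factor(rep(c("A", "B", "C", "B"), times = c(40, 12, 40, 12)))
batch <- factor(rep(c("1", "2"), each = 52))

# With condition B present in both batches the design is full rank,
# so the batch coefficient and the A - C contrast are both estimable.
design <- model.matrix(~ 0 + condition + batch)
full_rank <- qr(design)$rank == ncol(design)

# Dropping the B samples makes the batch column identical to the
# condition C column, so the design loses rank.
no_b <- condition != "B"
design_no_b <- model.matrix(~ 0 + droplevels(condition[no_b]) +
                              droplevels(batch[no_b]))
rank_no_b <- qr(design_no_b)$rank

print(full_rank)
print(c(rank = rank_no_b, columns = ncol(design_no_b)))
```

A rank lower than the number of columns in the second case is what perfect confounding looks like; the shared B samples are what prevent it in the full design.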

What I have suggested to you does exactly what you say you want to do -- it uses the technical replicates to do the batch correction, but in an organic way rather than ad hoc.