Hi,
I am doing a DEG analysis on data from an RNA-seq experiment.
I have two technical replicates originating from different libraries of the same sample.
How should I deal with them? Do I have to merge the BAM files before the DEG analysis?
Sara
In general, if you have technical replicates that were run in consistent batches (e.g., you ran all replicates in one batch, then decided you didn't have enough depth, so ran all the samples again), you can just sum the counts/gene for the replicates and then do the analysis. There probably isn't much to be gained by keeping them separate.
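To make "sum the counts/gene" concrete, here is a minimal sketch in plain Python. The function name, the dict-of-lists layout and the sample labels are all made up for illustration; in practice the same operation would be done on the count matrix with whatever tool you already use.

```python
def sum_tech_reps(counts, sample_ids):
    """Pool technical replicates by summing their counts per gene.

    counts:     dict mapping gene -> list of counts, one per library (hypothetical layout)
    sample_ids: list naming the sample each library came from;
                libraries sharing an ID are treated as technical replicates
    Returns (pooled_counts, pooled_ids), with samples in first-seen order.
    """
    pooled_ids = list(dict.fromkeys(sample_ids))  # unique IDs, order preserved
    column = {sid: k for k, sid in enumerate(pooled_ids)}
    pooled = {}
    for gene, row in counts.items():
        if len(row) != len(sample_ids):
            raise ValueError(f"{gene}: expected {len(sample_ids)} counts, got {len(row)}")
        out = [0] * len(pooled_ids)
        for value, sid in zip(row, sample_ids):
            out[column[sid]] += value  # add each library to its sample's column
        pooled[gene] = out
    return pooled, pooled_ids


# Toy data: libraries 1 and 2 are technical replicates of sample S1.
counts = {"geneA": [10, 12, 30], "geneB": [0, 1, 5]}
libraries = ["S1", "S1", "S2"]
pooled, ids = sum_tech_reps(counts, libraries)
print(ids, pooled)
```

The pooled matrix then has one column per sample and goes into the DE analysis as usual.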
Thank you.
However I have a technical replicate only for one of the 12 samples. Do you think this is a problem?
I think it depends, doesn't it? Say the first set of samples were run 6 months ago, and then recently they decided to 'bump up' one of the samples with an additional run.
Given that there is some evidence for batch effects due to different library preps, different lanes, different sequencers, etc, the technical replicate might be sufficiently different that one might not want to simply combine. Or maybe it's fine. But I would want to take a closer look before combining if it's just one of the samples.
Yes, that's right in general. But you've already warned OP that the tech reps should be consistent, and Aaron has already advised OP how to check whether that is so. This is a followup question by OP about whether a technical replicate is a problem just because there is only one. And the answer to that is no.
Anyway, in a case like this, batch correction is impossible, so the only choices are to throw out one of the tech replicates or to pool them. I think the default action would be to pool unless there is a definite reason to suspect that one of the tech reps is much less reliable than the other.
I agree with James; summing the counts is what we usually do for technical replicates. In fact, there's a function named sumTechReps for this purpose in edgeR. If you want to be sure of the technical nature of these replicates, you could estimate the negative binomial dispersion between them; the estimate should be pretty close to zero if the counts follow a Poisson sampling distribution (see Marioni et al.'s 2008 Genome Research paper).
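As a rough illustration of the Poisson check, here is a sketch in plain Python. It is not edgeR's dispersion estimator; it swaps in a simpler overdispersion index. The idea: if both libraries are Poisson samples of the same material, then, given a gene's total count t, the count in the first library is binomial with size t and probability p = lib1 / (lib1 + lib2). The average chi-square statistic per gene should then be about 1, and values well above 1 suggest extra-Poisson variation. The function name and the toy counts are invented for the example.

```python
def poisson_dispersion_index(rep1, rep2):
    """Mean per-gene chi-square statistic for two technical replicates.

    rep1, rep2: lists of counts for the same genes, in the same order.
    Returns roughly 1 under pure Poisson sampling; much larger values
    point to variation beyond Poisson noise.
    """
    if len(rep1) != len(rep2):
        raise ValueError("replicates must cover the same genes")
    lib1, lib2 = sum(rep1), sum(rep2)
    if lib1 == 0 or lib2 == 0:
        raise ValueError("each replicate needs a non-zero library size")
    p = lib1 / (lib1 + lib2)  # expected share of reads falling in replicate 1
    stat, n_genes = 0.0, 0
    for y1, y2 in zip(rep1, rep2):
        total = y1 + y2
        if total == 0:
            continue  # genes with no reads carry no information
        expected = total * p
        variance = total * p * (1 - p)  # binomial variance given the total
        stat += (y1 - expected) ** 2 / variance
        n_genes += 1
    if n_genes == 0:
        raise ValueError("no genes with non-zero counts")
    return stat / n_genes


# Toy example: identical replicates versus wildly inconsistent ones.
same = poisson_dispersion_index([10, 20, 30], [10, 20, 30])
different = poisson_dispersion_index([100, 0], [0, 100])
print(same, different)
```

On real data, genes with very low totals make the statistic noisy, so it may be worth restricting the check to reasonably expressed genes.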