limma voom - time course experiment - multi level experiment
SPbeginner • 0
@spbeginner-15170
Last seen 5.7 years ago

Dear Bioconductor community,

I'm analyzing RNA-seq data with limma-voom.

This experiment involves 6 subjects: 3 animals that died after infection with a virus and 3 animals that survived.

From each subject, blood was collected before any treatment (D0) and at several time points after infection (since the animals did not all die on the same day, some time points are missing for some animals). Day 3 is the only time point with data for both the dead and the surviving animals, so I would like to identify genes that respond differently at Day 3 in the surviving animals relative to the dead ones.

The consensus correlation is very low (0.18). Should I perform the analysis without duplicateCorrelation?

What is the best way to analyse such data ?

Thanks in advance for your help

limma duplicatecorrelation limma voom • 1.5k views
Aaron Lun ★ 28k
@alun
Last seen 1 hour ago
The city by the bay

Your current approach seems sensible to me. A low correlation is not a problem - it just means that your batch effect is weak, which is a good thing. Indeed, as the correlation gets smaller and smaller, your results with duplicateCorrelation should converge to what you would get if you didn't use duplicateCorrelation at all. For example:

set.seed(12345)
a <- matrix(rnorm(100000), ncol=10)  # 10000 genes x 10 samples of pure noise
group <- gl(2, 5)                    # two groups of 5 samples
batch <- gl(5, 2)                    # blocking factor: two samples per batch

library(limma)
design <- model.matrix(~group)

# Ordinary fit, ignoring the blocking factor
fit1 <- lmFit(a, design)
fit1 <- eBayes(fit1)
topTable(fit1)

# Blocked fit with the correlation forced to zero
fit2 <- lmFit(a, design, correlation=0, block=batch)
fit2 <- eBayes(fit2)
topTable(fit2) # should be the same as above.

So you can see that having "too low" a correlation poses no danger to the analysis with duplicateCorrelation, because it will automatically converge to the analysis without it. The only real disadvantage is that it takes longer to run, but with 6 samples this is not really an issue. Besides, it is quite difficult to tell whether a correlation of 0.18 is big or small in terms of its ultimate effect on the p-values, so you might as well take it into account if you can.
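For reference, the standard multi-level workflow is to estimate the consensus correlation once with duplicateCorrelation() and then pass it back to lmFit(). A minimal sketch on simulated data (the object names below are placeholders for illustration, not from the original post):

```r
library(limma)

# Simulated data: 1000 genes x 10 samples of pure noise
set.seed(12345)
y <- matrix(rnorm(10000), ncol=10)

group  <- gl(2, 5)   # e.g. alive vs dead
animal <- gl(5, 2)   # blocking factor: two samples per animal

design <- model.matrix(~group)

# Estimate the consensus correlation between samples from the same animal
corfit <- duplicateCorrelation(y, design, block=animal)
corfit$consensus

# Feed the estimate back into the linear model fit
fit <- lmFit(y, design, block=animal, correlation=corfit$consensus)
fit <- eBayes(fit)
topTable(fit, coef="group2")
```

On noise like this the consensus correlation will be close to zero, so the results will be essentially identical to an unblocked fit, as described above.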


Thank you so much for your explanation, it's much clearer now.

In my case, duplicateCorrelation() blocking on animal estimates the correlation between measurements made on the same animal, and this is why we do not expect a high value (given the experimental conditions), is that right?


I don't understand your last question. From the description of your experimental design, I have no prior expectations whatsoever about the size of the correlation. If you have high animal-to-animal variability but a consistent response to time in each animal, then the consensus correlation will be high, as most of the variance in the data will be driven by the animal effect. Otherwise, the consensus correlation will be low; it just depends on how consistently your animals behave.
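The point about animal-to-animal variability is easy to demonstrate by simulation: if each animal gets a strong shift shared by both of its samples, the consensus correlation comes out high. A hypothetical sketch (the numbers are chosen purely for illustration):

```r
library(limma)

set.seed(12345)
animal <- gl(5, 2)   # two samples per animal

# Per-animal shifts shared by both samples of each animal (sd=2),
# plus independent noise (sd=1): true within-animal correlation = 4/5
shift <- matrix(rnorm(1000*5, sd=2), ncol=5)
y <- matrix(rnorm(10000), ncol=10) + shift[, as.integer(animal)]

design <- matrix(1, nrow=10, ncol=1)   # intercept-only design
corfit <- duplicateCorrelation(y, design, block=animal)
corfit$consensus   # roughly 0.8, since the animal effect dominates
```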

@gordon-smyth
Last seen 3 hours ago
WEHI, Melbourne, Australia

0.18 is not at all a low correlation in this context; in fact it is the sort of correlation one might expect for this sort of analysis. It will have an effect on the p-values, so it isn't ignorable.

A low correlation would be something less than 0.01. As Aaron points out, it would still be fine to use duplicateCorrelation unless the correlation was actually negative.

Very high correlations (> 0.5) are unusual in this context. In such a case, I would consider using a blocked analysis instead.
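A blocked analysis here would mean putting animal into the design matrix as a fixed effect rather than estimating a correlation. Note that this only works for comparisons made within animals (such as time contrasts); a between-animal factor like survival is confounded with the animal effect and cannot be estimated in such a design. A hypothetical sketch with placeholder names:

```r
library(limma)

# Simulated data: 1000 genes, 5 animals each measured at D0 and D3
set.seed(12345)
y <- matrix(rnorm(10000), ncol=10)
animal <- gl(5, 2)
time <- factor(rep(c("D0", "D3"), 5))

# Animal is a fixed blocking factor; the "timeD3" coefficient
# is the within-animal D3 vs D0 comparison
design <- model.matrix(~animal + time)

fit <- eBayes(lmFit(y, design))
topTable(fit, coef="timeD3")
```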
