limma: comparison of repeat measurements - blocking or within-subject correlation?
Jane
@jkhudyakov-23010
United States

I am using limma to identify proteins that are differentially expressed in a tissue collected from four subjects at two different stages. I wanted to account for repeated sampling from the same individual.

I tried 1) blocking by subject (p. 43-44 of limma manual) and 2) computing within-subject correlation and including it in the model (p. 111 of limma manual). The consensus duplicate correlation for my dataset was 0.364, which seemed fairly high. I found 102 differentially expressed proteins (DEPs) using approach 1 and 124 DEPs using approach 2, of which 101 were shared. That leaves 1 DEP unique to approach 1 and 23 DEPs unique to approach 2.

Since the results are so similar, which approach is the right one? Modelling the within-subject correlation seems to slightly increase statistical power. Any insights would be much appreciated.

For reference, here is the code I used:

Approach 1

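design <- model.matrix(~0+Subject+Stage, data=expset)
fit <- lmFit(expset, design)
# Subject takes all four indicator columns, so Stage contributes a single
# coefficient (late vs. early). "Stagelate" assumes Stage has levels "early" and "late".
cm <- makeContrasts(LatevEarly=Stagelate, levels=design)
fit2 <- contrasts.fit(fit, cm)
fit2 <- eBayes(fit2, trend=TRUE, robust=TRUE)
topTable(fit2, adjust="BH", p.value=0.05)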

Approach 2

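design <- model.matrix(~0+Stage, data=expset)
# Strip the "Stage" prefix so the columns are named "early" and "late"
# (assumed level names), as the contrast below expects
colnames(design) <- sub("^Stage", "", colnames(design))
corfit <- duplicateCorrelation(expset, design, block=expset$Subject)
fit <- lmFit(expset, design, block=expset$Subject, correlation=corfit$consensus.correlation)
cm <- makeContrasts(LatevEarly=late-early, levels=design)
fit2 <- contrasts.fit(fit, cm)
fit2 <- eBayes(fit2, trend=TRUE, robust=TRUE)
topTable(fit2, adjust="BH", p.value=0.05)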
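
The two formulas above give differently named design columns, which matters because `makeContrasts` can only refer to names that exist in `colnames(design)`. A minimal base-R sketch follows. It uses a hypothetical sample table (`targets`, with subject labels `S1` to `S4` and stage levels `early` and `late`), not the actual data from this question:

```r
# Hypothetical sample table: 4 subjects, each sampled at two stages
targets <- data.frame(
  Subject = factor(rep(c("S1", "S2", "S3", "S4"), each = 2)),
  Stage   = factor(rep(c("early", "late"), times = 4),
                   levels = c("early", "late"))
)

# Approach 1: Subject is listed first and gets one indicator column per level,
# so Stage is reduced to a single treatment-contrast column
design1 <- model.matrix(~0 + Subject + Stage, data = targets)
colnames(design1)
# "SubjectS1" "SubjectS2" "SubjectS3" "SubjectS4" "Stagelate"

# Approach 2: Stage alone gets one column per level
design2 <- model.matrix(~0 + Stage, data = targets)
colnames(design2)
# "Stageearly" "Stagelate"
```

In the first design the late-versus-early effect is the `Stagelate` coefficient itself; in the second it is the difference between the two stage columns.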
@gordon-smyth
WEHI, Melbourne, Australia

Without knowing the details of your experiment, it appears that both approaches are valid for it. Putting the block effect in the design matrix is safer when there are large differences between the blocks. The duplicateCorrelation approach is better when the blocks are very unbalanced or are confounded with treatments. However, there is a large area of overlap, i.e., there are many experiments for which both approaches are valid and give similar results. In these circumstances, I tend to give preference to the design matrix approach because it is more conservative.
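
To see how the two approaches compare on a balanced paired layout like the one in the question, here is a rough sketch on simulated data. Everything in it is an assumption made for illustration: the matrix `y`, the effect sizes, the 500 proteins and the sample labels are invented, and it is not a reanalysis of the original dataset.

```r
library(limma)
set.seed(1)

# Hypothetical layout: 4 subjects, each measured at an early and a late stage
subject <- factor(rep(paste0("S", 1:4), each = 2))
stage   <- factor(rep(c("early", "late"), times = 4), levels = c("early", "late"))

# Simulated log2 abundances: 500 proteins, with a random shift per subject
n_prot <- 500
subj_effect <- matrix(rnorm(n_prot * 4, sd = 0.5), n_prot, 4)[, as.integer(subject)]
y <- matrix(rnorm(n_prot * 8, mean = 20, sd = 0.5), n_prot, 8) + subj_effect
y[1:25, stage == "late"] <- y[1:25, stage == "late"] + 1.5  # 25 proteins truly change
rownames(y) <- paste0("P", seq_len(n_prot))

# Approach 1: subject as a fixed block in the design matrix
design1 <- model.matrix(~0 + subject + stage)
fit1 <- eBayes(lmFit(y, design1), trend = TRUE, robust = TRUE)
tab1 <- topTable(fit1, coef = "stagelate", number = Inf, sort.by = "none")

# Approach 2: subject as a random effect via duplicateCorrelation
design2 <- model.matrix(~0 + stage)
colnames(design2) <- levels(stage)
corfit <- duplicateCorrelation(y, design2, block = subject)
fit2 <- lmFit(y, design2, block = subject,
              correlation = corfit$consensus.correlation)
cm <- makeContrasts(LatevEarly = late - early, levels = design2)
fit2 <- eBayes(contrasts.fit(fit2, cm), trend = TRUE, robust = TRUE)
tab2 <- topTable(fit2, coef = "LatevEarly", number = Inf, sort.by = "none")

# Estimated within-subject correlation, and agreement of the two DE calls
corfit$consensus.correlation
table(fixed_block = tab1$adj.P.Val < 0.05, dup_cor = tab2$adj.P.Val < 0.05)
```

When every subject contributes exactly one sample per stage, the estimated log fold changes are the same under both models; the approaches differ in the standard errors and residual degrees of freedom, which is where any difference in the number of significant proteins comes from.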


Thank you, Gordon! I will go with the more conservative approach.

@mikhaelmanurung-17423
Netherlands

Using duplicateCorrelation would be the better choice compared to blocking. Are you analysing microarray or RNA-Seq data? If it is RNA-Seq then you are missing a few steps prior to `duplicateCorrelation`. See this post as a reference https://support.bioconductor.org/p/59700/.

Note that duplicateCorrelation should be calculated twice as in https://support.bioconductor.org/p/114663/.
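
For RNA-Seq counts analysed with `voom`, the "calculated twice" pattern is usually written as below. This is a sketch on simulated counts, not code taken from the linked posts: the count matrix, the sample labels and the simulation parameters are all hypothetical, and it does not apply to log2 protein abundances, which need no `voom` step.

```r
library(limma)
set.seed(2)

# Hypothetical layout: 4 subjects x 2 stages
subject <- factor(rep(paste0("S", 1:4), each = 2))
stage   <- factor(rep(c("early", "late"), times = 4), levels = c("early", "late"))
design  <- model.matrix(~0 + stage)
colnames(design) <- levels(stage)

# Simulated counts for 1000 genes with a per-subject effect on the mean
mu <- 100 * exp(matrix(rnorm(1000 * 4, sd = 0.3), 1000, 4)[, as.integer(subject)])
counts <- matrix(rnbinom(length(mu), mu = mu, size = 5), nrow = 1000)
rownames(counts) <- paste0("G", 1:1000)

# First pass: voom weights without the correlation, then estimate the correlation
v <- voom(counts, design)
corfit <- duplicateCorrelation(v, design, block = subject)

# Second pass: recompute the weights using the correlation, then re-estimate it
v <- voom(counts, design, block = subject,
          correlation = corfit$consensus.correlation)
corfit <- duplicateCorrelation(v, design, block = subject)

# Final fit uses the second-pass weights and correlation
fit <- lmFit(v, design, block = subject,
             correlation = corfit$consensus.correlation)
```

The second pass is there because the `voom` weights and the correlation estimate depend on each other; in practice the correlation usually changes only slightly between passes.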


I am analyzing shotgun proteomics data and using log2-transformed protein abundance values.
