Question

limma: comparison of repeat measurements - blocking or within-subject correlation?

Jane ▴ 10

@jkhudyakov-23010

Last seen 2.5 years ago

United States

I am using limma to identify proteins that are differentially expressed in a tissue collected from four subjects at two different stages. I wanted to account for repeated sampling from the same individual.

I tried two approaches: (1) blocking by subject (pp. 43-44 of the limma manual) and (2) estimating the within-subject correlation with duplicateCorrelation and including it in the model (p. 111 of the limma manual). The consensus duplicate correlation for my dataset was 0.364, which seemed fairly high. I found 102 differentially expressed proteins (DEPs) using approach 1 and 124 DEPs using approach 2, of which 101 were shared. Approach 1 therefore identified 1 DEP not found by approach 2, and approach 2 identified 23 DEPs not found by approach 1.

Since the results are so similar, which is the right approach? Correcting for within-subject correlation seems to slightly increase statistical power. Any insights would be much appreciated.

For reference, here is the code I used:

Approach 1

# Block on subject by including it as a fixed effect in the design matrix
design <- model.matrix(~0+Subject+Stage, data=expset)
fit <- lmFit(expset, design)
# makeContrasts requires late and early to be column names of design
cm <- makeContrasts(LatevEarly=late-early, levels=design)
fit2 <- contrasts.fit(fit, cm)
fit2 <- eBayes(fit2, trend=TRUE, robust=TRUE)
topTable(fit2, adjust="BH", p.value=0.05)

Approach 2

# Model stage only; subject is handled through the within-subject correlation
design <- model.matrix(~0+Stage, data=expset)
# Estimate the consensus correlation between samples from the same subject
corfit <- duplicateCorrelation(expset, design, block=expset$Subject)
fit <- lmFit(expset, design, block=expset$Subject, correlation=corfit$consensus)
cm <- makeContrasts(LatevEarly=late-early, levels=design)
fit2 <- contrasts.fit(fit, cm)
fit2 <- eBayes(fit2, trend=TRUE, robust=TRUE)
topTable(fit2, adjust="BH", p.value=0.05)
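
As a rough illustration of what the consensus value in Approach 2 summarises: the limma documentation describes the consensus correlation as a trimmed mean of the per-feature correlations on the atanh scale. The base-R sketch below mimics only that averaging step. The per-protein correlations are made-up numbers, `consensus_correlation` is a hypothetical helper (not a limma function), and the default trim of 0.15 is taken from the `duplicateCorrelation` documentation.

```r
# Hypothetical per-protein within-subject correlations (illustrative values only)
rho <- c(0.10, 0.25, 0.30, 0.36, 0.40, 0.45, 0.55, 0.90)

# Hypothetical helper: average on the atanh (Fisher z) scale with a trimmed
# mean, then back-transform to the correlation scale with tanh.
consensus_correlation <- function(rho, trim = 0.15) {
  tanh(mean(atanh(rho), trim = trim, na.rm = TRUE))
}

consensus <- consensus_correlation(rho)
print(consensus)
```

Averaging on the atanh scale keeps the result inside (-1, 1), and the trimming limits the influence of a few proteins with extreme estimates.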

limma repeated measurements blocking random effects • 3.4k views

ADD COMMENT • link updated 5.8 years ago by Gordon Smyth 53k • written 5.8 years ago by Jane ▴ 10

score 2 · Answer 1 · 2020-03-01

Gordon Smyth 53k

@gordon-smyth

Last seen 2 hours ago

WEHI, Melbourne, Australia

Without knowing the details of your experiment, it would appear that your experiment is such that both approaches are valid. Putting the block effect in the design matrix is safer when there are large differences between the blocks. The duplicateCorrelation approach is better when the blocks are very unbalanced or are confounded with treatments. However, there is a large area of overlap, i.e., there are many experiments for which both approaches are valid and give similar results. In these circumstances, I tend to give preference to the design matrix approach because it is more conservative.
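
To make the design matrix approach concrete, here is a minimal base-R sketch for a single simulated protein. It assumes a complete, balanced layout (4 subjects, each measured at both stages) and uses made-up data and variable names. In that setting, putting subject in the model as a fixed block effect gives the same ordinary t-statistic for the stage effect as a paired t-test. limma's `eBayes` would then moderate this statistic across proteins, which the sketch does not attempt.

```r
set.seed(1)

# Simulated log2 abundances for ONE hypothetical protein: 4 subjects x 2 stages
subject <- factor(rep(paste0("S", 1:4), times = 2))
stage <- factor(rep(c("early", "late"), each = 4), levels = c("early", "late"))
subject_effect <- rnorm(4, sd = 1)  # baseline differences between subjects
y <- 20 + subject_effect[as.integer(subject)] +
  0.8 * (stage == "late") + rnorm(8, sd = 0.3)

# Subject as a fixed block effect: the stage coefficient is late minus early
fit <- lm(y ~ subject + stage)
t_block <- coef(summary(fit))["stagelate", "t value"]

# Paired t-test on the same data (late minus early within each subject)
t_paired <- unname(
  t.test(y[stage == "late"], y[stage == "early"], paired = TRUE)$statistic
)

print(c(block = t_block, paired = t_paired))
```

Both statistics use 3 residual degrees of freedom here, since each subject's baseline is estimated and removed. That is one way to see why this approach is safe when subjects differ a lot from one another.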

ADD COMMENT • link 5.8 years ago Gordon Smyth 53k

Thank you, Gordon! I will go with the more conservative approach.

ADD REPLY • link 5.8 years ago Jane ▴ 10

score 0 · Answer 2 · 2020-03-01

mikhael.manurung ▴ 280

@mikhaelmanurung-17423

Last seen 3.5 years ago

Netherlands

Using duplicateCorrelation would be the better choice compared to blocking. Are you analysing microarray or RNA-Seq data? If it is RNA-Seq, then you are missing a few steps prior to `duplicateCorrelation`. See this post for reference: https://support.bioconductor.org/p/59700/.

Note that duplicateCorrelation should be calculated twice as in https://support.bioconductor.org/p/114663/.

ADD COMMENT • link 5.8 years ago mikhael.manurung ▴ 280

I am analyzing shotgun proteomics data and using log2-transformed protein abundance values.

ADD REPLY • link 5.8 years ago Jane ▴ 10