**120**wrote:

Hi,

I would like to use the duplicateCorrelation function along with limma and voom in order to analyze some RNA-seq data with blocking and group-means parametrization (similar to example 9.7 in the limma manual). The manual does not contain a similar example using the voom function for RNA-seq data. I searched the Bioconductor mailing list

(query<https: encrypted.google.com="" search?hl="en&q=voom%20site%3Ahttps" %3a%2f%2fstat.ethz.ch%2fpipermail%2fbioconductor%2f#hl="en&q=voom+dupli" catecorrelation+site:https:%2f%2fstat.ethz.ch%2fpipermail%2fbioconduct="" or%2f="">)

and BioStars

(query<https: encrypted.google.com="" search?hl="en&q=voom%2" 0duplicatecorrelation%20site%3abiostars.org="">)

and found conflicting advice.

The first approach performs voom only once, prior to running duplicateCorrelation, e.g. this Bioc post

https://stat.ethz.ch/pipermail/bioconductor/2013-July/054087.html

and this Biostars post

https://www.biostars.org/p/54565

The second approach performs voom twice, both before and after running duplicateCorrelation, e.g. this Bioc post

https://stat.ethz.ch/pipermail/bioconductor/attachments/20130530/4dcc9475/attachment.pl

and this Biostars post

https://www.biostars.org/p/96889

Is there a consensus on which is the better method? Does the choice depend on the study design?

In both my actual data and in the simulated example below, I have found the results to be very similar. Thus I am currently leaning towards using the single voom method since it achieves similar results with fewer steps. The simulated example is not very interesting because blocking does not have much effect in the first place (corfit$consensus = -0.002984798), as expected from random data. However, in my data set it does make a difference (corfit$consensus = 0.3433403) and the results of using single or double voom are still similar.

Thanks,

John

library("limma") library("edgeR") # Simulate data set.seed(123) counts <- matrix(rpois(9000, lambda = 500), ncol = 90) rownames(counts) <- paste0("g_", 1:nrow(counts)) anno <- data.frame(block = rep(paste0("block", 1:6), each = 15), fac1 = rep(c("A", "B", "C"), 30), fac2 = rep(paste0("treat", 1:5), 18)) combined_fac <- factor(paste(anno$fac1, anno$fac2, sep = ".")) design <- model.matrix(~0 + combined_fac) colnames(design) <- levels(combined_fac) # Single voom y <- DGEList(counts) y <- calcNormFactors(y) v <- voom(y, design) corfit <- duplicateCorrelation(v, design, block = anno$block) fit <- lmFit(v, design, block = anno$block, correlation = corfit$consensus) cont_matrix <- makeContrasts(AvB_1 = A.treat1 - B.treat1, AvB_2 = A.treat2 - B.treat2, levels = design) fit2 <- contrasts.fit(fit, cont_matrix) fit2 <- eBayes(fit2) result_single <- topTable(fit2, number = nrow(counts), sort.by = "none") # Double voom y <- DGEList(counts) y <- calcNormFactors(y) v <- voom(y, design) corfit <- duplicateCorrelation(v, design, block = anno$block) v <- voom(y, design, block = anno$block, correlation = corfit$consensus) fit <- lmFit(v, design, block = anno$block, correlation = corfit$consensus) cont_matrix <- makeContrasts(AvB_1 = A.treat1 - B.treat1, AvB_2 = A.treat2 - B.treat2, levels = design) fit2 <- contrasts.fit(fit, cont_matrix) fit2 <- eBayes(fit2) result_double <- topTable(fit2, number = nrow(counts), sort.by = "none") # No blocking y <- DGEList(counts) y <- calcNormFactors(y) v <- voom(y, design) fit <- lmFit(v, design) cont_matrix <- makeContrasts(AvB_1 = A.treat1 - B.treat1, AvB_2 = A.treat2 - B.treat2, levels = design) fit2 <- contrasts.fit(fit, cont_matrix) fit2 <- eBayes(fit2) result_no_block <- topTable(fit2, number = nrow(counts), sort.by = "none") # Comparison - all correlations > 0.99 cor(result_single$adj.P.Val, result_double$adj.P.Val) cor(result_single$adj.P.Val, result_no_block$adj.P.Val) cor(result_double$adj.P.Val, result_no_block$adj.P.Val)

**32k**• written 3.5 years ago by John Blischak •

**120**