Experimental design

Question

Multi-level design RRBS analysis with voomLmFit

1

Entering edit mode

altintas.ali • 0

@altintasali-11054

Last seen 29 days ago

Denmark

Hi all,

I have a multi-level design for an RRBS experiment, which is identical to the multi-level design mentioned in Section 9.7 of limma's User Guide.

Experimental design

# Experimental design
meta <- data.frame(library = as.factor(1:12),
                   patient = as.factor(rep(1:6, each = 2)),
                   disease = rep(c("healthy", "sick"), each = 6),
                   tissue = rep(c("liver","muscle"), times = 6))
meta$group <- paste(meta$disease, meta$tissue, sep = ".")

#  library patient disease tissue          group
#        1       1 healthy  liver  healthy.liver
#        2       1 healthy muscle healthy.muscle
#        3       2 healthy  liver  healthy.liver
#        4       2 healthy muscle healthy.muscle
#        5       3 healthy  liver  healthy.liver
#        6       3 healthy muscle healthy.muscle
#        7       4    sick  liver     sick.liver
#        8       4    sick muscle    sick.muscle
#        9       5    sick  liver     sick.liver
#       10       5    sick muscle    sick.muscle
#       11       6    sick  liver     sick.liver
#       12       6    sick muscle    sick.muscle

I was considering using voomLmFit (or alternatively voom together with duplicateCorrelation) in order to make comparisons across tissues and disease groups while blocking for patient IDs.

(I must admit that I got the idea of using voom from the DMRcate package as it internally uses voom).

Question 1: Do you think if it is ok to use voomLmFit (or voom together with duplicateCorrelation) for multi-level design RRBS analysis?

Well if the answer is no, then my code and questions below do not make sense, but let me give it a try anyways...

Toy dataset

Here is the toy data set for the RRBS experiment. Please note that _Me and _Un stands for "methylated" and "unmethylated" counts, respectively.

set.seed(123)
counts <- matrix(data = round(runif(n = 24000, min = 0, max = 100)),
                 ncol = nrow(meta)*2)
cols <- expand.grid(c("Me","Un"), meta$library)
colnames(counts) <- paste(cols[,2], cols[,1], sep = "_")
rownames(counts) <- paste("CpG", 1:nrow(counts), sep = "_")
head(counts)

Differentially methylated CpGs

Below is how I set up the parameters for voomLmFit.

library(edgeR)
design <- modelMatrixMeth(~0 + group, meta)
colnames(design) <- gsub("group", "", colnames(design))

v <- voomLmFit(counts = counts, 
               design = design, 
               block = rep(meta$patient, each = 2),
               sample.weights = TRUE)

Please not that I needed to use block = rep(meta$patient, each = 2) to match the blocking factor (meta$patient) for the count matrix (count). Also, I have used sample.weights = TRUE, which should be equivalent of using voomWithQualityWeights.

Question 2: Do the parameters make sense for voomLmFit especially considering that I needed to use block = rep(meta$patient, each = 2).

voomLmfit step is followed by contrast.fit and eBayes afterwards:

contrasts <- makeContrasts(
    healthyVSsick_liver = healthy.liver - sick.liver, 
    healthyVSsick_muscle = healthy.muscle - sick.muscle, 
    liverVSmuscle_healthy = healthy.liver - healthy.muscle,
    liverVSmuscle_sick = sick.liver - sick.muscle, 
    levels = design)
colnames(contrasts)

DMRs <- list()
for (i in colnames(contrasts)){
  fit <- contrasts.fit(v, contrasts = contrasts[,i])  
  fit <- eBayes(fit)
  DMRs[[i]] <- topTable(fit, 
                        coef = colnames(contrasts[,i]))
}
DMRs

Thank you so much for your time and insights in advance.

edgeR voomLmFit limma duplicateCorrelation voom • 2.2k views

ADD COMMENT • link updated 8 months ago by Gordon Smyth 52k • written 3.6 years ago by altintas.ali • 0

score 2 · Accepted Answer · 2021-11-24

2

Entering edit mode

Gordon Smyth 52k

@gordon-smyth

Last seen 14 hours ago

WEHI, Melbourne, Australia

I don't see any problems with the approach you've taken. Yes, you can use voomLmFit instead of edgeR in this way. ~~Yes, you can use duplicateCorrelation after expanding the blocking variable for the Me/Un structure of the data.~~

Later

As discussed later (Zero intra-block correlation in RRBS analysis using voomLmFit with block), the "Poisson-trick" approach taken by edgeR and limma for methylation data includes a conditioning term for each sample, which prevents the use of duplicateCorrelation or random blocks. The code you show will run but the intra-block correlation will be set to zero.

ADD COMMENT • link 3.6 years ago • updated 8 months ago Gordon Smyth 52k

0

Entering edit mode

Dear Gordon,

Thank you so much for your invaluable feedback. Now, I am more confident using voomLmFit for RRBS data.

Warmest regards!

ADD REPLY • link 3.6 years ago altintas.ali • 0