Zero intra-block correlation in RRBS analysis using voomLmFit with block
1
0
Entering edit mode
@altintasali-11054
Last seen 23 hours ago
Denmark

Dear all,

This is just a follow-up question from an old post related to RRBS analysis using voomLmFit with block.

I am facing an issue where the intra-block correlation is consistently set to zero, meaning that no need to account for blocking, when running voomLmFit for RRBS analysis with a block factor (i.e., patient).

# Experimental design
meta <- data.frame(library = as.factor(1:12),
                   patient = as.factor(rep(1:6, each = 2)),
                   disease = rep(c("healthy", "sick"), each = 6),
                   tissue = rep(c("liver","muscle"), times = 6))
meta$group <- paste(meta$disease, meta$tissue, sep = ".")

# Toy data
set.seed(123)
counts <- matrix(data = round(runif(n = 24000, min = 0, max = 100)),
                 ncol = nrow(meta)*2)
cols <- expand.grid(c("Me","Un"), meta$library)
colnames(counts) <- paste(cols[,2], cols[,1], sep = "_")
rownames(counts) <- paste("CpG", 1:nrow(counts), sep = "_")

# voomLmFit
library(edgeR)
design <- modelMatrixMeth(~0 + group, meta)
colnames(design) <- gsub("group", "", colnames(design))

v <- voomLmFit(counts = counts, 
               design = design, 
               block = rep(meta$patient, each = 2),
               sample.weights = TRUE)

# First sample weights (min/max) 0.8816726/1.2959486
# First intra-block correlation  0
# Final sample weights (min/max) 0.8703724/1.3442574
# Final intra-block correlation  0

At first, I suspected that the simulated data in this example might have led to zero intra-block correlation, but I observed the same behavior with real-life data. Running voomWithQualityWeights and duplicateCorrelation manually confirms this issue, along with the following warning message:

Block factor already encoded in the design matrix: setting intrablock correlation to zero.

Replicating the voomLmFit by running voomWithQualityWeights and duplicateCorrelation twice confirmed this behavior:

vwqw <- voomWithQualityWeights(counts = counts, 
                               design = design)
corfit <- duplicateCorrelation(object = vwqw, 
                               design = design, 
                               block = rep(meta$patient, each = 2)) 

# Warning message:
# In duplicateCorrelation(object = vwqw, design = design, block = rep(meta$patient,  :
#   Block factor already encoded in the design matrix: setting intrablock correlation to zero.

vwqw <- voomWithQualityWeights(counts = counts, 
                               design = design, 
                               block = rep(meta$patient, each = 2), 
                               correlation = corfit$consensus.correlation)
corfit <- duplicateCorrelation(object = vwqw, 
                               design = design, 
                               block = rep(meta$patient, each = 2)) 

# Warning message:
# In duplicateCorrelation(object = vwqw, design = design, block = rep(meta$patient,  :
#   Block factor already encoded in the design matrix: setting intrablock correlation to zero.

In fact, the Sample (library) is nested within the block (patient) inside the methylation design matrix due to the way modelMatrixMeth is created. Therefore, the warning message makes sense.

head(design)

#   Sample1 Sample2 Sample3 Sample4 Sample5 Sample6 Sample7 Sample8 Sample9 Sample10 Sample11 Sample12 healthy.liver healthy.muscle
# 1       1       0       0       0       0       0       0       0       0        0        0        0             1              0
# 2       1       0       0       0       0       0       0       0       0        0        0        0             0              0
# 3       0       1       0       0       0       0       0       0       0        0        0        0             0              1
# 4       0       1       0       0       0       0       0       0       0        0        0        0             0              0
# 5       0       0       1       0       0       0       0       0       0        0        0        0             1              0
# 6       0       0       1       0       0       0       0       0       0        0        0        0             0              0
#   sick.liver sick.muscle
# 1          0           0
# 2          0           0
# 3          0           0
# 4          0           0
# 5          0           0
# 6          0           0

Questions

Q1: Is there a correct approach to account for block (i.e., patient) in voomLmFit, given the warning about the block factor being encoded in the design matrix? For example, would it make sense to use design from model.matrix instead of modelMatrixMeth while running the duplicateCorrelation?

design_normal <- model.matrix(~0 + group, meta)
design_normal <- design_normal[rep(seq_len(nrow(design_normal)), each = 2), ] #replicate each row twice
design_methyl <- modelMatrixMeth(~0 + group, meta)

vwqw <- voomWithQualityWeights(counts = counts, 
                               design = design_methyl, # design for RRBS
                               block = rep(meta$patient, each = 2), 
                               correlation = corfit$consensus.correlation)
corfit <- duplicateCorrelation(object = vwqw, 
                               design = design_normal, # regular design
                               block = rep(meta$patient, each = 2))

Q2: Should the warning message "Block factor already encoded in the design matrix: setting intrablock correlation to zero." from duplicateCorrelation also be passed to voomLmFit? Is the current behavior intentional?

Any insights or suggestions would be greatly appreciated. Thank you so much for your help in advance.

limma duplicateCorrelation edgeR modelMatrixMeth voomLmFit • 99 views
ADD COMMENT
0
Entering edit mode
@gordon-smyth
Last seen 5 hours ago
WEHI, Melbourne, Australia

Thanks for pointing out this issue. Yes, the fact that limma is using sample effects to convert the count data into proportion data means that it is not possible to estimate a block correlation at the same time. My previous advice to you about this was too simple.

No, you can't just double the rows of the "normal" design matrix, as in your code suggestion. The fact that the methylated and unmethylated counts for the same sample are negatively correlated will overwhelm the intrablock correlation. The code you give will yield a correlation that is too low and possibly negative.

On the other hand, I think you could estimate the intrablock correlation by fitting the "normal" design matrix to the methylated counts alone, i.e., just subset the counts matrix to every second column, and then feed the resulting correlation into the full methylation analysis with both methylated and unmethylated columns. This is essentially a statistical research idea though. I think it should work but I have not tried it or done any checking.

I am have been suppressing warnings from duplicateCorrelation() to voomLmFit() because I didn't want the warnings to appear more than once as voomLmFit() iterates. I have now made an update to voomLmFit() on the developmental version of edgeR so that the duplicateCorrelation() warning will be passed on by voomLmFit(), but just once.

ADD COMMENT

Login before adding your answer.

Traffic: 335 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6