Entering edit mode
Djie Tjwan Thung
▴
90
@djie-tjwan-thung-5053
Last seen 10.2 years ago
Dear Gordon Smyth, bioconductor list,
In the limma package we found a function called, removeBatchEffect,
which
removes the effects of batch effects or other technical variables on a
gene
expression matrix. The code of this removeBatchEffect is as follows:
function (x, batch, batch2 = NULL, design = matrix(1, ncol(x), 1)) {
x <- as.matrix(x)
batch <- as.factor(batch)
contrasts(batch) <- contr.sum(levels(batch))
if (is.null(batch2)) {
X <- model.matrix(~batch)[, -1, drop = FALSE]
}
else {
batch2 <- as.factor(batch2)
contrasts(batch2) <- contr.sum(levels(batch2))
X <- model.matrix(~batch + batch2)[, -1, drop = FALSE]
}
X <- qr.resid(qr(design), X)
t(qr.resid(qr(X), t(x)))
}
Now we have an expression matrix coming from a methylation array. As
bisulfite conversion efficiency (BSCE) is a known confounder for this
type
of array, wed like to adjust the matrix for the confounding effects
of
BSCE. BSCE is a numerical variable, however the removeBatchEffect
expects a
categorical variable as it handles the batch variable as a factor.
We were wondering if its valid to remove the following lines of code
in
this function regarding, batch and batch2, so the function treats
these
variables as numerical variables:
batch <- as.factor(batch)
contrasts(batch) <- contr.sum(levels(batch))
If someone is willing, you can test out this altered version by
loading the
.RData in the following link:
http://good.net/dl/au/fa43/contTechnicalVar.RData and using the
following
demo script:
###
#removeNumericBatchEffect(): altered version of removeBatchEffect(),
with the as.factor() and contrasts() lines commented out
#mod: model matrix containing variables of interest (intercept, age
and gender)
# head(mod) will give:
# (Intercept) as.factor(pheno$gender)v pheno$age
#1 1 1 55.9
#2 1 1 32.0
#create random intensity values for bisulfite conversion
bisulfite.conversion <- sample(6000:10000, 95)
#Run adjusted removeBatchEffect function
removeNumericBatchEffect(expr, batch = bisulfite.conversion, design =
mod)
Wed appreciate any input. Weve also considered ComBat from the sva
package as an alternative, but as this function also only accepts
categorical variables, this limma function seems like a better option
to
modify its code for our purpose.
Regards,
Djie Thung
Bioinformatics intern at UMC Utrecht
Dept. of Medical Genetics
[[alternative HTML version deleted]]