Dear all,
I would like to adjust my whole-blood RNA-Seq count data matrix for cell type composition (obtained from hematological analysis and flow cytometry) before doing a coexpression network analysis with WGCNA.
So far, I did the following:
# Use DESeq2's vst to remove the mean-variance relationship in the data
dds <- DESeq2::DESeqDataSetFromMatrix(counts, colData, design = ~ group)
dds <- DESeq2::vst(dds, blind = TRUE)
vst <- SummarizedExperiment::assay(dds)
# Adjust for confounding variables
vst_adjusted <- limma::removeBatchEffect(
  x = vst,
  # cbind() keeps one column per covariate; c() would concatenate them into a single long vector
  covariates = cbind(cellA, cellB, cellC) # numeric vectors containing scaled cell proportions
)
However, according to this link from another forum, I can apparently include the covariates in the design formula when creating the DESeqDataSet and then set blind = FALSE during the variance-stabilizing transformation.
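For readers comparing the options, here is a minimal sketch of the design-plus-blind = FALSE route on simulated data. The simulated dataset, the covariate values, and the names cellA, cellB and vsd are assumptions for illustration only, not the poster's data. Note that vst() with blind = FALSE only uses the design when estimating dispersions; it does not subtract the covariate effects from the transformed values. The sketch therefore still regresses them out afterwards with limma::removeBatchEffect, passing a group design so that the group effect is protected.

```r
library(DESeq2)
library(limma)

set.seed(1)

# Simulated stand-in for the real count matrix (assumption): 2000 genes, 12 samples
dds <- makeExampleDESeqDataSet(n = 2000, m = 12)
dds$group <- dds$condition

# Hypothetical scaled cell proportions, one value per sample
dds$cellA <- as.numeric(scale(runif(12)))
dds$cellB <- as.numeric(scale(runif(12)))

# Put the covariates into the design; blind = FALSE makes the
# dispersion estimation take this design into account
design(dds) <- ~ cellA + cellB + group
vsd <- vst(dds, blind = FALSE)

# The transformed values still contain the covariate effects,
# so remove them here while preserving the group effect
covs <- cbind(cellA = dds$cellA, cellB = dds$cellB)
group_design <- model.matrix(~ group, data = as.data.frame(colData(dds)))
vst_adjusted <- removeBatchEffect(
  assay(vsd),
  covariates = covs,
  design = group_design
)
```

The adjusted matrix keeps the genes-by-samples layout of assay(vsd), so it needs to be transposed before being passed to WGCNA, which expects samples in rows.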
Others recommend using ComBat from sva and passing my covariates to the mod parameter.
Which of these approaches is best for my goal?
Thank you for your kind response.
Best regards, Mikhael
Dear Peter,
Thank you for your prompt response. For empiricalBayesLM, would you advise passing the group variables to the retainedCovariates argument?
Best, Mikhael
Retained covariates are those whose effect you want to preserve. My understanding is that you want to remove the cell type abundance/composition information, not retain it; removed variables should go into the removedCovariates argument.
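As a rough illustration of the two arguments, the following sketch runs WGCNA::empiricalBayesLM on simulated data. The simulated expression matrix, the covariate values, and the names cells, group and expr_adjusted are assumptions made up for this example. Whether to list group under retainedCovariates is a judgment call that depends on the analysis; it is shown here only to make the difference between the two arguments concrete.

```r
library(WGCNA)

set.seed(1)
n_samples <- 20
n_genes <- 200

# Hypothetical scaled cell proportions and a two-level group indicator
cells <- data.frame(cellA = rnorm(n_samples), cellB = rnorm(n_samples))
group <- rep(c(0, 1), each = n_samples / 2)

# Simulated expression with samples in rows and genes in columns,
# which is the orientation WGCNA expects (use t() on a vst matrix)
expr <- matrix(
  rnorm(n_samples * n_genes), n_samples, n_genes,
  dimnames = list(paste0("S", seq_len(n_samples)), paste0("G", seq_len(n_genes)))
)
# Add an artificial cell-composition signal to every gene
expr <- expr + outer(cells$cellA, runif(n_genes))

fit <- empiricalBayesLM(
  data = expr,
  removedCovariates = cells,                      # effects to remove
  retainedCovariates = data.frame(group = group)  # effects to preserve
)

# Adjusted data, same dimensions as the input
expr_adjusted <- fit$adjustedData
```

The returned adjustedData matrix is already samples-by-genes, so it can be passed on to the network construction functions without transposing.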