RUVSEq and VSD integration
Mattia ▴ 10
@mattia-9769
Milano

Hi,

I would like to create a matrix of normalized, "batch-free" values for downstream analysis (not differential expression analysis, but machine learning classification). I would also like to account for potential "unwanted variation" in the count data.

My idea is to combine functions from DESeq2 ('vsd' in particular) and RUVSeq packages.

Let's say I obtained 4 continuous factors of unwanted variation (W_1, W_2, W_3, W_4) from the RUVg function implemented in RUVSeq. I was thinking of following one of these two approaches:

Appr. 1):

library(DESeq2)
library(limma) # provides removeBatchEffect()
dds <- DESeqDataSetFromMatrix(countData = rowCountsOfExpressedGenes, colData = colData, design = ~ conditions)
dds <- DESeq(dds) # to estimate all DESeq parameters
vsd <- varianceStabilizingTransformation(dds)
# assuming 'set' is the SeqExpressionSet returned by RUVg, so pData(set) holds W_1 to W_4
covar <- as.matrix(pData(set)[, c("W_1", "W_2", "W_3", "W_4")])
vsd_nobatch <- removeBatchEffect(assay(vsd), design = model.matrix(~conditions), covariates = covar)

This is similar to A: How do I extract read counts from DESeq2

OR

Appr. 2)

dds <- DESeqDataSetFromMatrix(countData = rowCountsOfExpressedGenes, colData = colData , design = ~ W_1 + W_2 + W_3 + W_4 + conditions)
dds <- DESeq(dds)
vsd_nobatch <- varianceStabilizingTransformation(dds , blind = FALSE)

My questions are:

1) Can I integrate factors of unwanted variation obtained with RUVSeq (Singular Value Decomposition technique) with DESeq normalization?

2) If yes, should I use Appr 1 or Appr 2 or another approach?

Thanks in advance,

Mattia.

 

deseq2 ruvseq covariates continuous vsd • 2.1k views
@mikelove
United States

varianceStabilizingTransformation() with blind=FALSE doesn't use the design to remove mean shifts associated with covariates. It's a bit hard to explain (I go into more depth in the vignette section on transformations), but the design is only used to estimate dispersions. The global dispersion trend is then used to formulate the transformation, which is applied to size-factor-adjusted counts. This is nothing like removeBatchEffect(), which removes mean shifts associated with the covariates.

So I would recommend #1 if you want to remove effects associated with W1-W4 from variance stabilized data for downstream machine learning / classification tasks.
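
To make the "mean shift" point concrete, here is a minimal base-R sketch of the kind of adjustment removeBatchEffect() performs: fit the condition and the covariates jointly for each gene, then subtract only the fitted covariate part. Everything in it is an assumption for illustration: the data are simulated, only two covariates are used, and names such as y, W and y_clean are made up. It is not limma's actual implementation, just the same idea on a toy matrix.

```r
# Toy illustration on simulated data (not from this thread).
set.seed(1)
n_genes <- 50
n_samples <- 12
conditions <- factor(rep(c("A", "B"), each = n_samples / 2))

# Two hypothetical continuous factors of unwanted variation
W <- matrix(rnorm(n_samples * 2), ncol = 2,
            dimnames = list(NULL, c("W_1", "W_2")))
# Center the covariates so each gene's average level is left unchanged
Wc <- scale(W, center = TRUE, scale = FALSE)

# Simulated log-scale expression (genes x samples):
# baseline + condition effect + covariate effect + noise
beta_cond <- rnorm(n_genes)
beta_W <- matrix(rnorm(n_genes * 2), ncol = 2)
y <- 8 + outer(beta_cond, as.numeric(conditions == "B")) +
  beta_W %*% t(W) +
  matrix(rnorm(n_genes * n_samples, sd = 0.1), n_genes, n_samples)

# Fit condition and covariates jointly, one regression per gene
design <- model.matrix(~ conditions)
X <- cbind(design, Wc)
fit <- lm.fit(X, t(y))  # coefficients: ncol(X) rows, one column per gene
w_rows <- (ncol(design) + 1):ncol(X)
coef_W <- fit$coefficients[w_rows, , drop = FALSE]

# Subtract only the fitted covariate component; the condition effect stays
y_clean <- y - t(Wc %*% coef_W)
```

Keeping the condition term in the fit is what protects the biological signal: the covariate coefficients are estimated conditional on it, so only variation attributable to the W columns is removed.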

Thanks a lot, Michael, for your quick and very useful reply.

Mattia.
