Question

Handling adjustment variables and surrogate variables in model

0

Entering edit mode

rbutler • 0

@rbutler-20667

Last seen 4.1 years ago

In my dataset, I have a covariate that I wish to correct for cell and I also want to incorporate a surrogate variable SV1 for hidden batch effects. My question is if I pass cell as an adjustment variable to sva, and generate SV1, do I now remove cell from my model design?

# Generate surrogate variable for hidden batch effect
dds = DESeq(dds, parallel=T)
vst = vst(dds)
dat = assay(vst)
mod = model.matrix(~ cell + day + trt, colData(dds))
mod0 = model.matrix(~ cell, colData(dds))
n.sv = num.sv(dat,mod,method="leek") # is 1
svseq = svaseq(dat, mod, mod0, n.sv=n.sv)
dds$SV1 = svseq$sv[,1]

# Is this right?
dds1 = dds
design(dds1) = ~ SV1 + day + trt
dds1 = DESeq(dds1, parallel=T)
vsd1 = vst(dds1)

# Or this?
dds2 = dds
design(dds2) = ~ SV1 + cell + day + trt
dds2 = DESeq(dds2, parallel=T)
vsd2 = vst(dds2)

# having a look see
plotPCA(vst, "cell")
assay(vsd1) = limma::removeBatchEffect(assay(vsd1), batch=vsd1$SV1)
plotPCA(vsd1, "cell") # it definitely modifies cell clustering
assay(vsd2) = limma::removeBatchEffect(assay(vsd2), batch=vsd2$cell, 
                                        covariate=vsd2$SV1)
plotPCA(vsd2, "cell") # doesn't look right

Would it be better to not pass cell to sva as an adjustment variable, and leave it in the model?

mod = model.matrix(~ cell + day + trt, colData(dds))
mod0 = model.matrix(~ 1, colData(dds))
...
design(dds1) = ~ SV1 + cell + day + trt

sva deseq2 • 1.1k views

ADD COMMENT • link updated 4.6 years ago by James W. MacDonald 65k • written 4.6 years ago by rbutler • 0

score 0 · Answer 1 · 2019-09-12

The goal for SVA is to generate surrogate variables for unobserved technical effects. So while you could pretend you don't know about the cell phenotype and use sva to estimate one or more surrogate variables to control for that phenotype, it seems odd to do so. Normally one uses ComBat for that sort of thing.

That said, it's your analysis, and since you are doing the analysis the responsibility for any choices made in the analysis is yours as well. Asking randos on the internet for analysis advice is probably suboptimal (this support site is really intended to help with technical aspects of using the software, not advice as to how best to do an analysis).