Question

Can SVs identified from sva be used for removeBatchEffect() in DESeq2

0

Entering edit mode

Xiao • 0

@821e15d0

Last seen 12 weeks ago

United States

Hi community,

I used sva on DESeq2 normalized count.

mod=model.matrix(~group, colData(dds))

mod0=model.matrix(~1, colData(dds))

sv=svaseq(counts(dds, normalized=TRUE), mod, mod0)

Is it recommended to use these SVs in removeBatchEffect(), and how should it be done if it is recommended?

Thanks.

sva DESeq2 BatchEffect limma • 523 views

ADD COMMENT • link updated 3 months ago by ATpoint ★ 4.0k • written 3 months ago by Xiao • 0

score 0 · Answer 1 · 2024-01-05

0

Entering edit mode

ATpoint ★ 4.0k

@atpoint-13662

Last seen 1 day ago

Germany

I feel like this has been asked and answered many times before. Generally, the recommendations for DESeq2 (and edgeR/limma, see their manuals and many posts on this here on the support site) is to include covariates (like batch, surrogate variables, anything...) into the design rather than using direct adjustment methods.

That means, you would add the SVs to colData(dds) and then do ~SV1+SV2+SV(n)+your_actual_design. You could use removeBatchEffect() on the output of vst or rlog to visualize / approximate the impact of adding these variables, e.g. by doing PCA with and without correction. But this is not input for DESeq2 as it uses raw counts. The removeBatchEffect does not return raw or integer counts, so it's not compatible.

See for example https://github.com/mikelove/preNivolumabOnNivolumab/blob/main/preNivolumabOnNivolumab.knit.md which is a little case study from Mike Love using RUVseq (same principles apply for sva) with DESeq2.

ADD COMMENT • link 3 months ago ATpoint ★ 4.0k

0

Entering edit mode

Thanks ATpoint for your comments. I totally agree with the way how the SVs should be included in the design matrix.

My question is more on removeBatchEffect(). If I understand what you've said correctly, assuming normTransform() is used, the correction step should be

mod=model.matrix(~group, data=colData(dds))

mod0=model.matrix(~SV1+SV2+SV(n), data=colData(dds))

ntd=normTransform(dds)

ntd_cor=ntd

assay(ntd_cor)=limma::removeBatchEffect(assay(ntd), covariates=mod0, design=mod)

.

Since the SVs is calculated based on the normalized count, as suggested in here, is it more appropriate to use removeBatchEffect() based on the normalized count and then return to log scale, such as

assay(ntd_cor)=log2(limma::removeBatchEffect(counts(dds, normalized=TRUE), covariates=mod0, design=mod)+1)

?

ADD REPLY • link 3 months ago Xiao • 0

0

Entering edit mode

assay(ntd_cor)=limma::removeBatchEffect(assay(ntd), covariates=mod0, design=mod)

...should work. No need for log2, in- and output of the function is log2.

ADD REPLY • link 3 months ago ATpoint ★ 4.0k

0

Entering edit mode

Sorry about the typo.

It is indeed

assay(ntd_cor)=limma::removeBatchEffect(assay(ntd), covariates=mod0, design=mod)

.

Do you think it necessary to apply removeBatchEffect() with the expression data and covariates in the same scale? I am a bit concerned here since assay(ntd) is in log2 scale while the covariates are in the normalized count scale.

ADD REPLY • link 3 months ago Xiao • 0

0

Entering edit mode

I later find that removeBatchEffect() assumes log-expression values as input. So it should be fine to use

assay(ntd_cor)=limma::removeBatchEffect(assay(ntd), covariates=mod0, design=mod)

as you've suggested.

ADD REPLY • link 3 months ago Xiao • 0

0

Entering edit mode

Yes, see the help page for this function and the DESeq2 vignette which covers this.

ADD REPLY • link 3 months ago ATpoint ★ 4.0k