Question

DESeq2 batch effects

igor ▴ 40

I realize that the topic of batch effect removal in RNA-seq has been addressed many times. I am not sure if this particular aspect of it was, but I may have missed it. I was looking at the DESeq2 vignette, which has grown substantially over the last few years. Specifically, there is a note:

If there is unwanted variation present in the data (e.g. batch effects) it is always recommend to correct for this, which can be accommodated in DESeq2 by including in the design any known batch variables or by using functions/packages such as svaseq in sva (Leek 2014) or the RUV functions in RUVSeq (Risso et al. 2014) to estimate variables that capture the unwanted variation. In addition, the ashr developers have a specific method for accounting for unwanted variation in combination with ashr (Gerard and Stephens 2017).

Then later:

It is possible to visualize the transformed data with batch variation removed, using the removeBatchEffect function from limma. This simply removes any shifts in the log2-scale expression data that can be explained by batch.

I suppose one section addresses known variables and the other unknown ones. I am curious whether there is a reason why multiple alternatives are mentioned in one section but not in the other (ComBat-seq, for example). Is it just for simplicity, or is removeBatchEffect the best approach?

BatchEffect DESeq2 RNASeq • 1.3k views

updated 3.5 years ago by Michael Love 41k • written 3.5 years ago by igor ▴ 40

score 1 · Accepted Answer · 2020-11-11

The first section is about including estimated nuisance variables (e.g. surrogate variables from sva or factors of unwanted variation from RUVSeq) in the design. The second section is about removing a known batch from VST data; I believe the other packages already have functions that will produce log-scale data with the nuisance variation removed. Note that these serve different purposes, though: the first section is for use with DESeq(), and the second section is for PCA, heatmaps, and other visualizations.
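To make the distinction concrete, here is a minimal sketch of the two purposes side by side. It is an illustration under assumptions, not code from the thread: the data come from DESeq2's `makeExampleDESeqDataSet()`, and the `batch` column is a hypothetical known batch added only for the example (the simulated counts contain no real batch effect).

```r
library(DESeq2)
library(limma)

set.seed(1)
# Simulated counts: 1000 genes, 12 samples, 'condition' with levels A and B
dds <- makeExampleDESeqDataSet(n = 1000, m = 12)
# Hypothetical known batch, balanced across the two conditions (assumption)
dds$batch <- factor(rep(c("b1", "b2"), times = 6))

# Purpose 1: testing. The batch variable goes into the design used by DESeq();
# the counts themselves are not modified.
design(dds) <- ~ batch + condition
dds <- DESeq(dds)
res <- results(dds)  # condition B vs A, adjusted for batch

# Purpose 2: visualization only. Remove the batch shift from the
# variance-stabilized (log2-like) data, protecting the condition effect
# by passing it as the design.
vsd <- varianceStabilizingTransformation(dds, blind = FALSE)
mm <- model.matrix(~ condition, colData(vsd))
assay(vsd) <- limma::removeBatchEffect(assay(vsd), batch = vsd$batch, design = mm)
plotPCA(vsd, intgroup = c("condition", "batch"))
```

The batch-removed matrix in `vsd` is meant for PCA and heatmaps; the differential expression results in `res` come from the original counts with batch in the design.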