Question

Extracting normalized DESeq2 counts for a multi-factor design?

0

feargalr • 0

@feargalr-7794

Last seen 8.6 years ago

European Union

Hello,

In a multi-factor design, DESeq2 accounts for changes associated with one variable while testing for changes in another. In section 1.5 of the vignette, for example, it controls for type while testing for differences in condition. I assume it calculates a normalization factor plus a size factor and then adjusts the counts accordingly?

In that vignette example, is it possible to access the counts that have been normalized for both type and size factor?

Thanks

deseq2 • 3.7k views

ADD COMMENT • link updated 8.9 years ago by Michael Love 41k • written 8.9 years ago by feargalr • 0

score 1 · Answer 1 · 2015-05-28

1

Michael Love 41k

@mikelove

Last seen 12 hours ago

United States

You're right that the size factor/normalization factor (s_ij) and the factor effects play similar roles. For a standard model we have:

E(count) = s_ij 2^(beta_intercept + beta_type + beta_cond) (1)

so we could also rearrange terms and write this:

E(count) = s_ij s_type 2^(beta_intercept + beta_cond) (2)

...where s_type = 2^beta_type.

But in the software and model, we keep the type effect in the exponent, as in (1).
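As a quick numeric check of the rearrangement, here is a minimal base R sketch that treats the model on the count scale, with the expected count equal to the scaling factor times 2 raised to the sum of the coefficients. All values are hypothetical and chosen only for illustration; nothing here calls DESeq2.

```r
# Hypothetical values for one gene in one sample (not taken from the thread).
s_ij           <- 1.3    # size factor / normalization factor
beta_intercept <- 5.0    # log2 baseline expression
beta_type      <- 0.8    # log2 fold change for "type"
beta_cond      <- -1.2   # log2 fold change for "condition"

# Form (1): the type effect stays in the exponent.
mu_1 <- s_ij * 2^(beta_intercept + beta_type + beta_cond)

# Form (2): the type effect is pulled out as a multiplicative factor.
s_type <- 2^beta_type
mu_2   <- s_ij * s_type * 2^(beta_intercept + beta_cond)

# Both forms give the same expected count.
stopifnot(isTRUE(all.equal(mu_1, mu_2)))

# Dividing out both scaling factors leaves only intercept + condition.
adjusted <- mu_1 / (s_ij * s_type)
```

Dividing by s_ij alone corresponds to ordinary size-factor normalization; dividing by s_type as well is the "type-adjusted" quantity the question asks about.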

I have it on my list of todos, to allow for easier plotting of normalized counts, removing the calculated group/batch effects (like "type" here), to make it easier to see the condition effect for example. But this is not yet implemented. I'll ping this thread when I've added something to the devel branch.

ADD COMMENT • link 8.9 years ago Michael Love 41k

0

Awesome, thanks!

ADD REPLY • link 8.9 years ago feargalr • 0

0

Michael,

How can I modify the model and extract batch-effect-normalized counts using R commands? Or do I have to overwrite some of the implemented functions?

Tom

ADD REPLY • link 7.6 years ago Tom ▴ 10

0

See here:

DESeq2 - Acquiring batch-corrected values for PCA and hierarchical clustering

ADD REPLY • link 7.6 years ago Michael Love 41k

0

Thank you for your reply.

When I call removeBatchEffect(), should the 'batch' argument be set, i.e. removeBatchEffect(rlogMat, batch=MyBatch)?

With 'batch=NULL', I got back the same matrix as the input rlogMat, and with 'batch=MyBatch', the output matrix was somewhat different.

As far as I understand, rlog/VST values are already normalized for mean shifts, so I am wondering whether setting 'batch=MyBatch' in removeBatchEffect() would be double counting?

Thanks,

ADD REPLY • link 7.6 years ago Tom ▴ 10

0

Yes, you should provide the variable whose associated variation you want removeBatchEffect() to remove.
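To illustrate the difference between the two calls, here is a small sketch using limma on simulated log-scale data. The matrix and batch labels reuse the names rlogMat and MyBatch from the question, but their contents are made up, and the shift of 2 added to batch B is an assumption for demonstration only.

```r
library(limma)

set.seed(1)
n_genes <- 50
MyBatch <- factor(rep(c("A", "B"), each = 4))   # hypothetical batch labels

# Simulated log-scale matrix standing in for rlogMat.
rlogMat <- matrix(rnorm(n_genes * 8, mean = 8), nrow = n_genes)
is_B <- MyBatch == "B"
rlogMat[, is_B] <- rlogMat[, is_B] + 2          # batch B shifted up for every gene

# Without a batch variable there is nothing to remove: the input comes back.
unchanged <- removeBatchEffect(rlogMat, batch = NULL)

# With the batch variable, the per-gene batch shift is fitted and subtracted.
corrected <- removeBatchEffect(rlogMat, batch = MyBatch)

# Per-gene difference between batch means, before and after correction.
batch_gap <- function(m) rowMeans(m[, is_B]) - rowMeans(m[, !is_B])
range(batch_gap(rlogMat))
range(batch_gap(corrected))
```

This sketch has no condition of interest. With real data, the variables you want to preserve would normally be passed through the design argument so that they are not absorbed into the batch term.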

No, rlog and VST do not normalize for mean shifts associated with variables in the design. This is explained in the vignette in the section on transformations and in the man pages for the functions.

The only place the design is used is in estimating the global trend (the trend across all genes) of the dispersions over the mean, and only when blind=FALSE.

ADD REPLY • link 7.6 years ago Michael Love 41k

0

Thanks, I was assuming rlog/VST normalize for batch-based trends when blind=FALSE.

If the data has an obvious batch effect and I want to do downstream analyses such as unsupervised clustering with batch-effect-free expression values, would the removeBatchEffect() output be more appropriate than rlogMat?

ADD REPLY • link 7.6 years ago Tom ▴ 10

0

I'm suggesting you pass the VST or rlog transformed matrix to removeBatchEffect().
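A minimal sketch of that workflow on simulated data follows. The batch column, the injected batch shift, and all object names are assumptions for illustration. varianceStabilizingTransformation() is called directly because the small simulated dataset may not have enough well-expressed genes for the faster vst() wrapper.

```r
library(DESeq2)
library(limma)

set.seed(1)
# Simulated dataset; "batch" and the injected shift are illustrative assumptions.
dds <- makeExampleDESeqDataSet(n = 1000, m = 12)
dds$batch <- factor(rep(c("b1", "b2"), times = 6))
design(dds) <- ~ batch + condition

# Inject a gene-specific batch effect: triple the first 300 genes in batch b2.
cts <- counts(dds)
in_b2 <- dds$batch == "b2"
cts[1:300, in_b2] <- cts[1:300, in_b2] * 3L
counts(dds) <- cts

# Variance-stabilize; blind = FALSE uses the design only for the dispersion trend.
vsd <- varianceStabilizingTransformation(dds, blind = FALSE)
mat <- assay(vsd)

# Remove the batch shift while protecting the condition effect via 'design'.
mm <- model.matrix(~ condition, as.data.frame(colData(vsd)))
mat_corrected <- removeBatchEffect(mat, batch = vsd$batch, design = mm)

# Batch gap for the affected genes: still present after VST, gone after correction.
batch_gap <- function(m) rowMeans(m[1:300, in_b2]) - rowMeans(m[1:300, !in_b2])
summary(batch_gap(mat))
summary(batch_gap(mat_corrected))

# Put the corrected values back for plotting or clustering.
assay(vsd) <- mat_corrected
plotPCA(vsd, intgroup = c("condition", "batch"))
```

The corrected matrix is meant for visualization and clustering. For differential testing, the batch variable stays in the design formula and the original counts are used.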

ADD REPLY • link 7.6 years ago Michael Love 41k