Extracting normalized DESeq2 counts for a multi-factor design?
feargalr • 0
@feargalr-7794

Hello, 

In a multi-factor design, DESeq2 accounts for changes associated with one variable while testing for changes associated with another. In section 1.5 of the vignette, for example, it controls for type while testing for differences in condition. I assume it calculates a normalization factor plus a size factor and then adjusts the counts accordingly?

In that vignette example, is it possible to access the counts that have been normalized for both type and size factor?

Thanks

deseq2 • 3.2k views
@mikelove

You're right that the size factor/normalization factor (s_ij) and the factor effects play similar roles. For a standard model we have:

E(count) = s_ij 2^(beta_intercept + beta_type + beta_cond) (1)

so we could also rearrange terms and write this:

E(count) = s_ij s_type 2^(beta_intercept + beta_cond) (2)

...where s_type = 2^beta_type.

But in the software and model, we keep the type effect in the exponent, as in (1). 
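A quick numeric check of the rearrangement, as a minimal sketch in base R. The values below are made up for illustration (they are not estimates from any dataset), and the variable names simply mirror the symbols in (1) and (2):

```r
# Illustrative values only; none of these come from a fitted model.
s_ij           <- 1.2   # size/normalization factor for one sample and gene
beta_intercept <- 5.0   # log2-scale intercept
beta_type      <- 0.8   # log2 fold change for "type"
beta_cond      <- 1.5   # log2 fold change for "condition"

# Right-hand side of (1): type effect kept in the exponent.
rhs_1 <- s_ij * 2^(beta_intercept + beta_type + beta_cond)

# Right-hand side of (2): type effect pulled out as a multiplicative factor.
s_type <- 2^beta_type
rhs_2  <- s_ij * s_type * 2^(beta_intercept + beta_cond)

# The two forms agree up to floating-point tolerance.
same <- isTRUE(all.equal(rhs_1, rhs_2))
print(same)
```

In other words, on the count scale the type effect acts like one more scaling factor, which is why it resembles a normalization factor.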

It's on my to-do list to allow easier plotting of normalized counts with the estimated group/batch effects (like "type" here) removed, which would make it easier to see the condition effect, for example. But this is not yet implemented. I'll ping this thread when I've added something to the devel branch.


Awesome, thanks!


Michael,

How can I modify the model and extract batch-effect-normalized counts using R commands? Or do I have to override some of the implemented functions?

Tom


Thank you for your reply.

When I call removeBatchEffect(), should the 'batch' argument be set, i.e. removeBatchEffect(rlogMat, batch=MyBatch)?

With 'batch=NULL', I only got the same matrix as the input rlogMat, and with 'batch=MyBatch', the output matrix was somewhat different.

As far as I understand, rlog/vst values are already normalized for mean shifts, so I am wondering whether setting 'batch=MyBatch' in removeBatchEffect() would be double counting?

Thanks,


Yes, you should provide the variable whose associated variation you want removeBatchEffect() to remove.

No, rlog and VST do not normalize for mean shifts associated with variables in the design. This is explained in the vignette in the section on transformations and in the man pages for the functions.

The only place the design is used is in estimating the global trend (the trend across all genes) of dispersions over the mean, and only when blind=FALSE.


Thanks, I was assuming rlog/vst normalize out batch-associated trends when blind=FALSE.

If the data have an obvious batch effect and I want to do downstream analyses such as unsupervised clustering with batch-effect-free expression values, would the removeBatchEffect() output be more appropriate than rlogMat?

I'm suggesting you pass the VST or rlog transformed matrix to removeBatchEffect().
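For readers who want to try this, here is a minimal sketch of that workflow, not a definitive recipe. It assumes the DESeq2 and limma packages are installed; the simulated dataset from makeExampleDESeqDataSet() and the `batch` column added to it are stand-ins invented for illustration, so substitute your own object and batch variable:

```r
library(DESeq2)
library(limma)

set.seed(1)
# Simulated data standing in for a real experiment (2000 genes, 12 samples).
dds <- makeExampleDESeqDataSet(n = 2000, m = 12)

# Hypothetical batch variable, balanced across the two conditions.
dds$batch <- factor(rep(c("b1", "b2"), times = 6))
design(dds) <- ~ batch + condition

# blind = FALSE lets the design inform the dispersion trend only;
# batch-associated shifts are still present in the transformed values.
vsd <- vst(dds, blind = FALSE)
mat <- assay(vsd)

# Pass the condition design so the condition effect is preserved
# while variation associated with batch is removed.
mm <- model.matrix(~ condition, data = as.data.frame(colData(vsd)))
mat_nobatch <- removeBatchEffect(mat, batch = vsd$batch, design = mm)
```

The resulting `mat_nobatch` can be used for clustering or plotting, for example by assigning it back with `assay(vsd) <- mat_nobatch` before calling `plotPCA(vsd)`. For differential testing, the usual approach is to keep the batch variable in the design rather than to test on adjusted values.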
