Question: Extracting normalized DESeq2 counts for a multi-factor design?
feargalr0 wrote:

Hello, 

In a multi-factor design, DESeq2 accounts for the changes associated with one variable while testing for changes associated with another. In section 1.5 of the vignette, for example, it controls for type while testing for differences in condition. I assume it calculates a normalization factor plus a size factor and then adjusts the counts accordingly?

In that vignette example, is it possible to access the counts that have been normalized for both type and size factor?

Thanks

deseq2 • 2.5k views
ADD COMMENTlink modified 4.4 years ago by Michael Love25k • written 4.4 years ago by feargalr0
Answer: Extracting normalized DESeq2 counts for a multi-factor design?
Michael Love25k wrote:

You're right that the size factor/normalization factor (s_ij) and the factor effects play similar roles: both scale the expected count multiplicatively. For a standard model we have:

E(count) = s_ij * 2^(beta_intercept + beta_type + beta_cond) (1)

so we could also rearrange terms and write this:

E(count) = s_ij * s_type * 2^(beta_intercept + beta_cond) (2)

...where s_type = 2^beta_type.

But in the software and model, we keep the type effect in the exponent, as in (1). 
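
As a quick numerical check that (1) and (2) describe the same expected count, here is a minimal base R sketch. All of the values below are made up purely for illustration; they are not estimates from any dataset.

```r
# Hypothetical values, chosen only to illustrate the algebra
s_ij           <- 1.2   # size/normalization factor
beta_intercept <- 5     # log2 baseline expression
beta_type      <- 0.8   # log2 effect of "type"
beta_cond      <- 1.5   # log2 effect of "condition"

# Form (1): the type effect stays in the exponent
mu_1 <- s_ij * 2^(beta_intercept + beta_type + beta_cond)

# Form (2): the type effect is pulled out as a multiplicative factor
s_type <- 2^beta_type
mu_2   <- s_ij * s_type * 2^(beta_intercept + beta_cond)

# Compare the two forms numerically
all.equal(mu_1, mu_2)
```

The two forms differ only in bookkeeping: whether the type effect is treated as a coefficient or folded into a scaling factor.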

It is on my to-do list to allow easier plotting of normalized counts with the estimated group/batch effects (like "type" here) removed, which would make it easier to see the condition effect, for example. But this is not yet implemented. I'll ping this thread when I've added something to the devel branch.
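
For reference, what can be extracted directly is the set of counts normalized by size factor only, via counts(dds, normalized=TRUE). The sketch below uses simulated data from makeExampleDESeqDataSet(), and the "type" column is added by hand just to mimic the vignette's design, so treat it as an illustration under those assumptions rather than a recipe. Note that the type effect is not removed from these values.

```r
library(DESeq2)

set.seed(1)
# Simulated dataset: 200 genes, 8 samples, with a 'condition' factor
dds <- makeExampleDESeqDataSet(n = 200, m = 8)

# Hypothetical second factor, added only to mimic a ~ type + condition design
dds$type <- factor(rep(c("paired", "single"), times = 4))
design(dds) <- ~ type + condition

dds <- estimateSizeFactors(dds)

# Counts divided by the per-sample size factors (no adjustment for 'type')
norm_counts <- counts(dds, normalized = TRUE)

# The same thing computed by hand
manual <- sweep(counts(dds), 2, sizeFactors(dds), "/")
all.equal(norm_counts, manual)
```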

ADD COMMENTlink written 4.4 years ago by Michael Love25k

Awesome, thanks!

ADD REPLYlink written 4.4 years ago by feargalr0

Michael,

How can I modify the model and extract batch effect-normalized counts in R commands? Or do I have to overwrite some implemented functions?

Tom

ADD REPLYlink written 3.1 years ago by Tom10

See here:

DESeq2 - Acquiring batch-corrected values for PCA and hierarchical clustering

ADD REPLYlink written 3.1 years ago by Michael Love25k

Thank you for your reply.

When I call removeBatchEffect(), should the 'batch' argument be set, i.e. removeBatchEffect(rlogMat, batch=MyBatch)?

With 'batch=NULL', I got back the same matrix as the input rlogMat, and with 'batch=MyBatch', the output matrix was somewhat different.

As far as I understand, rlog/vst values are already normalized for mean shifts, so I am wondering whether setting 'batch=MyBatch' in removeBatchEffect() would be double counting?

Thanks,

ADD REPLYlink written 3.1 years ago by Tom10

Yes, you should provide the variable whose associated variation you want removeBatchEffect() to remove.

No, rlog and VST do not normalize for mean shifts associated with variables in the design. This is explained in the vignette in the section on transformations and in the man pages for the functions.

The only place these transformations use the design is in estimating the global trend (the trend across all genes) of dispersions over the mean, and only when blind=FALSE.
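
To illustrate the point about the batch argument, here is a small sketch on a toy matrix. The matrix, the sample names, and the my_batch factor are all made up for this example; only removeBatchEffect() itself comes from limma.

```r
library(limma)

set.seed(42)
# Toy log-scale expression matrix: 10 genes x 6 samples (hypothetical data)
x <- matrix(rnorm(60, mean = 8), nrow = 10,
            dimnames = list(paste0("gene", 1:10), paste0("s", 1:6)))

# Hypothetical batch labels, with an artificial +2 shift added to batch b2
my_batch <- factor(c("b1", "b1", "b1", "b2", "b2", "b2"))
x[, my_batch == "b2"] <- x[, my_batch == "b2"] + 2

# Without a batch variable there is nothing to remove
unchanged <- removeBatchEffect(x)

# With the batch variable, the batch-associated shift is regressed out
corrected <- removeBatchEffect(x, batch = my_batch)

# Per-gene difference between batch means, before and after correction
gap_before <- rowMeans(x[, 4:6]) - rowMeans(x[, 1:3])
gap_after  <- rowMeans(corrected[, 4:6]) - rowMeans(corrected[, 1:3])
summary(gap_before)
summary(gap_after)
```

Since no other variables are protected through a design matrix here, the per-gene batch means should coincide after correction.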

ADD REPLYlink written 3.1 years ago by Michael Love25k

Thanks, I was assuming rlog/vst normalize for batch-based trends when blind=FALSE.

If the data has an obvious batch effect and I want to do downstream analyses such as unsupervised clustering with batch effect-free expression values, would the removeBatchEffect() output be more appropriate than rlogMat?

ADD REPLYlink written 3.1 years ago by Tom10
I'm suggesting you pass the VST or rlog transformed matrix to removeBatchEffect().
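
A minimal sketch of that workflow, assuming simulated data and a made-up "batch" column (neither comes from this thread), might look like the following. The design matrix passed to removeBatchEffect() is there so that the condition effect is preserved while the batch effect is removed.

```r
library(DESeq2)
library(limma)

set.seed(1)
# Simulated dataset: 1000 genes, 12 samples, with a 'condition' factor
dds <- makeExampleDESeqDataSet(n = 1000, m = 12)

# Hypothetical batch variable, added only for this illustration
dds$batch <- factor(rep(c("A", "B"), times = 6))
design(dds) <- ~ batch + condition

# Variance stabilizing transformation; the design is used only for the dispersion trend
vsd <- varianceStabilizingTransformation(dds, blind = FALSE)
mat <- assay(vsd)

# Protect the condition effect while removing variation associated with batch
mm <- model.matrix(~ condition, colData(vsd))
mat_corrected <- removeBatchEffect(mat, batch = vsd$batch, design = mm)

# Put the corrected values back for PCA or clustering
assay(vsd) <- mat_corrected
plotPCA(vsd, intgroup = "condition")
```

The corrected matrix is meant for visualization and clustering; for differential testing the batch variable stays in the design formula instead.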
ADD REPLYlink written 3.1 years ago by Michael Love25k