Question: DESeq2: Do basemean values take into account interaction effects?
2.9 years ago by
snamjoshi8730 wrote:

In an earlier question, "DESeq2: Appropriate way to deal with knockouts in experiment design (RIPSeq)", I asked how DESeq2 could be used to account for the effects of sequencing from knockout (KO) tissue. This is possible by using interactions. If I do this, the results table includes a "baseMean" column. Looking at some other posts, I understand this is calculated by taking the mean of the normalized count data.

I would like to run clustering or classification algorithms on my counts AFTER the effect of the KO sequences has been taken into account. Would it be appropriate to use the baseMean values to do this? In other words, assuming I have built my model to take into account the interaction with the KO, is baseMean giving me normalized counts adjusted for the effect of the KO? Looking at how it is calculated, I'm not so sure, but I might be misunderstanding something...

The second issue is that, according to the post "DESeq2 baseMean counts", the baseMean returned by DESeq2 does not take transcript length into account, so it's probably not appropriate for my downstream applications. If I use the counts() function, though, I don't see how I can get back counts after taking the KO tissue into account. It just gives me back all the counts for all samples.

So is there a way to obtain normalized counts from DESeq2, adjusted for the effects of my KO sequences, that I can use in downstream applications?

modified 2.9 years ago by Steve Lianoglou12k • written 2.9 years ago by snamjoshi8730
Answer: DESeq2: Do basemean values take into account interaction effects?
2.9 years ago by
Steve Lianoglou12k wrote:

baseMean is simply the average of the normalized counts (raw counts divided by each sample's size factor, i.e. adjusted for sequencing depth) for the gene across all samples in your DESeqDataSet. The design of the experiment is not taken into account at all.
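To make that concrete, here is a minimal base-R sketch of the calculation, assuming the default median-of-ratios size factors. The toy count matrix, the sample names, and the helper median_ratio_size_factors are made up for illustration and are not part of DESeq2; in a real analysis the equivalent would be rowMeans(counts(dds, normalized=TRUE)). Note that no design or sample grouping appears anywhere in the computation.

```r
# Toy raw counts: 5 genes x 4 samples (hypothetical values).
counts <- matrix(
  c(100, 200,  50,  400, 10,   # WT1
    200, 400, 100,  800, 20,   # WT2 (WT1 sequenced twice as deep)
     90, 210,  55,  380, 12,   # KO1
    300, 650, 140, 1150, 35),  # KO2
  nrow = 5,
  dimnames = list(paste0("gene", 1:5), c("WT1", "WT2", "KO1", "KO2"))
)

# Median-of-ratios size factors: for each sample, the median ratio of its
# counts to the per-gene geometric mean across all samples.
median_ratio_size_factors <- function(counts) {
  log_geo_means <- rowMeans(log(counts))
  apply(counts, 2, function(cnts) {
    keep <- is.finite(log_geo_means) & cnts > 0
    exp(median((log(cnts) - log_geo_means)[keep]))
  })
}

size_factors <- median_ratio_size_factors(counts)

# Normalized counts: divide each column by its sample's size factor.
norm_counts <- sweep(counts, 2, size_factors, "/")

# baseMean: per-gene mean of normalized counts over ALL samples.
# WT/KO labels are never used, so the design cannot influence it.
base_mean <- rowMeans(norm_counts)
print(round(base_mean, 1))
```

Relabelling the columns (say, swapping which samples are called WT and KO) leaves base_mean unchanged, which is why it cannot serve as a KO-adjusted expression value.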

Further, it's really not clear what you mean by "taking into account the KO effect" -- how do you want to account for it? Do you want to treat it as a batch effect (of sorts) and regress it out? Seems strange, but it's your analysis, so:

If that's the case, you'll want to transform your counts using one of DESeq2's transformations (e.g. vst or rlog, which are variance stabilizing, or normTransform, which is a plain log2 of normalized counts plus a pseudocount). Then you can use limma's removeBatchEffect function, passing along a vector of WT/KO entries as its batch parameter.
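A rough sketch of that route, run on DESeq2's simulated example data rather than a real RIP-seq experiment. The genotype column and its WT/KO assignment are invented here for illustration, and keeping condition in the design passed to removeBatchEffect is my assumption about which effect you would want to preserve:

```r
library(DESeq2)
library(limma)

set.seed(1)
# Simulated data set: 2000 genes, 8 samples, with a two-level 'condition'.
dds <- makeExampleDESeqDataSet(n = 2000, m = 8)

# Hypothetical WT/KO labels, balanced across the two conditions.
dds$genotype <- factor(rep(c("WT", "KO"), times = 4), levels = c("WT", "KO"))
design(dds) <- ~ genotype + condition

# Variance stabilizing transformation; blind = FALSE lets the dispersion
# estimates use the design instead of treating all samples as replicates.
vsd <- vst(dds, blind = FALSE)
mat <- assay(vsd)

# Regress out the WT/KO effect while protecting the condition effect.
cond_design <- model.matrix(~ condition, data = as.data.frame(colData(vsd)))
adjusted <- removeBatchEffect(mat, batch = vsd$genotype, design = cond_design)

# 'adjusted' is on the transformed (roughly log2) scale and could be passed
# to clustering or classification code.
dim(adjusted)
```

The output is a genes-by-samples matrix on the transformed scale, not counts, and it is intended for visualization and clustering rather than as input to further differential expression testing.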

When I say "taking into account", I mean this: since my experiment involves extracting an RNA subpopulation with beads and an antibody, some background binding will occur, so some of the counts in the experimental samples could be inflated simply because RNAs may bind to the anchor used to isolate the subpopulation. The KO gives you a sense of how many background counts you are getting. So "taking into account" means: given that my experimental counts are inflated, how much of this effect is actually due to what we see in the KO?

DESeq2 will allow me to consider WT and KO as part of a design setup with interaction terms. It then gives me differentially expressed genes and fold-changes. But I want to see what the normalized counts are AFTER model fitting where interaction terms are taken into account. Not just the fold-changes.

I am not certain that what you suggested with a batch effect will give me what I am looking for.