Question

How to shortlist the unaffected gene list from DESeq2 results

1

Entering edit mode

rohitsatyam102 ▴ 20

@rohitsatyam102-24390

Last seen 5 weeks ago

India

Hi

I was curious to fish out genes that are unaffected no-matter what was the condition. I am comparing GE in two tissues and I was hoping to find some genes whose expression doesn't vary.

For such genes, which do not show changes at genic level, I wish to see if there are changes at the isoform level or not.

Having said that I was using the following code to subset the genes (considering genes with FC<1.5 as unaffected genes)

subset(res, padj<=0.05 & abs(l$log2FoldChange)<0.58)

However, I don't get any genes at these cutoffs

DESeq2 • 1.6k views

ADD COMMENT • link updated 2.7 years ago by Michael Love 41k • written 3.5 years ago by rohitsatyam102 ▴ 20

1

Entering edit mode

Think about it. Pvalues are the probabilities that the Null (typically that the FC is 0) is true. Hence, if you filter for p < 0.05 you get the genes that have a low probability to be zero, so the opposite of what you want. I guess a reasonable filter for strong evidence of no DE would be a large nominal p-value, a small FC close to 0 and a proper basemean to ensure genes are decently expressed.

ADD REPLY • link 3.5 years ago ATpoint ★ 4.0k

0

Entering edit mode

Thanks, ATpoint

Yes, I realized a day after posting it and going through the vignette again. Our Null hypothesis in starting is usually that there is no change in the gene expression. If the padj values < 0.05 we reject the null hypothesis and accept the alternate hypothesis that there is a significant change in gene expression. So to fish out genes that are not affected by treatment, maybe I need to consider padj>0.05 criteria. Having said that, I need to confirm this: So for genes with padj>0.05, the LFC values doesn't make any sense (even if they are as small as 0.6 or s high as 22)? So criteria of using merely padj would suffice, right?

ADD REPLY • link 3.5 years ago rohitsatyam102 ▴ 20

score 4 · Accepted Answer · 2020-11-09

4

Entering edit mode

Michael Love 41k

@mikelove

Last seen 4 minutes ago

United States

From the DESeq2 paper:

Sometimes, a researcher is interested in finding genes that are not, or only very weakly, affected by the treatment or experimental condition. This amounts to a setting similar to the one just discussed, but the roles of the null and alternative hypotheses are swapped....

We have a mechanism for performing this test, see the vignette section talking about altHypothesis.

ADD COMMENT • link 3.5 years ago Michael Love 41k

0

Entering edit mode

Oh!, I was unaware of this feature. Shall get back to you if I faced any problem using it. Thanks, Michael Love .

EDIT1: So for my purpose I have to use unshrunken values of FC

Note the lack of LFC shrinkage: to find genes with weak differential expression, DESeq2 requires that the LFC shrinkage has been disabled. This is because the zero-centered prior used for LFC shrinkage embodies a prior belief that LFCs tend to be small, and hence is inappropriate here.

So in my case wherein I am looking for unaffected genes the following would be the condition?

resLA <- results(dds, lfcThreshold=.5, altHypothesis="lessAbs")

EDIT:2 Also does setting altHypothesis=lessAbs means that now our null hypothesis is set to there is a differential gene expression. What it is doing in the background?

ADD REPLY • link 3.5 years ago rohitsatyam102 ▴ 20

0

Entering edit mode

Hi Michael Love

Please, can you confirm, if the aforementioned usage of alt hypothesis is the right way for my case?

ADD REPLY • link 3.4 years ago rohitsatyam102 ▴ 20

1

Entering edit mode

Yes that is correct. For what it is doing please refer to the 2014 paper.

ADD REPLY • link 3.4 years ago Michael Love 41k

0

Entering edit mode

HI Michael Love I quickly wish to confirm if this is the right way to subset the Non-DE genes

1: ## Subsetting non-DE genes using altHypothesis argument
res1 <- results(dds, contrast = c("group","layer2","layer1"),lfcThreshold=0.5, altHypothesis="lessAbs")
nonDE.res1 <- subset(data.frame(res1), padj > 0.05 & abs(res1$log2FoldChange) < 1)

dim(nonDE.res1)
#[1] 7186    6

2: ## Subsetting non-DE genes using what ATpoint suggested
res.test <- results(dds, contrast = c("group","layer2","layer1"),lfcThreshold=0.5)
nonDE.res.test <- subset(data.frame(res.test), padj > 0.05 & abs(res.test$log2FoldChange) < 1)
dim(nonDE.res.test)
#[1] 6221    6

length(intersect(rownames(nonDE.res1),rownames(nonDE.res.test)))
#[1] 6221

Both approaches are producing different numbers of nonDE-Genes though undoubtedly there is quite an overlap. When I further went to check which genes got missed out using the second approach those were the genes having padj = NA. Am I comparing it correctly!! I wish to know if I am subsetting nonDE genes properly or not.

ADD REPLY • link 2.7 years ago rohitsatyam102 ▴ 20

1

Entering edit mode

See the DESeq2 2014 paper and plotMA of these two res objects. It's important to understand the difference and what you are doing.

In the first case, you are testing for LFC that are less than some threshold in absolute value. So you want the small p-value genes, but here you are filtering for the large p-value genes. Make sure you follow the paper section on thresholds (also it's described in the vignette).

ADD REPLY • link 2.7 years ago Michael Love 41k