How to shortlist the unaffected gene list from DESeq2 results
1
1
Entering edit mode
@rohitsatyam102-24390
Last seen 22 days ago
India

Hi

I was curious to fish out genes that are unaffected no-matter what was the condition. I am comparing GE in two tissues and I was hoping to find some genes whose expression doesn't vary.

For such genes, which do not show changes at genic level, I wish to see if there are changes at the isoform level or not.

Having said that I was using the following code to subset the genes (considering genes with FC<1.5 as unaffected genes)

subset(res, padj<=0.05 & abs(l$log2FoldChange)<0.58)  However, I don't get any genes at these cutoffs DESeq2 • 460 views ADD COMMENT 1 Entering edit mode Think about it. Pvalues are the probabilities that the Null (typically that the FC is 0) is true. Hence, if you filter for p < 0.05 you get the genes that have a low probability to be zero, so the opposite of what you want. I guess a reasonable filter for strong evidence of no DE would be a large nominal p-value, a small FC close to 0 and a proper basemean to ensure genes are decently expressed. ADD REPLY 0 Entering edit mode Thanks, ATpoint Yes, I realized a day after posting it and going through the vignette again. Our Null hypothesis in starting is usually that there is no change in the gene expression. If the padj values < 0.05 we reject the null hypothesis and accept the alternate hypothesis that there is a significant change in gene expression. So to fish out genes that are not affected by treatment, maybe I need to consider padj>0.05 criteria. Having said that, I need to confirm this: So for genes with padj>0.05, the LFC values doesn't make any sense (even if they are as small as 0.6 or s high as 22)? So criteria of using merely padj would suffice, right? ADD REPLY 4 Entering edit mode @mikelove Last seen 1 hour ago United States From the DESeq2 paper: Sometimes, a researcher is interested in finding genes that are not, or only very weakly, affected by the treatment or experimental condition. This amounts to a setting similar to the one just discussed, but the roles of the null and alternative hypotheses are swapped.... We have a mechanism for performing this test, see the vignette section talking about altHypothesis. ADD COMMENT 0 Entering edit mode Oh!, I was unaware of this feature. Shall get back to you if I faced any problem using it. Thanks, Michael Love . EDIT1: So for my purpose I have to use unshrunken values of FC Note the lack of LFC shrinkage: to find genes with weak differential expression, DESeq2 requires that the LFC shrinkage has been disabled. This is because the zero-centered prior used for LFC shrinkage embodies a prior belief that LFCs tend to be small, and hence is inappropriate here.  So in my case wherein I am looking for unaffected genes the following would be the condition? resLA <- results(dds, lfcThreshold=.5, altHypothesis="lessAbs")  EDIT:2 Also does setting altHypothesis=lessAbs means that now our null hypothesis is set to there is a differential gene expression. What it is doing in the background? ADD REPLY 0 Entering edit mode Please, can you confirm, if the aforementioned usage of alt hypothesis is the right way for my case? ADD REPLY 1 Entering edit mode Yes that is correct. For what it is doing please refer to the 2014 paper. ADD REPLY 0 Entering edit mode HI Michael Love I quickly wish to confirm if this is the right way to subset the Non-DE genes 1: ## Subsetting non-DE genes using altHypothesis argument res1 <- results(dds, contrast = c("group","layer2","layer1"),lfcThreshold=0.5, altHypothesis="lessAbs") nonDE.res1 <- subset(data.frame(res1), padj > 0.05 & abs(res1$log2FoldChange) < 1)

dim(nonDE.res1)
#[1] 7186    6

2: ## Subsetting non-DE genes using what ATpoint suggested
res.test <- results(dds, contrast = c("group","layer2","layer1"),lfcThreshold=0.5)
nonDE.res.test <- subset(data.frame(res.test), padj > 0.05 & abs(res.test\$log2FoldChange) < 1)
dim(nonDE.res.test)
#[1] 6221    6

length(intersect(rownames(nonDE.res1),rownames(nonDE.res.test)))
#[1] 6221


Both approaches are producing different numbers of nonDE-Genes though undoubtedly there is quite an overlap. When I further went to check which genes got missed out using the second approach those were the genes having padj = NA. Am I comparing it correctly!! I wish to know if I am subsetting nonDE genes properly or not.

1
Entering edit mode

See the DESeq2 2014 paper and plotMA of these two res objects. It's important to understand the difference and what you are doing.

In the first case, you are testing for LFC that are less than some threshold in absolute value. So you want the small p-value genes, but here you are filtering for the large p-value genes. Make sure you follow the paper section on thresholds (also it's described in the vignette).