Question

Finding unchanged genes using altHypothesis = "lessAbs"

1

raya.fai ▴ 60

@rayafai-9396

Last seen 10 months ago

Israel

Hi,

I am using DESeq2 for a traditional RNA-seq analysis to find genes whose expression level changed, and I am also looking for genes whose level did not change. I then characterize each of these two groups of genes.

I wanted to ask you what is more correct for taking the unchanged genes: are these all the genes that did not get a significant padj value or should I use the altHypothesis="lessAbs" and take from this analysis the significant genes which should be the stable genes?

Thank you very much,

Raya

deseq2 lessAbs • 2.6k views

ADD COMMENT • link 5.9 years ago raya.fai ▴ 60

0

Thank you very much for your answer, quick as always.

Raya

ADD REPLY • link 5.9 years ago raya.fai ▴ 60

score 1 · Answer 1 · 2018-05-10

1

Michael Love 41k

@mikelove

Last seen 7 hours ago

United States

"what is more correct for taking the unchanged genes: are these all the genes that did not get a significant padj value"

No, this is not appropriate. Failure to reject the null cannot be taken as evidence of no difference. We discuss this in the DESeq2 paper:

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4302049/

"or should I use the altHypothesis="lessAbs""

Yes, this method in DESeq2 was specifically designed to allow you to find unchanging genes.

We swap the null and alternative hypotheses (and replace the composite null with two simple nulls, which is conservative). It only requires that you specify a minimum effect size, as do all "tests of equivalence".
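To make the "two simple nulls" idea concrete, here is a minimal base R sketch of two one-sided Wald tests for |LFC| < threshold. The helper name `lessabs_pvalue` and the example numbers are hypothetical, and this is a conceptual illustration under a normal approximation, not DESeq2's internal code.

```r
# Hypothetical helper: two one-sided Wald tests for |LFC| < threshold.
# Conceptual sketch only; not taken from the DESeq2 source.
lessabs_pvalue <- function(lfc, se, threshold) {
  # Null 1: LFC >= threshold. Evidence against it grows as lfc falls below threshold.
  p_upper <- pnorm((threshold - lfc) / se, lower.tail = FALSE)
  # Null 2: LFC <= -threshold. Evidence against it grows as lfc rises above -threshold.
  p_lower <- pnorm((lfc + threshold) / se, lower.tail = FALSE)
  # Both nulls must be rejected, so report the larger (conservative) p-value.
  pmax(p_upper, p_lower)
}

# Made-up estimates: one gene near zero, one gene outside the threshold.
p_stable  <- lessabs_pvalue(lfc = 0.0, se = 0.1, threshold = 0.5)
p_outside <- lessabs_pvalue(lfc = 0.6, se = 0.1, threshold = 0.5)
```

In this sketch a gene is only called "unchanged" when its estimate is well inside the interval relative to its standard error, which is why a minimum effect size has to be specified.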

ADD COMMENT • link 5.9 years ago Michael Love 41k

0

Hi,

I hope I can bother you with another question regarding this topic. In order to find the significantly changed genes I used:

ddseq <- DESeq(ddseq)
res <- results(ddseq)

And in order to find the unchanged genes, as you suggest, I used:

ddsNoPrior <- DESeq(ddseq, betaPrior=FALSE)
resLA <- results(ddsNoPrior, lfcThreshold=0.5, altHypothesis="lessAbs")

My question is: how should I treat genes that get significant padj values in both cases? I guess this happens because I do not use any LFC threshold for the significantly changed genes.

Is it OK to say that these genes do change significantly even though they also get a significant padj value with the lessAbs alternative hypothesis?

Thank you again,

Raya

ADD REPLY • link 5.9 years ago raya.fai ▴ 60

0

Yes, it's because you can have a significant LFC of 0.25 in the first part, for example. Why not use an lfcThreshold of 0.5 for the first table? In other words, if you define an absolute LFC below 0.5 as not changing, then why are you interested in finding these genes as part of the normal DE analysis?
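A minimal sketch of this suggestion, assuming the DESeq2 package is installed and using its simulated example data rather than the poster's dataset. The object names, the 0.1 cutoff, and the simulation settings are illustrative assumptions.

```r
library(DESeq2)

# Simulated counts stand in for real data (assumption for illustration).
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 12, betaSD = 1)
dds <- DESeq(dds)

# Use the same threshold for both tables so the two definitions are consistent.
resDE <- results(dds, lfcThreshold = 0.5, altHypothesis = "greaterAbs")
resEQ <- results(dds, lfcThreshold = 0.5, altHypothesis = "lessAbs")

# which() drops NA adjusted p-values (e.g. from independent filtering).
changed   <- which(resDE$padj < 0.1)
unchanged <- which(resEQ$padj < 0.1)

# Genes called in both sets; with a shared threshold one would expect none.
n_both <- length(intersect(changed, unchanged))
```

With a shared threshold, a gene would have to be both significantly above and significantly below 0.5 in absolute LFC to land in both sets, so the overlap should be empty at usual FDR cutoffs.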

ADD REPLY • link 5.9 years ago Michael Love 41k

0

Thank you very much for your answer.

I thought that for the not changing genes I need to use the altHypothesis parameter that also requires the lfcThreshold parameter. Am I wrong? How can I get the not changing genes without a lfcThreshold?

For the changing genes I also do not want a lfcThreshold because even a small but significant change is important to me.

Thank you again,

Raya

ADD REPLY • link 5.9 years ago raya.fai ▴ 60

0

Yes, you need to use lfcThreshold to find "unchanging".

I'm suggesting you should also use lfcThreshold for the first table to define DE, but it's up to you. If you don't, then you have to accept that there may be some genes (e.g. the ones right in the middle) that may be significantly less than your threshold for practically unchanging, but significantly more than 0.
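A small worked example of such a "middle" gene, using made-up values (LFC 0.25, standard error 0.05) and generic Wald-type formulas under a normal approximation. These formulas are a sketch for intuition and are not claimed to match DESeq2's exact computation.

```r
# Hypothetical gene: precisely estimated, small but nonzero effect.
lfc <- 0.25
se <- 0.05
threshold <- 0.5

# Standard two-sided Wald test against LFC = 0.
p_vs_zero <- 2 * pnorm(abs(lfc) / se, lower.tail = FALSE)

# Equivalence-style test: is |LFC| significantly below the threshold?
p_less_abs <- max(
  pnorm((threshold - lfc) / se, lower.tail = FALSE),
  pnorm((lfc + threshold) / se, lower.tail = FALSE)
)

# Thresholded DE test: is |LFC| significantly above the threshold?
p_greater_abs <- min(1, 2 * pnorm((abs(lfc) - threshold) / se, lower.tail = FALSE))
```

Here `p_vs_zero` and `p_less_abs` are both tiny, so the gene is "significantly changed" and "significantly unchanged" at once, while `p_greater_abs` is 1. Testing DE against the same threshold removes the contradiction.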

ADD REPLY • link 5.9 years ago Michael Love 41k

0

Thank you very much for your help

ADD REPLY • link 5.9 years ago raya.fai ▴ 60

0

How do I define a statistically significant result when testing for equivalence? How should multiple testing correction be done when I use "lfcThreshold = log2(max_FC), altHypothesis="lessAbs"" to find the equivalent genes and "lfcThreshold = log2(min_FC), altHypothesis="greaterAbs"" for the DE genes? Do I need to take only the p-values that already fulfill the respective conditions and then apply an FDR correction to those genes? Otherwise many genes get a p-value of 1, and the FDR procedure then mostly returns FDR = 1.

ADD REPLY • link 5.3 years ago marioreiman • 0

0

FDR correction is done for you. The genes with a p-value and adjusted p-value of 1 are those which are outside of the alternative region. It's not a problem for constructing FDR bounded sets when the alternative hypothesis is equivalence.
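A toy illustration with base R's `p.adjust` and made-up p-values, assuming Benjamini-Hochberg adjustment (the default `pAdjustMethod` in DESeq2's `results`). It shows that genes with p = 1 stay at 1 without preventing the small p-values from being called.

```r
# Made-up p-values: three strong signals and seven genes outside the alternative region.
p <- c(1e-5, 1e-4, 1e-3, rep(1, 7))

# Benjamini-Hochberg adjustment over all tested genes.
padj_all <- p.adjust(p, method = "BH")
n_called <- sum(padj_all < 0.05)

# Dropping the p = 1 genes first shrinks the number of tests and makes the
# adjusted values look smaller than they should; shown only for contrast.
padj_subset <- p.adjust(p[p < 1], method = "BH")
```

Selecting genes by their own p-values before adjusting, as in `padj_subset`, understates the number of tests performed, so the adjusted values over the full set are the ones to report.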

ADD REPLY • link 5.3 years ago Michael Love 41k

0

Thank you for the fast reply!

What value would you suggest for the "max_FC" parameter in the "lfcThreshold = log2(max_FC)" argument?

Currently I use max_FC = min_FC = 1.35, because at that point I had about an equal number of EE and DE genes. However, I am not certain this is correct and would like to know if there is a better way to choose the lfcThreshold value.

ADD REPLY • link 5.3 years ago marioreiman • 0

0

I would discuss with your collaborators and choose a biologically motivated value. I wouldn’t pick a value based on equal numbers of genes.

ADD REPLY • link 5.3 years ago Michael Love 41k