Question

P values from Deseq Analysis

Shrikant Pawar • 0

@spawar2-12992

Last seen 3.6 years ago

USA/New Haven/Yale University

Dear All,

I am analyzing my NGS data with the DESeq package and have a couple of questions about it. Could you help me answer them?

After I apply nbinomTest, I get p-values ranging from 0 to 1, and fold changes both above and below 1.5. Assuming these are up- and down-regulated genes, how do I interpret their p-values? If the fold change is greater than 1.5, the gene is upregulated, so what should its expected p-value be? (Normally a p-value below 0.05 is significant, but in my case the p-values are spread over the whole range from 0 to 1.)
When I look at the adjusted p-values, most of them are equal to 1. So I am confused: for an up- or down-regulated gene, which p-value decides significance, and how do I report these genes in a paper?

Thanks!

deseq • 6.1k views

ADD COMMENT • link 7.2 years ago Shrikant Pawar • 0

1. If your p-values range from 0 to 1, some of them must be below 0.05. Or do you mean that all of them are above 0.05?

2. The p-value is the probability of observing values at least as extreme as yours if the null hypothesis (H0) is true. A gene can have a p-value of 0.5 and still be of interest. To report in a paper, be clear about your decisions and motives and you'll be fine. Usually they are reported in a table.

ADD REPLY • link 7.2 years ago Lluís Revilla Sancho ▴ 730

Hello Lluís R, thanks for your reply.

1. Out of about 1000 genes, I get at most 10 genes with a p-value below 0.05; the rest are between 0.1 and 1. So I am confused about what my threshold for calling genes significant should be.

2. "The p-value is the probability of observing values at least as extreme as yours if the null hypothesis (H0) is true. A gene can have a p-value of 0.5 and still be of interest. To report in a paper, be clear about your decisions and motives and you'll be fine. Usually they are reported in a table." This sounds reasonable, thanks.

ADD REPLY • link 7.2 years ago Shrikant Pawar • 0

1. If you change the p-value threshold now, you will be p-value fishing. There are few genes under your threshold. Period. You probably want to analyze the problem: maybe you have very few samples and need more power, maybe there is a batch effect or some other confounding factor, maybe the samples need better normalization, or maybe the samples are simply not that different.

ADD REPLY • link 7.2 years ago Lluís Revilla Sancho ▴ 730

No. It is correct that p = 0.5 does not show that the gene is uninteresting, but that does not make the gene worth reporting. A high p-value simply says that the experiment failed to provide any new insight into the gene, and hence that you cannot draw any conclusions from that data.

ADD REPLY • link 7.2 years ago Simon Anders ★ 3.7k

score 4 · Answer 1 · 2017-05-10

You don't have any significant genes.

Remember the definition of a p-value: if you test N genes and in reality there are no differences at all, just random fluctuation, then you expect about pN genes to show a p-value smaller than p (because p-values are supposed to have a uniform distribution under the null hypothesis).

Hence, if you have, say, 10,000 genes, then you expect about 500 of them to have p<0.05 just by chance. As you see even fewer than that, your experiment has clearly failed.
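The "pN by chance" argument can be checked with a small simulation. This is only a sketch under stated assumptions: it uses a two-sided z-test with known variance as a stand-in for a differential expression test (it is not DESeq's negative binomial test), and the function names `two_sided_z_pvalue` and `simulate_null_pvalues` are made up for illustration.

```python
import math
import random


def two_sided_z_pvalue(group_a, group_b, sigma=1.0):
    """Two-sided z-test p-value for a difference in means, sigma assumed known."""
    n_a, n_b = len(group_a), len(group_b)
    diff = sum(group_a) / n_a - sum(group_b) / n_b
    standard_error = sigma * math.sqrt(1.0 / n_a + 1.0 / n_b)
    z = diff / standard_error
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)), the two-sided tail area.
    return math.erfc(abs(z) / math.sqrt(2.0))


def simulate_null_pvalues(n_genes, n_reps=4, seed=0):
    """Simulate genes with NO true difference between treatment and control."""
    rng = random.Random(seed)
    pvalues = []
    for _ in range(n_genes):
        # Both groups are drawn from the same distribution, so H0 is true.
        treatment = [rng.gauss(0.0, 1.0) for _ in range(n_reps)]
        control = [rng.gauss(0.0, 1.0) for _ in range(n_reps)]
        pvalues.append(two_sided_z_pvalue(treatment, control))
    return pvalues


n_genes = 10_000
pvalues = simulate_null_pvalues(n_genes)
n_below = sum(p < 0.05 for p in pvalues)
print(f"{n_below} of {n_genes} null genes have p < 0.05 "
      f"(expected about {0.05 * n_genes:.0f})")
```

Even though no simulated gene is truly different, roughly 5% of them fall below 0.05, which is why a handful of small raw p-values among many tested genes is not evidence of differential expression.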

Look up "Benjamini-Hochberg adjustment" to understand adjusted p values, and why you should look at these and not at raw p values when assessing significance in high-throughput experiments.
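For reference, the Benjamini-Hochberg adjustment itself is short. The following is a minimal sketch of the standard procedure; the helper name `benjamini_hochberg` and the example p-values are assumptions made for illustration, and this is not DESeq's own code.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, scaling each by m / rank and
    # enforcing monotonicity with a running minimum (capped at 1).
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# A small made-up example.
small_adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])

# Perfectly uniform p-values, as expected when nothing is differentially
# expressed: 1/1000, 2/1000, ..., 1000/1000.
uniform_pvalues = [i / 1000 for i in range(1, 1001)]
uniform_adjusted = benjamini_hochberg(uniform_pvalues)
n_raw_below = sum(p < 0.05 for p in uniform_pvalues)
n_adj_below = sum(p < 0.05 for p in uniform_adjusted)
print(f"raw p < 0.05: {n_raw_below}, adjusted p < 0.05: {n_adj_below}")
```

In the uniform case every adjusted p-value comes out at 1 (up to floating-point rounding), even though dozens of raw p-values are below 0.05. That is the pattern described in the question: adjusted p-values that are mostly equal to 1 indicate that the raw p-values look like pure noise.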

score 0 · Answer 2 · 2017-05-10

Thanks Dr. Anders and Dr. Lluís R, your inputs are valuable. I guess my experiment didn't show any significant differences. I will look up "Benjamini-Hochberg adjustment" to understand adjusted p-values.

One quick question: What do you mean by normalizing samples better? I applied DESeq on RPKM values with 4 replicates in treatment and 4 replicates in control.

Thank You,

ADD COMMENT • link 7.2 years ago Shrikant Pawar • 0

Can you apply DESeq to RPKM values? I thought you had to use HTSeq, or something similar, to get integer counts for each gene for input into DESeq.

Matthew

ADD REPLY • link 7.2 years ago Matthew McCormack ▴ 180

Sure Dr. Matthew McCormack, I will use HTSeq, or something similar, to get integer counts and then apply DESeq to those counts. Thanks.

ADD REPLY • link 7.1 years ago Shrikant Pawar • 0

By normalizing "better" I mean checking for batch effects and for bias from GC content and/or gene length. Also check that your design is appropriate and reflects the experiment (whether it is paired, whether there are known co-factors...). At least those are the things I check when using limma.

ADD REPLY • link 7.2 years ago Lluís Revilla Sancho ▴ 730

Sure, I will check for batch effects. Thanks.

ADD REPLY • link 7.1 years ago Shrikant Pawar • 0

No, you should not use DESeq on RPKM data.

May I suggest at this point that you read the manual first before proceeding?

Start here: https://www.bioconductor.org/help/workflows/rnaseqGene/
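A minimal sketch of why this matters, assuming the usual RPKM definition (reads per kilobase of transcript per million mapped reads). The function name `rpkm` and the gene lengths, read counts, and library size below are made up for illustration.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


library_size = 20_000_000  # hypothetical number of mapped reads in one sample

# A short gene with few reads and a long gene with many reads.
short_gene_rpkm = rpkm(read_count=10, gene_length_bp=500,
                       total_mapped_reads=library_size)
long_gene_rpkm = rpkm(read_count=1000, gene_length_bp=50_000,
                      total_mapped_reads=library_size)

print(f"short gene: 10 reads   -> RPKM {short_gene_rpkm}")
print(f"long gene:  1000 reads -> RPKM {long_gene_rpkm}")
```

Both genes end up with the same RPKM, although one value rests on 10 reads and the other on 1000. A count-based method uses the size of the raw counts to judge how noisy each measurement is, and that information is lost once the data are converted to RPKM.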

ADD REPLY • link 7.2 years ago Simon Anders ★ 3.7k

Thanks Dr. Anders, I guess that's the issue. I will post an update after applying DESeq to raw counts.

ADD REPLY • link 7.1 years ago Shrikant Pawar • 0

score 0 · Answer 3 · 2017-05-11

Thanks a lot, I will go back and check for batch effects and GC content.

ADD COMMENT • link 7.2 years ago Shrikant Pawar • 0