Question: Dealing with genes that have Padj=NA
raya.fai wrote, 3.6 years ago:

Hello,

I have a question about genes that get Padj = NA.

In my experiment I compare 3 control samples and 3 treated samples. Out of 4500 genes, about 1800 get Padj = NA. I would like to understand how to treat these genes: should I count them as unchanged, or exclude them from my analysis? Since I want to run a Fisher test on the data, I need to know for each gene whether it changed, did not change, or is undetermined.

As I understand from the vignette this happens because of the automatic independent filtering. I read in section 3.8 that this is an optimization of the FDR correction (optimizing the number of genes which will have an adjusted p value below a given FDR cutoff, alpha).

I also read that it is possible to remove the independent filtering by writing independentFiltering=FALSE in the results function.

My question is how to treat these Padj=NA genes, and what do I lose if I run DESeq2 without the independent filtering?

Thank you very much,

Raya Romm, PhD student

The Hebrew University of Jerusalem

 

Answer: Dealing with genes that have Padj=NA
Michael Love wrote, 3.6 years ago:

Genes with an adjusted p-value of NA have a mean normalized count below the threshold selected by independent filtering. You can make the same plot as in the vignette to see how the power increases as the threshold increases.

There's no single right answer for exactly which filter to use; it's a sliding scale of "counts high enough to have good power to detect differential expression". One choice is to optimize the power, as detailed in the independent filtering reference by Bourgon, which is what you get by default.
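The sliding scale described above can be illustrated with a small base R simulation. This is only a sketch on made-up data, not DESeq2's own procedure: the simulated `base_mean` and `pvalue` vectors, the helper `rejections_at`, and the grid `thetas` are all hypothetical names introduced for the example.

```r
# Toy illustration (simulated data, not DESeq2 output): count BH rejections
# as the mean-count filter threshold slides upward.
set.seed(1)
n_genes <- 2000

# Hypothetical per-gene mean normalized counts.
base_mean <- rexp(n_genes, rate = 1 / 50)

# Assumption for the toy: only genes in the top half by mean count can
# carry signal; every other gene gets a uniform (null) p-value.
is_de <- base_mean > median(base_mean) & runif(n_genes) < 0.2
pvalue <- ifelse(is_de, rbeta(n_genes, 0.1, 5), runif(n_genes))

# Number of genes with BH-adjusted p < alpha after dropping genes whose
# mean count falls below the theta-quantile of base_mean.
rejections_at <- function(theta, base_mean, pvalue, alpha = 0.1) {
  keep <- base_mean >= quantile(base_mean, theta)
  sum(p.adjust(pvalue[keep], method = "BH") < alpha)
}

# Candidate thresholds, expressed as quantiles of the mean count.
thetas <- seq(0, 0.9, by = 0.1)
n_rej <- sapply(thetas, rejections_at, base_mean = base_mean, pvalue = pvalue)

# One row per candidate threshold.
data.frame(theta = thetas, n_rej = n_rej)
```

Optimizing power corresponds roughly to choosing a threshold near the largest `n_rej`; a lower threshold keeps more genes in the table at the cost of some rejections.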

If you want to include more genes (and so have fewer NA adjusted p-values), you can pick a lower threshold x using this plot, and then:

# x is the lower mean-count threshold picked from the plot
res <- results(dds, independentFiltering=FALSE)

res$pvalue[res$baseMean < x] <- NA

res$padj <- p.adjust(res$pvalue, method="BH")
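To see why masking p-values with NA before calling `p.adjust` matters, here is a minimal base R sketch on five made-up genes. The numbers and the threshold `x = 10` are assumptions for illustration only and do not come from any real DESeq2 run.

```r
# Toy p-values for five hypothetical genes; the last two have low mean counts.
pvalue    <- c(0.01, 0.04, 0.03, 0.60, 0.90)
base_mean <- c(500, 120, 80, 3, 1)
x <- 10  # hypothetical threshold, as if read off the filtering plot

# Adjusting over all five genes.
padj_all <- p.adjust(pvalue, method = "BH")

# Adjusting after masking the low-count genes: p.adjust drops NA entries
# from the number of tests and returns NA in their positions.
pvalue_filtered <- pvalue
pvalue_filtered[base_mean < x] <- NA
padj_filtered <- p.adjust(pvalue_filtered, method = "BH")

# Side-by-side comparison of the two adjustments.
cbind(base_mean, pvalue, padj_all, padj_filtered)
```

The masked genes end up with an NA adjusted p-value, and the remaining genes are corrected for three tests instead of five, so their adjusted p-values are no larger than before.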
Answer: Dealing with genes that have Padj=NA
Fuqi Xu wrote, 11 months ago:

I came across the same problem when analyzing my data, and this is how I dealt with it. If the p-value itself is NA, the gene has a sample with an extreme count, that is, an outlier as defined by Cook's distance. The simplest fix is to delete the abnormal observations until the gene returns a valid p-value.

We also need to make sure those observations can safely be deleted: they should have no special biological meaning, and deleting them should not change the p-values of other genes much.

 

 


anaQ replied, 8 months ago:

Can you explain in more detail what I should do?
Answer: Dealing with genes that have Padj=NA
raya.fai wrote, 3.6 years ago:

Hi Michael,

Thank you for your answer.

Raya
