Hello,
After constructing the DESeqDataSet, I first pre-filtered genes with zero counts in all samples, then ran DESeq and results. However, I found that the padj values of some genes differed between the pre-filtered and non-pre-filtered analyses. For example:
Gene                 non-prefiltering   prefiltering
ENSMUSG00000115946   0.031424406        NA
ENSMUSG00000056290   0.050350481        0.049762917
I am confused by this. Thanks a lot for your kind reply.
Thanks for your kind reply.
I am confused about how pre-filtering genes with zero counts in all samples gave different results from not pre-filtering. For example, the padj values of ENSMUSG00000056290 were 0.049762917 and 0.050350481 with and without pre-filtering, respectively. I only removed genes whose count was zero in all samples.
Sincerely!
I think the grid over baseMean values is slightly different if you add in rows with all zeros. Instead of starting the grid over quantiles of baseMean at the smallest baseMean, it has to start at a higher quantile to skip over all the zeros. This different grid of quantiles may produce different filtering.
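To make the point about the grid concrete, here is a small toy sketch in plain Python (not DESeq2 code). It assumes the candidate filter thresholds are 50 evenly spaced quantiles of baseMean, running from the fraction of all-zero rows up to 0.95; those numbers, the simulated baseMean values, and the helper names `quantile` and `cutoff_grid` are assumptions made for illustration only. If the candidate cutoffs differ, the chosen filter threshold can differ, so a slightly different set of genes enters the multiple-testing adjustment. That would be enough to shift padj a little, or to turn it into NA for a gene that falls below the threshold.

```python
import random


def quantile(sorted_vals, p):
    """Linear-interpolation quantile of pre-sorted data, for 0 <= p <= 1."""
    h = (len(sorted_vals) - 1) * p
    lo = int(h)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (h - lo) * (sorted_vals[hi] - sorted_vals[lo])


def cutoff_grid(base_means, n_points=50, upper=0.95):
    """Candidate filter cutoffs: quantiles of base_means on an even grid.

    The grid starts at the fraction of exact zeros (so it skips them)
    and ends at `upper`. These defaults are assumptions for this sketch.
    """
    vals = sorted(base_means)
    lower = sum(v == 0 for v in vals) / len(vals)
    if lower >= upper:
        upper = 1.0
    step = (upper - lower) / (n_points - 1)
    thetas = [lower + step * i for i in range(n_points)]
    return [quantile(vals, t) for t in thetas]


random.seed(0)
# Simulated baseMean values for genes with at least one non-zero count.
expressed = [random.lognormvariate(3, 2) for _ in range(1000)]
# Genes with zero counts in every sample have a baseMean of exactly 0.
all_zero = [0.0] * 300

grid_prefiltered = cutoff_grid(expressed)
grid_unfiltered = cutoff_grid(expressed + all_zero)

# Same expressed genes, but the candidate cutoffs are not the same.
print("lowest cutoff :", grid_prefiltered[0], "vs", grid_unfiltered[0])
print("highest cutoff:", grid_prefiltered[-1], "vs", grid_unfiltered[-1])
```

In this toy setup the grid on the pre-filtered data reaches a higher baseMean at its top end, because 0.95 is taken over the expressed genes only rather than over expressed plus all-zero rows.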
Thank you very much!
Hello,
I have one more question. Since pre-filtering and not pre-filtering genes with zero counts in all samples give different padj values, should I pre-filter or not, and which padj values should I use?
Sincerely.
I find it useful to get rid of rows with all zeros:
https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#pre-filtering