I am interested in finding genes with large dispersion values within a single condition (all the samples are biological replicates of the same condition), and I do not want to assume that genes with similar expression levels have similar dispersion values. This is why I would like to get the dispersion values before they are fitted/shrunk towards the curve.
I see there is a column named dispGeneEst in mcols(dds), and I have several questions:
- Are the values in the dispGeneEst column the dispersion values before fitting?
- What does it mean if a gene has the maximum dispersion value of 10 or the minimum value of 1.00E-08?
- Is it correct to use the dispGeneEst values in my case?
- Is a gene's dispersion value reliable when it is based on three biological replicates, or do I need more replicates?
- If I want to run DESeq2 without comparing two conditions, just to get the normalized counts and the dispersion values, is it enough to specify design = ~ 1 in the DESeqDataSetFromMatrix function?
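
To make the last question concrete, this is roughly what I have in mind. It is only a minimal sketch on simulated counts: the matrix cts and the table coldata are made up for illustration, and I am assuming that calling estimateDispersionsGeneEst directly (instead of the full estimateDispersions or DESeq) is an acceptable way to get only the gene-wise estimates.

```r
library(DESeq2)

set.seed(1)

# Simulated counts for illustration only: 1000 genes x 3 replicates
cts <- matrix(
  rnbinom(3000, mu = 100, size = 5),
  ncol = 3,
  dimnames = list(paste0("gene", 1:1000), paste0("rep", 1:3))
)

# All samples belong to the same condition
coldata <- data.frame(
  condition = factor(rep("A", 3)),
  row.names = colnames(cts)
)

# Intercept-only design: no comparison between conditions
dds <- DESeqDataSetFromMatrix(countData = cts, colData = coldata, design = ~ 1)

# Size factors are needed for both the normalized counts and the dispersions
dds <- estimateSizeFactors(dds)

# Gene-wise dispersion estimates only (no curve fitting, no shrinkage)
dds <- estimateDispersionsGeneEst(dds)

norm_counts <- counts(dds, normalized = TRUE)
gene_disp   <- mcols(dds)$dispGeneEst

head(norm_counts)
summary(gene_disp)
```

With my real data I would replace cts and coldata with my own count matrix and sample table.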
Thank you very much.
All the best,