Asymptotic dispersion, DESeq2
@stevennvolant-9599

Hi,

Does anyone know the asymptotic behavior (i.e. with a large number of samples) of DESeq2's dispersion estimate?

Thank you

@mikelove

DESeq2's dispersion estimator is the posterior mode. As the sample size grows to infinity, it converges to the unbiased 'maximum of the Cox-Reid adjusted likelihood' estimator (see the Methods section of the DESeq2 paper, which references the edgeR paper on this adjustment).
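
To sketch why the prior's influence vanishes, here is the objective as I recall it from the DESeq2 paper's Methods. The symbols are my shorthand and should be checked against the paper: m is the number of samples, X the design matrix, W the GLM weight matrix, alpha_tr the fitted trend value for the gene, and sigma_d^2 the prior variance on the log scale.

```latex
\alpha_i^{\mathrm{MAP}}
  = \arg\max_{\alpha}\,
    \bigl[\, \ell_{\mathrm{CR}}(\alpha) + \Lambda_i(\alpha) \,\bigr],
\qquad
\ell_{\mathrm{CR}}(\alpha)
  = \sum_{j=1}^{m} \log f_{\mathrm{NB}}(y_{ij};\, \mu_{ij}, \alpha)
    - \tfrac{1}{2} \log\det\!\bigl(X^{\top} W X\bigr),
\qquad
\Lambda_i(\alpha)
  = -\frac{\bigl(\log\alpha - \log\alpha_{\mathrm{tr}}(\bar{\mu}_i)\bigr)^{2}}
          {2\sigma_d^{2}}
```

The log-likelihood is a sum over the m samples, so it grows roughly linearly in m, while the log-prior term does not depend on m. For large m the likelihood dominates and the maximizer approaches the Cox-Reid adjusted estimate.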

Keep in mind that, like the sample variance, the MLE for the dispersion takes longer to converge to the true value than estimators for the mean do. This is why sharing information across genes (using the prior distribution for genes with a similar mean value) is such a good idea and improves inference.
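
The slower convergence can be seen in a small base-R simulation. This is only a rough sketch and does not use DESeq2: it swaps in a simple method-of-moments dispersion estimate (from Var = mu + disp * mu^2) rather than the MLE, and the variable names (n.genes, res, and so on) are made up for the illustration.

```r
# Illustration only: method-of-moments dispersion, not DESeq2's estimator.
set.seed(1)
true.mu <- 200
true.disp <- 0.1
n.genes <- 2000  # independent simulated genes per sample size (assumed value)

res <- data.frame(m = c(5, 20, 100),
                  rel.sd.mean = NA_real_,
                  rel.sd.disp = NA_real_)

for (k in seq_len(nrow(res))) {
  m <- res$m[k]
  # one row per gene, one column per sample
  x <- matrix(rnbinom(n.genes * m, mu = true.mu, size = 1 / true.disp),
              nrow = n.genes)
  mu.hat <- rowMeans(x)
  var.hat <- apply(x, 1, var)
  # solve Var = mu + disp * mu^2 for disp
  disp.hat <- (var.hat - mu.hat) / mu.hat^2
  # spread of each estimator relative to the true parameter value
  res$rel.sd.mean[k] <- sd(mu.hat) / true.mu
  res$rel.sd.disp[k] <- sd(disp.hat) / true.disp
}
print(res)
```

At each sample size the relative spread of the dispersion estimate should come out noticeably larger than that of the mean, which is the gap that shrinkage toward the prior helps to close when m is small.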

Here's a toy example and a plot showing the posterior mode converging to the true value (orange) although it starts around the center of the prior (purple).

 

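library(DESeq2)
# sample sizes to try: 3..12, 20..100 by 10, 125..300 by 25
samp.size <- c(3:12,
               2:10 * 10,
               5:12 * 25)
disps <- numeric(length(samp.size))
prior.mean <- .2  # dispersion of the other genes, so the prior centers here
true.disp <- .1   # true dispersion of the gene of interest (gene 1)
for (i in seq_along(samp.size)) {
  cat(i)  # progress indicator
  # simulate 100 genes that all have dispersion equal to prior.mean
  dds <- makeExampleDESeqDataSet(n=100, m=samp.size[i],
                                 dispMeanRel=function(x) prior.mean)
  # overwrite gene 1 with counts drawn using the true dispersion
  cnts <- rnbinom(ncol(dds), mu=200, size=1/true.disp)
  mode(cnts) <- "integer"
  counts(dds)[1,] <- cnts
  sizeFactors(dds) <- rep(1, ncol(dds))
  dds <- estimateDispersions(dds, quiet=TRUE, fitType="mean")
  # final (posterior mode) dispersion estimate for gene 1
  disps[i] <- dispersions(dds)[1]
}
plot(samp.size, disps, log="y")
abline(h=prior.mean, col="purple")  # center of the prior
abline(h=true.disp, col="orange")   # true value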

 
