camera.DGEList causes Error in qr.qty(QR, t(y)) : NA/NaN/Inf in foreign function call
@cei-abreu-goodger-4433
Mexico

Dear Gordon,

I've come across a problem with camera.DGEList() that seems to be related to extreme differences in counts between conditions. A very small subset of my dataset reproduces the problem with the following code:

load(url("http://datos.langebio.cinvestav.mx/~cei/camera_problem.rdata"))
library(edgeR)
camera(y, c(1,2))
Error in qr.qty(QR, t(y)) : NA/NaN/Inf in foreign function call (arg 5)

The full dataset has many rows that trigger this error. The cause appears to be zscoreNBinom() returning -Inf values, as the following shows:

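# Design matrix with an intercept and a group effect
design <- model.matrix(~y$samples$group)
nbeta <- 2
# Test the second coefficient (the group effect)
contrast <- c(0,1)
QR <- qr(contrast)
# Null design: rotate so the contrast comes first, then drop it
design0 <- t(qr.qty(QR, t(design))[-1, , drop = FALSE])
fit.null <- glmFit(y, design0, prior.count = 0)
# z-scores of the observed counts under the null fit; row 2 is shown
zscoreNBinom(y$counts, mu = fit.null$fitted.values, size = 1/ getDispersion(y))[2,]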

AT2G34655       -Inf      -Inf 12.527082 13.996953
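
To find which rows of a full dataset are affected, one option is to flag every row whose z-scores include a non-finite value. This is a minimal base-R sketch that uses a toy matrix in place of the real zscoreNBinom() output; the second row (`geneB`) and the column names are made up for illustration.

```r
# Toy stand-in for the matrix returned by zscoreNBinom().
# The first row copies the values reported above; the second row is hypothetical.
z <- rbind(
  AT2G34655 = c(-Inf, -Inf, 12.527082, 13.996953),
  geneB     = c(-0.4, 0.3, 0.1, -0.2)
)
colnames(z) <- paste0("sample", 1:4)  # hypothetical sample names

# A row is problematic if any of its z-scores is NA, NaN, or +/-Inf.
bad_rows <- rownames(z)[rowSums(!is.finite(z)) > 0]
print(bad_rows)
```

With the real data, `z` would be the full matrix from the zscoreNBinom() call above rather than a single row.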

 

sessionInfo()
R version 3.1.0 (2014-04-10)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

locale:
[1] C/UTF-8/C/C/C/C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] edgeR_3.6.8  limma_3.20.9

Any suggestions?

Cheers,

Cei

If the data set is not very big, it would be useful for us to look at it in its entirety (anonymized, if necessary). We have a couple of ideas for fixing this problem, ranging from stop-gap fixes to long-term solutions. The temporary fixes work on the rows you've provided, but it would be good to know whether they also work on the entire data set. Otherwise, we would switch directly to a more robust long-term implementation.

Aaron Lun ★ 28k
@alun
The city by the bay

The problem in zscoreNBinom occurs when the differential expression (DE) is so strong that the computed probabilities suffer floating-point underflow. We've patched this in the latest version of edgeR (3.8.6), mostly by switching to log-probabilities in the intermediate calculations. We've also added an approximate deviance-based method as a safety net, in case the approximations used internally in pnbinom fail (especially in the tails of the distribution).

So, long story short, the new zscoreNBinom should survive anything your dataset can throw at it. Let us know if that's not the case.
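
The underflow described above can be reproduced with base R alone. The sketch below is not edgeR's actual implementation: `zscore_nb_naive` and `zscore_nb_log` are hypothetical helper names, and the count, size, and mean are chosen only to force the underflow. It contrasts mapping a raw tail probability to a normal quantile with doing the same on the log scale via the `log.p` arguments of `pnbinom` and `qnorm`.

```r
# Naive z-score: compute the lower-tail probability, then map it to a
# standard normal quantile. Fails once the probability underflows to 0.
zscore_nb_naive <- function(q, size, mu) {
  qnorm(pnbinom(q, size = size, mu = mu))
}

# Log-scale sketch (an assumption, not edgeR's code): keep the probability
# as a log throughout, and work in whichever tail is the small one.
zscore_nb_log <- function(q, size, mu) {
  n <- length(q)
  size <- rep_len(size, n)
  mu <- rep_len(mu, n)
  z <- numeric(n)
  lower <- q <= mu
  # Counts at or below the mean: use log P(Y <= q).
  z[lower] <- qnorm(
    pnbinom(q[lower], size = size[lower], mu = mu[lower], log.p = TRUE),
    log.p = TRUE
  )
  # Counts above the mean: use log P(Y > q) and the upper normal tail.
  z[!lower] <- qnorm(
    pnbinom(q[!lower], size = size[!lower], mu = mu[!lower],
            lower.tail = FALSE, log.p = TRUE),
    lower.tail = FALSE, log.p = TRUE
  )
  z
}

# A zero count where the fitted mean is huge: P(Y <= 0) is about exp(-921),
# which underflows to 0 in double precision.
z_naive <- zscore_nb_naive(0, size = 100, mu = 1e6)
z_log <- zscore_nb_log(0, size = 100, mu = 1e6)
print(z_naive)  # -Inf
print(z_log)    # large and negative, but finite
```

For moderate counts the two functions agree; they differ only when the raw probability can no longer be represented as a double.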
