Question

edgeR: Error in if (any(input.mean < 0)) stop("input.mean must be non-negative") :

0

Entering edit mode

bastian.hornung • 0

@bastianhornung-8471

Last seen 8.8 years ago

Netherlands

Hi everyone,

I have a problem with edgeR right now, and I cannot figure out what the cause is.

I use these commands:

x <- read.delim("input.csv",row.names="Contigs")
group <- c(1,1,2,2,3,3,1,1,4,4,5,5,1,1,6,6,7,7,8,8,9,9,10,10,11,11,12,13,13,14,14,15,15,16,16)
library("edgeR")
d <- DGEList(counts=x, group=group)
d$samples$lib.size <- colSums(d$counts)
d <- calcNormFactors(d)
d <- estimateCommonDisp(d, verbose=TRUE)

And then get the following error.

Error in if (any(input.mean < 0)) stop("input.mean must be non-negative") :
  missing value where TRUE/FALSE needed
Calls: estimateCommonDisp -> equalizeLibSizes -> q2qnbinom

The file does not contain any negative values (but quite a few 0), so I don't understand where this error is coming from.

Any help would be appreciated :).

(I'd like to attach the input file, but right now can't find a way to do that, help o_O)

EDIT: The input is here: http://www.mediafire.com/view/1pv8ocwlll2cth6/input.test.csv

edger • 2.5k views

ADD COMMENT • link updated 8.8 years ago by Gordon Smyth 50k • written 8.8 years ago by bastian.hornung • 0

score 0 · Answer 1 · 2015-07-24

That's a pretty crazy data set. Only 8 samples have counts for more than 50% of the contigs (27 have fewer than 50% of the contigs counted, and 19 have counts for fewer than 10% of the contigs). Anyway, when you run calcNormFactors(), you are converting all of your normalization factors to NaN, which will obviously keep any downstream analysis from functioning properly:

> dglst <- DGEList(dat[,-1])
> dglst$samples
    group lib.size norm.factors
X1      1     1357            1
X2      1     1554            1
X3      1       33            1
X4      1       84            1
X5      1       98            1
X6      1       41            1
X7      1     1597            1
X8      1      693            1
X9      1       30            1
X10     1       69            1
X11     1      116            1
X12     1      153            1
X13     1      740            1
X14     1      849            1
X15     1      123            1
X16     1      105            1
X17     1      119            1
X18     1      121            1
X19     1      143            1
X20     1      343            1
X21     1      131            1
X22     1      113            1
X23     1       76            1
X24     1       59            1
X25     1      132            1
X26     1      163            1
X27     1       94            1
X28     1        5            1
X29     1       13            1
X30     1       17            1
X31     1       13            1
X32     1       73            1
X33     1       93            1
X34     1       27            1
X35     1       35            1
> dglst <- calcNormFactors(dglst)
> dglst$samples
    group lib.size norm.factors
X1      1     1357          NaN
X2      1     1554          NaN
X3      1       33          NaN
X4      1       84          NaN
X5      1       98          NaN
X6      1       41          NaN
X7      1     1597          NaN
X8      1      693          NaN
X9      1       30          NaN
X10     1       69          NaN
X11     1      116          NaN
X12     1      153          NaN
X13     1      740          NaN
X14     1      849          NaN
X15     1      123          NaN
X16     1      105          NaN
X17     1      119          NaN
X18     1      121          NaN
X19     1      143          NaN
X20     1      343          NaN
X21     1      131          NaN
X22     1      113          NaN
X23     1       76          NaN
X24     1       59          NaN
X25     1      132          NaN
X26     1      163          NaN
X27     1       94          NaN
X28     1        5          NaN
X29     1       13          NaN
X30     1       17          NaN
X31     1       13          NaN
X32     1       73          NaN
X33     1       93          NaN
X34     1       27          NaN
X35     1       35          NaN

I don't have any suggestions for you - I am not sure you can do anything reasonable with such sparse data. Perhaps either Gordon Smyth or Aaron Lun will have some ideas that you can try.

score 0 · Answer 2 · 2015-07-31

Yes, it's a crazy dataset, but you can still analyse it in edgeR.

Note you can get results out of calcNormFactors() specifying a refColumn or by filtering out low count Contigs before you run calcNormFactors().

But you don't need to run calcNormFactors() to use edgeR, and here it is better to omit because there aren't enough positive counts to make scaling normalization meaningful.

An MDS plot shows clear differences between the groups:

> counts <- read.delim("input.test.csv",row.names="Contigs")
> keep <- rowSums(counts)>4
> dge <- DGEList(counts=counts[keep,])
> group <- c(1,1,2,2,3,3,1,1,4,4,5,5,1,1,6,6,7,7,8,8,9,9,10,10,11,11,12,13,13,14,14,15,15,16,16)
> plotMDS(dge, labels=group)

The safest DE method would be QL in edgeR. For example, to find contigs DE in group 8 vs group 1:

> group <- factor(group)
> design <- model.matrix(~group)
> fit <- glmQLFit(dge,design,robust=TRUE)
> plotQLDisp(fit)
> results <- glmQLFTest(fit,coef="group8")
> topTags(results)