Question

edgeR no residual degrees of freedom for a specific row of numbers in dispCoxReidInterpolateTagwise

0

Entering edit mode

Koen Van den Berge ▴ 180

@koen-van-den-berge-6369

Last seen 5 months ago

Ghent University, Belgium

I got interested in edgeR's capability of dealing with low counts and came across an error which turned out to be non-reproducible if you follow the function's body code line by line. I made a reproducible example for the non-reproducible error :-).

`counts = matrix(c(3,0,0,0,0,0,

           0,0,0,0,4,0,
           0,3,0,0,0,0,
           0,0,0,3,0,0,
           0,3,0,1,0,0,
           0,0,0,0,3,0,
           0,0,4,0,0,0,
           0,3,0,0,0,0,
           3,0,0,0,0,0,
           3,0,0,0,0,0,
           0,0,0,8,9,0),byrow=TRUE,nrow=11,ncol=6)

dispersion= 1.077855e-07
design = model.matrix(~factor(rep(c("A","B"),each=3))`

Now if you run edgeR's base function for estimating tag wise dispersions

dispCoxReidInterpolateTagwise(y=counts, dispersion=1.077855e-07, design=model.matrix(~factor(rep(c("A","B"),each=3))))

Error in dispCoxReidInterpolateTagwise(y = counts, dispersion = 1.077855e-07,  : 

  no residual degrees of freedom

You get the error message saying it has no residual degrees of freedom. Curiously, this error is not printed when running over the function's body code using the same arguments.
Even more curious: the function works when you remove a seemingly problematic line, which is the last line in this case. It is similar when you have a matrix of a 100 genes: this one line seems to produce the error.

dispCoxReidInterpolateTagwise(y=counts[-11,], dispersion=1.077855e-07, design=model.matrix(~factor(rep(c("A","B"),each=3))))

Which returns the tag wise dispersions. I realise the dispersion is very low, but it is what is estimated from the larger matrix of low counts and I think this should not pose an error. I have been looking at the dispCoxReidInterpolateTagwise function because the error was called when I ran estimateTagwiseDisp on the matrix.

<font face="monospace">How to fix this error?</font>

<font face="monospace">I have edgeR version 3.12.0 installed.</font>

edger rna-seq • 1.6k views

ADD COMMENT • link updated 8.9 years ago by Aaron Lun ★ 28k • written 8.9 years ago by Koen Van den Berge ▴ 180

score 3 · Accepted Answer · 2016-01-11

3

Entering edit mode

Aaron Lun ★ 28k

@alun

Last seen 3 minutes ago

The city by the bay

This is a bug that stems from failing to put drop=FALSE during an internal subsetting step prior to Recall. I'll ask Andy to have a look at this. In any case, we're slowly migrating to estimateDisp and we're suggesting that others do the same. This doesn't use dispCoxReidInterpolateTagwise so it shouldn't be affected.

ADD COMMENT • link 8.9 years ago Aaron Lun ★ 28k

2

Entering edit mode

Yes, it is a bug in the internal subsetting step. It's been fixed and it will be updated online soon.

Note that the dispCoxReidInterpolateTagwise function uses min.row.sum=5 to filter out lowly expressed genes by default. Only genes (or rows) with total sum of counts above 5, which is the last row only in the example, are used for dispersion estimation. This bug affects the data of which only one gene is above the threshold. When the seemingly problematic line (the last line) is removed, it simply returns the input values of dispersion, i.e., it does nothing (or it works).

ADD REPLY • link 8.9 years ago Yunshun Chen ▴ 890

0

Entering edit mode

Thank you Yunshun and Aaron, this cleared things up. Indeed, it works when using the estimateDisp rather than the estimateGLMTagwisdeDisp function, this was a great help.

ADD REPLY • link 8.9 years ago Koen Van den Berge ▴ 180

0

Entering edit mode

Thank you for your reply. I realise the usefulness of the estimateDisp function though I need the estimateGLMTagwise function to work as I want the unshrunken tagwisde dispersion (i.e. in my estimateGLMTagwise function I set the prior.df to zero).

ADD REPLY • link 8.9 years ago Koen Van den Berge ▴ 180

0

Entering edit mode

You can do the same for estimateDisp, if so inclined. But you should be very careful about how you use the resulting estimate. The dispersion will not be stably estimated without shrinkage if you only have ~6 libraries, so the resulting inferences from any fitted model are likely to be quite unstable.

ADD REPLY • link 8.9 years ago Aaron Lun ★ 28k