Question

edgeR estimateGLMRobustDisp fails with and error

0

Entering edit mode

Robert Ivanek ▴ 730

@robert-ivanek-5892

Last seen 4 months ago

Switzerland

Hi,

I wanted to estimate the dispersions for RNAseq dataset using the robust version in edgeR package. After running following line of code I got an error in recent version of edgeR/Bioconductor despite the fact that identical code work on the same dataset in previous version of edgeR/Bioconductor:

dge <- estimateGLMRobustDisp(dge, design=moma)

Error in .compressOffsets(y, lib.size = lib.size, offset = offset) : offsets should be finite values

Here is the traceback:

12: stop(err)
11: .compressOffsets(y, lib.size = lib.size, offset = offset)
10: aveLogCPM.default(y, weights = weights)
9: aveLogCPM(y, weights = weights)
8: estimateGLMCommonDisp.default(y[bin, ], design, method = method.bin,
       offset[bin, ], min.row.sum = 0, weights = weights[bin, ],
       ...)
7: estimateGLMCommonDisp(y[bin, ], design, method = method.bin,
       offset[bin, ], min.row.sum = 0, weights = weights[bin, ],
       ...)
6: dispBinTrend(y, design, offset = offset, method.trend = "loess",
       AveLogCPM = AveLogCPM, weights = weights, ...)
5: estimateGLMTrendedDisp.default(y = y$counts, design = design,
       offset = getOffset(y), AveLogCPM = y$AveLogCPM, method = method,
       weights = y$weights, ...)
4: estimateGLMTrendedDisp(y = y$counts, design = design, offset = getOffset(y),
       AveLogCPM = y$AveLogCPM, method = method, weights = y$weights,
       ...)
3: estimateGLMTrendedDisp.DGEList(y, design = design, method = trend.method)
2: estimateGLMTrendedDisp(y, design = design, method = trend.method)
1: estimateGLMRobustDisp(dge, design = moma)

And here is the sessionInfo:

R version 3.3.1 (2016-06-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS release 6.5 (Final)

locale:
 [1] LC_CTYPE=en_US       LC_NUMERIC=C         LC_TIME=en_US        LC_COLLATE=C        
 [5] LC_MONETARY=en_US    LC_MESSAGES=en_US    LC_PAPER=en_US       LC_NAME=C           
 [9] LC_ADDRESS=C         LC_TELEPHONE=C       LC_MEASUREMENT=en_US LC_IDENTIFICATION=C

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] inline_0.3.14          org.Hs.eg.db_3.4.0     edgeR_3.16.2           limma_3.30.2          
 [5] GenomicFeatures_1.26.0 AnnotationDbi_1.36.0   Biobase_2.34.0         GenomicRanges_1.26.1  
 [9] GenomeInfoDb_1.10.1    IRanges_2.8.1          S4Vectors_0.12.0       BiocGenerics_0.20.0   

loaded via a namespace (and not attached):
 [1] XVector_0.14.0             zlibbioc_1.20.0            GenomicAlignments_1.10.0  
 [4] BiocParallel_1.8.1         lattice_0.20-34            tools_3.3.1               
 [7] SummarizedExperiment_1.4.0 grid_3.3.1                 DBI_0.5-1                 
[10] Matrix_1.2-7.1             rtracklayer_1.34.1         bitops_1.0-6              
[13] RCurl_1.95-4.8             biomaRt_2.30.0             RSQLite_1.0.0             
[16] compiler_3.3.1             Biostrings_2.42.0          Rsamtools_1.26.1          
[19] locfit_1.5-9.1             XML_3.98-1.4

To me it seems that the function fails to calculate average logCPM values for one part/bin of genes which have very low expression (columnSums/lib.size for this genes is zero for some samples and therefore it cannot be log transformed and correctly used for average logCPM calculation. Afterwards the check if offsets are finite fails.

Best

Robert

rnaseq edger • 1.7k views

ADD COMMENT • link updated 7.5 years ago by Aaron Lun ★ 28k • written 7.5 years ago by Robert Ivanek ▴ 730

score 0 · Answer 1 · 2016-11-10

In the latest release, we added more stringent and consistent checks for the validity of offsets, library sizes, weights, etc. throughout edgeR's R code. The idea was to avoid breaking the C++ code due to invalid inputs (like loops that run forever when trying to evaluate a missing value as false). However, as a consequence, there have been bugs popping up in some of the more neglected functions, where previously happy code no longer accepts an invalid input. We fixed most of these, but clearly not all of them - we'll try to commit a fix as soon as possible.