Using DESeq2 with GlmGamPoi
1
0
Entering edit mode
@user-25030
Last seen 7 months ago

I'm trying to use DESeq2 with glmGamPoi as a fit type for scRNAseq data.

If I run:

deg <- DESeq(dds, fitType="glmGamPoi")

I get a warning that glmGamPoi dispersion estimator should be used in combination with a LRT and not a Wald test., but it runs through.

Then If I run:

deg <- DESeq(dds, fitType="glmGamPoi", test = 'LRT', reduced = ~ orig.ident)

I get an error:

estimating size factors

estimating dispersions

gene-wise dispersion estimates

using 'glmGamPoi' as fitType. If used in published research, please cite:
    Ahlmann-Eltze, C., Huber, W. (2020) glmGamPoi: Fitting Gamma-Poisson
    Generalized Linear Models on Single Cell Count Data. bioRxiv.
    https://doi.org/10.1101/2020.08.13.249623

mean-dispersion relationship

final dispersion estimates

fitting model and testing

Fit reduced model

Calculate quasi likelihood ratio

Preprare results

Error in replaceOutliers(object, minReplicates = minReplicatesForReplace): first run DESeq, nbinomWaldTest, or nbinomLRT to identify outliers
Traceback:

1. DESeq(dds, fitType = "glmGamPoi", test = "LRT", reduced = ~orig.ident)
2. refitWithoutOutliers(object, test = test, betaPrior = betaPrior, 
 .     full = full, reduced = reduced, quiet = quiet, minReplicatesForReplace = minReplicatesForReplace, 
 .     modelMatrix = modelMatrix, modelMatrixType = modelMatrixType)
3. replaceOutliers(object, minReplicates = minReplicatesForReplace)
4. stop("first run DESeq, nbinomWaldTest, or nbinomLRT to identify outliers")

Following the instruction I run:

dds <- estimateSizeFactors(dds)
dds <- estimateDispersions(dds, fitType = "glmGamPoi")
dds <- nbinomLRT(dds, reduced = ~ orig.ident)
dds <- replaceOutliers(dds)

It all goes through, but when I try to run DESeq again:

deg <- DESeq(dds, fitType="glmGamPoi", test = 'LRT', reduced = ~ orig.ident)

I get a different error:

using pre-existing size factors

estimating dispersions

found already estimated dispersions, replacing these

gene-wise dispersion estimates

using 'glmGamPoi' as fitType. If used in published research, please cite:
    Ahlmann-Eltze, C., Huber, W. (2020) glmGamPoi: Fitting Gamma-Poisson
    Generalized Linear Models on Single Cell Count Data. bioRxiv.
    https://doi.org/10.1101/2020.08.13.249623

mean-dispersion relationship

final dispersion estimates

fitting model and testing

Fit reduced model

Calculate quasi likelihood ratio

Preprare results

-- replacing outliers and refitting for 2 genes
-- DESeq argument 'minReplicatesForReplace' = 7 
-- original counts are preserved in counts(dds)

estimating dispersions

fitting model and testing

Error: subscript contains NAs
Traceback:

1. DESeq(dds, fitType = "glmGamPoi", test = "LRT", reduced = ~orig.ident)
2. refitWithoutOutliers(object, test = test, betaPrior = betaPrior, 
 .     full = full, reduced = reduced, quiet = quiet, minReplicatesForReplace = minReplicatesForReplace, 
 .     modelMatrix = modelMatrix, modelMatrixType = modelMatrixType)
3. `[<-`(`*tmp*`, refitReplace, idx, value = new("DFrame", rownames = c("HBA2", 
 . "HBB"), nrows = 2L, listData = list(baseMean = c(0.761643866520817, 
 . 1.47541029264753), baseVar = c(4.4774987603094, 7.79527342821179
 . ), allZero = c(FALSE, FALSE), dispGeneEst = c(0.733618591623128, 
 . 0.332659026383691), dispGeneIter = c(15, 17), dispFit = c(0.148263041023227, 
 . 0.146465471245793), dispersion = c(0.733618591623128, 0.332659026383691
 . ), dispIter = c(14L, 14L), dispOutlier = c(TRUE, TRUE), dispMAP = c(0.701974657518914, 
 . 0.325841413372872), Intercept = c(-2.43305333148164, -2.27831949214498
 . ), cell_type.short_MEM_vs_B = c(-0.382430418117617, -0.526982820621525
 . ), cell_type.short_MONO_vs_B = c(0.527268465633085, 0.355888820362956
 . ), cell_type.short_NK_vs_B = c(-0.318913293254477, -0.370547582526146
 . ), cell_type.short_T_vs_B = c(0.0316404680549185, -0.23978329864071
 . ), orig.ident_SRA749327_SRS3693911_vs_SRA749327_SRS3693910 = c(3.29914487107158, 
 . 4.15497858516832), orig.ident_SRA749327_SRS3693912_vs_SRA749327_SRS3693910 = c(0.360272275995769, 
 . 0.901159079661736), orig.ident_SRA749327_SRS3693913_vs_SRA749327_SRS3693910 = c(2.74462008252994, 
 . 3.64340565684488), SE_Intercept = c(0.15940549992706, 0.126830571258761
 . ), SE_cell_type.short_MEM_vs_B = c(0.162153080575308, 0.121916400144907
 . ), SE_cell_type.short_MONO_vs_B = c(0.234279980801597, 0.203037672234189
 . ), SE_cell_type.short_NK_vs_B = c(0.12146183320674, 0.085550421230657
 . ), SE_cell_type.short_T_vs_B = c(0.130881978367645, 0.09306215532191
 . ), SE_orig.ident_SRA749327_SRS3693911_vs_SRA749327_SRS3693910 = c(0.127329079872187, 
 . 0.103591219573552), SE_orig.ident_SRA749327_SRS3693912_vs_SRA749327_SRS3693910 = c(0.220296188544785, 
 . 0.186780678209572), SE_orig.ident_SRA749327_SRS3693913_vs_SRA749327_SRS3693910 = c(0.142766817035964, 
 . 0.116320005758324), LRTStatistic = c(19.987838854835, 28.4351341414704
 . ), LRTPvalue = c(0.000502167370342464, 1.01796840681639e-05), 
 .     fullBetaConv = c(TRUE, TRUE), reducedBetaConv = c(TRUE, TRUE
 .     ), betaIter = c(6, 5), deviance = c(5242.07417197948, 6683.64669011397
 .     ), maxCooks = c(HBA2 = 0.816117734415004, HBB = 0.450463054748042
 .     )), elementType = "ANY", elementMetadata = new("DFrame", 
 .     rownames = NULL, nrows = 33L, listData = list(type = c("intermediate", 
 .     "intermediate", "intermediate", "intermediate", "intermediate", 
 .     "intermediate", "intermediate", "intermediate", "intermediate", 
 .     "intermediate", "results", "results", "results", "results", 
 .     "results", "results", "results", "results", "results", "results", 
 .     "results", "results", "results", "results", "results", "results", 
 .     "results", "results", "results", "results", "results", "results", 
 .     "results"), description = c("mean of normalized counts for all samples", 
 .     "variance of normalized counts for all samples", "all counts for a gene are zero", 
 .     "gene-wise estimates of dispersion", "number of iterations for gene-wise", 
 .     "fitted values of dispersion", "final estimate of dispersion", 
 .     "number of iterations", "dispersion flagged as outlier", 
 .     "maximum a posteriori estimate", "log2 fold change (MLE): Intercept", 
 .     "log2 fold change (MLE): cell type.short MEM vs B", "log2 fold change (MLE): cell type.short MONO vs B", 
 .     "log2 fold change (MLE): cell type.short NK vs B", "log2 fold change (MLE): cell type.short T vs B", 
 .     "log2 fold change (MLE): orig.ident SRA749327 SRS3693911 vs SRA749327 SRS3693910", 
 .     "log2 fold change (MLE): orig.ident SRA749327 SRS3693912 vs SRA749327 SRS3693910", 
...

I'm cropping error output so it fits 15000 characters limit.

Has anyone encountered similar issues. Is the problem with incompatibility of package versions?

sessionInfo( )

R version 4.0.3 (2020-10-10)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: Ubuntu 20.04.2 LTS

Matrix products: default
BLAS/LAPACK: /home/tatyana/anaconda3/envs/r-env-bioinfo/lib/libopenblasp-r0.3.10.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=de_AT.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=de_AT.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=de_AT.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=de_AT.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
 [1] ggpubr_0.4.0                ggrepel_0.9.1              
 [3] ggtext_0.1.1                philentropy_0.4.0          
 [5] scales_1.1.1                RColorBrewer_1.1-2         
 [7] dendsort_0.3.3              org.Rn.eg.db_3.12.0        
 [9] AnnotationDbi_1.52.0        homologene_1.4.68.19.3.27  
[11] Rncc_0.0.2.0                fitdistrplus_1.1-3         
[13] survival_3.2-7              rhdf5_2.34.0               
[15] scran_1.18.5                DESeq2_1.30.1              
[17] SingleCellExperiment_1.12.0 SummarizedExperiment_1.20.0
[19] Biobase_2.50.0              GenomicRanges_1.42.0       
[21] GenomeInfoDb_1.26.2         IRanges_2.24.1             
[23] S4Vectors_0.28.1            BiocGenerics_0.36.0        
[25] MatrixGenerics_1.2.1        matrixStats_0.58.0         
[27] glmGamPoi_1.2.0             HGNChelper_0.8.1           
[29] biomaRt_2.46.3              clusterProfiler_3.18.1     
[31] fpc_2.2-9                   wesanderson_0.3.6          
[33] MASS_7.3-53.1               dbscan_1.1-6               
[35] Matrix_1.3-2                gtools_3.8.2               
[37] pheatmap_1.0.12             forcats_0.5.1              
[39] stringr_1.4.0               dplyr_1.0.5                
[41] purrr_0.3.4                 readr_1.4.0                
[43] tidyr_1.1.3                 tibble_3.1.0               
[45] ggplot2_3.3.3               tidyverse_1.3.0            
[47] data.table_1.14.0           SeuratObject_4.0.0         
[49] Seurat_4.0.0               

loaded via a namespace (and not attached):
  [1] rappdirs_0.3.3            pbdZMQ_0.3-5             
  [3] scattermore_0.7           prabclus_2.3-2           
  [5] bit64_4.0.5               irlba_2.3.3              
  [7] DelayedArray_0.16.2       rpart_4.1-15             
  [9] RCurl_1.98-1.2            generics_0.1.0           
 [11] cowplot_1.1.1             RSQLite_2.2.3            
 [13] shadowtext_0.0.7          RANN_2.6.1               
 [15] future_1.21.0             bit_4.0.4                
 [17] enrichplot_1.10.2         spatstat.data_2.0-0      
 [19] xml2_1.3.2                lubridate_1.7.10         
 [21] httpuv_1.5.5              assertthat_0.2.1         
 [23] viridis_0.5.1             hms_1.0.0                
 [25] evaluate_0.14             promises_1.2.0.1         
 [27] DEoptimR_1.0-8            fansi_0.4.2              
 [29] progress_1.2.2            dbplyr_2.1.0             
 [31] readxl_1.3.1              igraph_1.2.6             
 [33] DBI_1.1.1                 geneplotter_1.68.0       
 [35] htmlwidgets_1.5.3         ellipsis_0.3.1           
 [37] backports_1.2.1           annotate_1.68.0          
 [39] sparseMatrixStats_1.2.1   deldir_0.2-10            
 [41] vctrs_0.3.6               ROCR_1.0-11              
 [43] abind_1.4-5               cachem_1.0.4             
 [45] withr_2.4.1               ggforce_0.3.3            
 [47] robustbase_0.93-7         sctransform_0.3.2        
 [49] prettyunits_1.1.1         mclust_5.4.7             
 [51] goftest_1.2-2             cluster_2.1.1            
 [53] DOSE_3.16.0               IRdisplay_1.0            
 [55] lazyeval_0.2.2            crayon_1.4.1             
 [57] genefilter_1.72.1         labeling_0.4.2           
 [59] edgeR_3.32.1              pkgconfig_2.0.3          
 [61] tweenr_1.0.1              nlme_3.1-152             
 [63] nnet_7.3-15               rlang_0.4.10             
 [65] globals_0.14.0            diptest_0.75-7           
 [67] lifecycle_1.0.0           miniUI_0.1.1.1           
 [69] downloader_0.4            BiocFileCache_1.14.0     
 [71] rsvd_1.0.3                modelr_0.1.8             
 [73] cellranger_1.1.0          polyclip_1.10-0          
 [75] lmtest_0.9-38             IRkernel_1.1.1           
 [77] carData_3.0-4             Rhdf5lib_1.12.1          
 [79] zoo_1.8-9                 reprex_1.0.0             
 [81] base64enc_0.1-3           ggridges_0.5.3           
 [83] png_0.1-7                 viridisLite_0.3.0        
 [85] bitops_1.0-6              rhdf5filters_1.2.0       
 [87] KernSmooth_2.23-18        DelayedMatrixStats_1.12.3
 [89] blob_1.2.1                qvalue_2.22.0            
 [91] parallelly_1.23.0         rstatix_0.7.0            
 [93] ggsignif_0.6.1            beachmat_2.6.4           
 [95] memoise_2.0.0             magrittr_2.0.1           
 [97] plyr_1.8.6                ica_1.0-2                
 [99] zlibbioc_1.36.0           compiler_4.0.3           
[101] scatterpie_0.1.5          dqrng_0.2.1              
[103] cli_2.3.1                 XVector_0.30.0           
[105] listenv_0.8.0             patchwork_1.1.1          
[107] pbapply_1.4-3             ps_1.6.0                 
[109] mgcv_1.8-34               tidyselect_1.1.0         
[111] stringi_1.5.3             GOSemSim_2.16.1          
[113] BiocSingular_1.6.0        askpass_1.1              
[115] locfit_1.5-9.4            grid_4.0.3               
[117] fastmatch_1.1-0           tools_4.0.3              
[119] rio_0.5.26                future.apply_1.7.0       
[121] rstudioapi_0.13           uuid_0.1-4               
[123] foreign_0.8-81            bluster_1.0.0            
[125] gridExtra_2.3             farver_2.1.0             
[127] Rtsne_0.15                ggraph_2.0.5             
[129] digest_0.6.27             rvcheck_0.1.8            
[131] BiocManager_1.30.10       shiny_1.6.0              
[133] gridtext_0.1.4            Rcpp_1.0.6               
[135] car_3.0-10                broom_0.7.5              
[137] scuttle_1.0.4             later_1.1.0.1            
[139] RcppAnnoy_0.0.18          httr_1.4.2               
[141] kernlab_0.9-29            colorspace_2.0-0         
[143] rvest_1.0.0               XML_3.99-0.5             
[145] fs_1.5.0                  tensor_1.5               
[147] reticulate_1.18           splines_4.0.3            
[149] statmod_1.4.35            uwot_0.1.10              
[151] spatstat.utils_2.0-0      graphlayouts_0.7.1       
[153] flexmix_2.3-17            plotly_4.9.3             
[155] xtable_1.8-4              jsonlite_1.7.2           
[157] spatstat_1.64-1           tidygraph_1.2.0          
[159] modeltools_0.2-23         R6_2.5.0                 
[161] pillar_1.5.1              htmltools_0.5.1.1        
[163] mime_0.10                 glue_1.4.2               
[165] fastmap_1.1.0             BiocParallel_1.24.1      
[167] BiocNeighbors_1.8.2       class_7.3-18             
[169] codetools_0.2-18          fgsea_1.16.0             
[171] utf8_1.1.4                lattice_0.20-41          
[173] curl_4.3                  leiden_0.3.7             
[175] zip_2.1.1                 openxlsx_4.2.3           
[177] GO.db_3.12.1              openssl_1.4.3            
[179] limma_3.46.0              repr_1.1.3               
[181] munsell_0.5.0             DO.db_2.9                
[183] GenomeInfoDbData_1.2.4    haven_2.3.1              
[185] reshape2_1.4.4            gtable_0.3.0
DESeq2 glmGamPoi scRNAseq • 369 views
ADD COMMENT
0
Entering edit mode

Just wanted to add that I am able to run glmGamPoi on its own without errors on the same dataset.

ADD REPLY
3
Entering edit mode
@mikelove
Last seen 1 hour ago
United States

I'd recommend to use minReplicates=Inf when running on single cell data.

I'll take a look -- I believe Constantin did address the outlier replacement code when integrating the new fitType -- but it looks like there is a missing piece. I may end up just ensuring that fitType="glmGamPoi" turns off outlier replacement.

ADD COMMENT
2
Entering edit mode

I believe we just want to turn off outlier replacement altogether when we use glmGamPoi so I've made that change in devel:

https://github.com/mikelove/DESeq2/commit/5a32b40b563e7db35fc3dafed705269695ae2dca

ADD REPLY
0
Entering edit mode

That works! Thanks a lot!

ADD REPLY

Login before adding your answer.

Traffic: 500 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6