I'm trying to use DESeq2 with glmGamPoi as a fit type for scRNAseq data.
If I run:
deg <- DESeq(dds, fitType="glmGamPoi")
I get a warning that glmGamPoi dispersion estimator should be used in combination with a LRT and not a Wald test.
, but it runs through.
Then If I run:
deg <- DESeq(dds, fitType="glmGamPoi", test = 'LRT', reduced = ~ orig.ident)
I get an error:
estimating size factors
estimating dispersions
gene-wise dispersion estimates
using 'glmGamPoi' as fitType. If used in published research, please cite:
Ahlmann-Eltze, C., Huber, W. (2020) glmGamPoi: Fitting Gamma-Poisson
Generalized Linear Models on Single Cell Count Data. bioRxiv.
https://doi.org/10.1101/2020.08.13.249623
mean-dispersion relationship
final dispersion estimates
fitting model and testing
Fit reduced model
Calculate quasi likelihood ratio
Preprare results
Error in replaceOutliers(object, minReplicates = minReplicatesForReplace): first run DESeq, nbinomWaldTest, or nbinomLRT to identify outliers
Traceback:
1. DESeq(dds, fitType = "glmGamPoi", test = "LRT", reduced = ~orig.ident)
2. refitWithoutOutliers(object, test = test, betaPrior = betaPrior,
. full = full, reduced = reduced, quiet = quiet, minReplicatesForReplace = minReplicatesForReplace,
. modelMatrix = modelMatrix, modelMatrixType = modelMatrixType)
3. replaceOutliers(object, minReplicates = minReplicatesForReplace)
4. stop("first run DESeq, nbinomWaldTest, or nbinomLRT to identify outliers")
Following the instruction I run:
dds <- estimateSizeFactors(dds)
dds <- estimateDispersions(dds, fitType = "glmGamPoi")
dds <- nbinomLRT(dds, reduced = ~ orig.ident)
dds <- replaceOutliers(dds)
It all goes through, but when I try to run DESeq again:
deg <- DESeq(dds, fitType="glmGamPoi", test = 'LRT', reduced = ~ orig.ident)
I get a different error:
using pre-existing size factors
estimating dispersions
found already estimated dispersions, replacing these
gene-wise dispersion estimates
using 'glmGamPoi' as fitType. If used in published research, please cite:
Ahlmann-Eltze, C., Huber, W. (2020) glmGamPoi: Fitting Gamma-Poisson
Generalized Linear Models on Single Cell Count Data. bioRxiv.
https://doi.org/10.1101/2020.08.13.249623
mean-dispersion relationship
final dispersion estimates
fitting model and testing
Fit reduced model
Calculate quasi likelihood ratio
Preprare results
-- replacing outliers and refitting for 2 genes
-- DESeq argument 'minReplicatesForReplace' = 7
-- original counts are preserved in counts(dds)
estimating dispersions
fitting model and testing
Error: subscript contains NAs
Traceback:
1. DESeq(dds, fitType = "glmGamPoi", test = "LRT", reduced = ~orig.ident)
2. refitWithoutOutliers(object, test = test, betaPrior = betaPrior,
. full = full, reduced = reduced, quiet = quiet, minReplicatesForReplace = minReplicatesForReplace,
. modelMatrix = modelMatrix, modelMatrixType = modelMatrixType)
3. `[<-`(`*tmp*`, refitReplace, idx, value = new("DFrame", rownames = c("HBA2",
. "HBB"), nrows = 2L, listData = list(baseMean = c(0.761643866520817,
. 1.47541029264753), baseVar = c(4.4774987603094, 7.79527342821179
. ), allZero = c(FALSE, FALSE), dispGeneEst = c(0.733618591623128,
. 0.332659026383691), dispGeneIter = c(15, 17), dispFit = c(0.148263041023227,
. 0.146465471245793), dispersion = c(0.733618591623128, 0.332659026383691
. ), dispIter = c(14L, 14L), dispOutlier = c(TRUE, TRUE), dispMAP = c(0.701974657518914,
. 0.325841413372872), Intercept = c(-2.43305333148164, -2.27831949214498
. ), cell_type.short_MEM_vs_B = c(-0.382430418117617, -0.526982820621525
. ), cell_type.short_MONO_vs_B = c(0.527268465633085, 0.355888820362956
. ), cell_type.short_NK_vs_B = c(-0.318913293254477, -0.370547582526146
. ), cell_type.short_T_vs_B = c(0.0316404680549185, -0.23978329864071
. ), orig.ident_SRA749327_SRS3693911_vs_SRA749327_SRS3693910 = c(3.29914487107158,
. 4.15497858516832), orig.ident_SRA749327_SRS3693912_vs_SRA749327_SRS3693910 = c(0.360272275995769,
. 0.901159079661736), orig.ident_SRA749327_SRS3693913_vs_SRA749327_SRS3693910 = c(2.74462008252994,
. 3.64340565684488), SE_Intercept = c(0.15940549992706, 0.126830571258761
. ), SE_cell_type.short_MEM_vs_B = c(0.162153080575308, 0.121916400144907
. ), SE_cell_type.short_MONO_vs_B = c(0.234279980801597, 0.203037672234189
. ), SE_cell_type.short_NK_vs_B = c(0.12146183320674, 0.085550421230657
. ), SE_cell_type.short_T_vs_B = c(0.130881978367645, 0.09306215532191
. ), SE_orig.ident_SRA749327_SRS3693911_vs_SRA749327_SRS3693910 = c(0.127329079872187,
. 0.103591219573552), SE_orig.ident_SRA749327_SRS3693912_vs_SRA749327_SRS3693910 = c(0.220296188544785,
. 0.186780678209572), SE_orig.ident_SRA749327_SRS3693913_vs_SRA749327_SRS3693910 = c(0.142766817035964,
. 0.116320005758324), LRTStatistic = c(19.987838854835, 28.4351341414704
. ), LRTPvalue = c(0.000502167370342464, 1.01796840681639e-05),
. fullBetaConv = c(TRUE, TRUE), reducedBetaConv = c(TRUE, TRUE
. ), betaIter = c(6, 5), deviance = c(5242.07417197948, 6683.64669011397
. ), maxCooks = c(HBA2 = 0.816117734415004, HBB = 0.450463054748042
. )), elementType = "ANY", elementMetadata = new("DFrame",
. rownames = NULL, nrows = 33L, listData = list(type = c("intermediate",
. "intermediate", "intermediate", "intermediate", "intermediate",
. "intermediate", "intermediate", "intermediate", "intermediate",
. "intermediate", "results", "results", "results", "results",
. "results", "results", "results", "results", "results", "results",
. "results", "results", "results", "results", "results", "results",
. "results", "results", "results", "results", "results", "results",
. "results"), description = c("mean of normalized counts for all samples",
. "variance of normalized counts for all samples", "all counts for a gene are zero",
. "gene-wise estimates of dispersion", "number of iterations for gene-wise",
. "fitted values of dispersion", "final estimate of dispersion",
. "number of iterations", "dispersion flagged as outlier",
. "maximum a posteriori estimate", "log2 fold change (MLE): Intercept",
. "log2 fold change (MLE): cell type.short MEM vs B", "log2 fold change (MLE): cell type.short MONO vs B",
. "log2 fold change (MLE): cell type.short NK vs B", "log2 fold change (MLE): cell type.short T vs B",
. "log2 fold change (MLE): orig.ident SRA749327 SRS3693911 vs SRA749327 SRS3693910",
. "log2 fold change (MLE): orig.ident SRA749327 SRS3693912 vs SRA749327 SRS3693910",
...
I'm cropping error output so it fits 15000 characters limit.
Has anyone encountered similar issues. Is the problem with incompatibility of package versions?
sessionInfo( )
R version 4.0.3 (2020-10-10)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: Ubuntu 20.04.2 LTS
Matrix products: default
BLAS/LAPACK: /home/tatyana/anaconda3/envs/r-env-bioinfo/lib/libopenblasp-r0.3.10.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=de_AT.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=de_AT.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=de_AT.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=de_AT.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets
[8] methods base
other attached packages:
[1] ggpubr_0.4.0 ggrepel_0.9.1
[3] ggtext_0.1.1 philentropy_0.4.0
[5] scales_1.1.1 RColorBrewer_1.1-2
[7] dendsort_0.3.3 org.Rn.eg.db_3.12.0
[9] AnnotationDbi_1.52.0 homologene_1.4.68.19.3.27
[11] Rncc_0.0.2.0 fitdistrplus_1.1-3
[13] survival_3.2-7 rhdf5_2.34.0
[15] scran_1.18.5 DESeq2_1.30.1
[17] SingleCellExperiment_1.12.0 SummarizedExperiment_1.20.0
[19] Biobase_2.50.0 GenomicRanges_1.42.0
[21] GenomeInfoDb_1.26.2 IRanges_2.24.1
[23] S4Vectors_0.28.1 BiocGenerics_0.36.0
[25] MatrixGenerics_1.2.1 matrixStats_0.58.0
[27] glmGamPoi_1.2.0 HGNChelper_0.8.1
[29] biomaRt_2.46.3 clusterProfiler_3.18.1
[31] fpc_2.2-9 wesanderson_0.3.6
[33] MASS_7.3-53.1 dbscan_1.1-6
[35] Matrix_1.3-2 gtools_3.8.2
[37] pheatmap_1.0.12 forcats_0.5.1
[39] stringr_1.4.0 dplyr_1.0.5
[41] purrr_0.3.4 readr_1.4.0
[43] tidyr_1.1.3 tibble_3.1.0
[45] ggplot2_3.3.3 tidyverse_1.3.0
[47] data.table_1.14.0 SeuratObject_4.0.0
[49] Seurat_4.0.0
loaded via a namespace (and not attached):
[1] rappdirs_0.3.3 pbdZMQ_0.3-5
[3] scattermore_0.7 prabclus_2.3-2
[5] bit64_4.0.5 irlba_2.3.3
[7] DelayedArray_0.16.2 rpart_4.1-15
[9] RCurl_1.98-1.2 generics_0.1.0
[11] cowplot_1.1.1 RSQLite_2.2.3
[13] shadowtext_0.0.7 RANN_2.6.1
[15] future_1.21.0 bit_4.0.4
[17] enrichplot_1.10.2 spatstat.data_2.0-0
[19] xml2_1.3.2 lubridate_1.7.10
[21] httpuv_1.5.5 assertthat_0.2.1
[23] viridis_0.5.1 hms_1.0.0
[25] evaluate_0.14 promises_1.2.0.1
[27] DEoptimR_1.0-8 fansi_0.4.2
[29] progress_1.2.2 dbplyr_2.1.0
[31] readxl_1.3.1 igraph_1.2.6
[33] DBI_1.1.1 geneplotter_1.68.0
[35] htmlwidgets_1.5.3 ellipsis_0.3.1
[37] backports_1.2.1 annotate_1.68.0
[39] sparseMatrixStats_1.2.1 deldir_0.2-10
[41] vctrs_0.3.6 ROCR_1.0-11
[43] abind_1.4-5 cachem_1.0.4
[45] withr_2.4.1 ggforce_0.3.3
[47] robustbase_0.93-7 sctransform_0.3.2
[49] prettyunits_1.1.1 mclust_5.4.7
[51] goftest_1.2-2 cluster_2.1.1
[53] DOSE_3.16.0 IRdisplay_1.0
[55] lazyeval_0.2.2 crayon_1.4.1
[57] genefilter_1.72.1 labeling_0.4.2
[59] edgeR_3.32.1 pkgconfig_2.0.3
[61] tweenr_1.0.1 nlme_3.1-152
[63] nnet_7.3-15 rlang_0.4.10
[65] globals_0.14.0 diptest_0.75-7
[67] lifecycle_1.0.0 miniUI_0.1.1.1
[69] downloader_0.4 BiocFileCache_1.14.0
[71] rsvd_1.0.3 modelr_0.1.8
[73] cellranger_1.1.0 polyclip_1.10-0
[75] lmtest_0.9-38 IRkernel_1.1.1
[77] carData_3.0-4 Rhdf5lib_1.12.1
[79] zoo_1.8-9 reprex_1.0.0
[81] base64enc_0.1-3 ggridges_0.5.3
[83] png_0.1-7 viridisLite_0.3.0
[85] bitops_1.0-6 rhdf5filters_1.2.0
[87] KernSmooth_2.23-18 DelayedMatrixStats_1.12.3
[89] blob_1.2.1 qvalue_2.22.0
[91] parallelly_1.23.0 rstatix_0.7.0
[93] ggsignif_0.6.1 beachmat_2.6.4
[95] memoise_2.0.0 magrittr_2.0.1
[97] plyr_1.8.6 ica_1.0-2
[99] zlibbioc_1.36.0 compiler_4.0.3
[101] scatterpie_0.1.5 dqrng_0.2.1
[103] cli_2.3.1 XVector_0.30.0
[105] listenv_0.8.0 patchwork_1.1.1
[107] pbapply_1.4-3 ps_1.6.0
[109] mgcv_1.8-34 tidyselect_1.1.0
[111] stringi_1.5.3 GOSemSim_2.16.1
[113] BiocSingular_1.6.0 askpass_1.1
[115] locfit_1.5-9.4 grid_4.0.3
[117] fastmatch_1.1-0 tools_4.0.3
[119] rio_0.5.26 future.apply_1.7.0
[121] rstudioapi_0.13 uuid_0.1-4
[123] foreign_0.8-81 bluster_1.0.0
[125] gridExtra_2.3 farver_2.1.0
[127] Rtsne_0.15 ggraph_2.0.5
[129] digest_0.6.27 rvcheck_0.1.8
[131] BiocManager_1.30.10 shiny_1.6.0
[133] gridtext_0.1.4 Rcpp_1.0.6
[135] car_3.0-10 broom_0.7.5
[137] scuttle_1.0.4 later_1.1.0.1
[139] RcppAnnoy_0.0.18 httr_1.4.2
[141] kernlab_0.9-29 colorspace_2.0-0
[143] rvest_1.0.0 XML_3.99-0.5
[145] fs_1.5.0 tensor_1.5
[147] reticulate_1.18 splines_4.0.3
[149] statmod_1.4.35 uwot_0.1.10
[151] spatstat.utils_2.0-0 graphlayouts_0.7.1
[153] flexmix_2.3-17 plotly_4.9.3
[155] xtable_1.8-4 jsonlite_1.7.2
[157] spatstat_1.64-1 tidygraph_1.2.0
[159] modeltools_0.2-23 R6_2.5.0
[161] pillar_1.5.1 htmltools_0.5.1.1
[163] mime_0.10 glue_1.4.2
[165] fastmap_1.1.0 BiocParallel_1.24.1
[167] BiocNeighbors_1.8.2 class_7.3-18
[169] codetools_0.2-18 fgsea_1.16.0
[171] utf8_1.1.4 lattice_0.20-41
[173] curl_4.3 leiden_0.3.7
[175] zip_2.1.1 openxlsx_4.2.3
[177] GO.db_3.12.1 openssl_1.4.3
[179] limma_3.46.0 repr_1.1.3
[181] munsell_0.5.0 DO.db_2.9
[183] GenomeInfoDbData_1.2.4 haven_2.3.1
[185] reshape2_1.4.4 gtable_0.3.0
Just wanted to add that I am able to run glmGamPoi on its own without errors on the same dataset.