Dispersion Factors Not Being Calculated For Every Gene
rohitghosh • 0


This may be a trivial question, but I am trying to use DESeq2 to obtain dispersion estimates with a reduced model (no covariates). My dataset has 15400 genes with non-zero expression, but after I run the following commands, "environment(dds@dispersionFunction)[["fit"]][["fitted.values"]]" has only 14116 values:

>  dds <- estimateSizeFactors(dds)
>  dds <- estimateDispersions(dds)

sessionInfo( )

R version 3.6.3 (2020-02-29)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.2 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8       
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C              

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] ggplot2_3.3.3               pasilla_1.14.0              DESeq2_1.26.0              
 [4] SummarizedExperiment_1.16.1 DelayedArray_0.12.3         BiocParallel_1.20.1        
 [7] matrixStats_0.58.0          Biobase_2.46.0              GenomicRanges_1.38.0       
[10] GenomeInfoDb_1.22.1         IRanges_2.20.2              S4Vectors_0.24.4           
[13] BiocGenerics_0.32.0         ImpulseDE2_1.10.0          

loaded via a namespace (and not attached):
 [1] bitops_1.0-7           bit64_4.0.5            RColorBrewer_1.1-2     tools_3.6.3           
 [5] backports_1.2.1        utf8_1.2.1             R6_2.5.0               rpart_4.1-15          
 [9] Hmisc_4.5-0            DBI_1.1.1              colorspace_2.0-1       nnet_7.3-13           
[13] GetoptLong_1.0.5       withr_2.4.2            tidyselect_1.1.1       gridExtra_2.3         
[17] bit_4.0.4              compiler_3.6.3         htmlTable_2.1.0        labeling_0.4.2        
[21] scales_1.1.1           checkmate_2.0.0        genefilter_1.68.0      stringr_1.4.0         
[25] digest_0.6.27          foreign_0.8-75         XVector_0.26.0         base64enc_0.1-3       
[29] jpeg_0.1-8.1           pkgconfig_2.0.3        htmltools_0.5.1.1      fastmap_1.1.0         
[33] htmlwidgets_1.5.3      rlang_0.4.11           GlobalOptions_0.1.2    rstudioapi_0.13       
[37] RSQLite_2.2.7          shape_1.4.5            generics_0.1.0         farver_2.1.0          
[41] dplyr_1.0.6            RCurl_1.98-1.3         magrittr_2.0.1         GenomeInfoDbData_1.2.2
[45] Formula_1.2-4          Matrix_1.2-18          Rcpp_1.0.6             munsell_0.5.0         
[49] fansi_0.4.2            lifecycle_1.0.0        stringi_1.5.3          zlibbioc_1.32.0       
[53] grid_3.6.3             blob_1.2.1             crayon_1.4.1           lattice_0.20-40       
[57] cowplot_1.1.1          splines_3.6.3          annotate_1.64.0        circlize_0.4.12       
[61] locfit_1.5-9.4         knitr_1.33             ComplexHeatmap_2.2.0   pillar_1.6.0          
[65] rjson_0.2.20           geneplotter_1.64.0     XML_3.99-0.3           glue_1.4.2            
[69] latticeExtra_0.6-29    data.table_1.14.0      png_0.1-7              vctrs_0.3.8           
[73] gtable_0.3.0           purrr_0.3.4            clue_0.3-59            cachem_1.0.4          
[77] xfun_0.22              xtable_1.8-4           survival_3.1-8         tibble_3.1.1          
[81] AnnotationDbi_1.48.0   memoise_2.0.0          cluster_2.1.0          ellipsis_0.3.2

For context, I am using these dispersion estimates as input for running ImpulseDE2 to identify time-dependent genes using time series RNA-seq data with only one sample per time point. I've emailed the people who wrote ImpulseDE2 and they told me this was possible by first running DESeq2 with a reduced model.

Are some genes not able to be used for calculating dispersion estimates? If so, is there a way for me to identify which genes these are?


DESeq2 • 94 views

See the vignette section "Access to all calculated values". There are cleaner ways to get at the internally calculated values than going through environment().

I think you can just look at the mcols DataFrame, i.e. mcols(dds), to get a better idea of what was calculated for each gene.
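As a rough sketch of that suggestion, something like the following pulls the per-gene values out of mcols(dds) and lists the genes that have no final dispersion or whose gene-wise estimate sits at the lower bound. The simulated data from makeExampleDESeqDataSet() only stands in for the real counts (the original dds is not shown), and the 1e-6 cutoff is an assumption about which gene-wise estimates are left out of the trend fit, so please check it against your DESeq2 version.

```r
library(DESeq2)

# Simulated stand-in for the real data (assumption: replace with your own dds)
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 8)
design(dds) <- ~ 1   # reduced, intercept-only model as in the question

dds <- estimateSizeFactors(dds)
dds <- estimateDispersions(dds)

# Per-gene values live in the row metadata, one row per gene
disp <- mcols(dds)
head(disp[, c("baseMean", "allZero", "dispGeneEst", "dispFit", "dispersion")])

# Genes with no final dispersion (typically those with all-zero counts)
no_disp <- rownames(dds)[is.na(dispersions(dds))]

# Genes whose gene-wise estimate is essentially at the lower bound;
# the 1e-6 threshold is an assumed cutoff, not taken from the thread
at_floor <- rownames(dds)[!is.na(disp$dispGeneEst) & disp$dispGeneEst <= 1e-6]

length(no_disp)
length(at_floor)
```

Comparing nrow(dds) minus these two counts with the length of the fitted values should show whether these genes account for the difference in your data.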

