Dispersion Factors Not Being Calculated For Every Gene
rohitghosh • 0
@eadfcf90

Hello,

This may be a trivial question, but I am trying to use DESeq2 to obtain dispersion estimates with a reduced model (no covariates). My dataset has 15400 genes with non-zero expression, but after running the following commands, "environment(dds@dispersionFunction)[["fit"]][["fitted.values"]]" has only 14116 values:


> dds <- estimateSizeFactors(dds)
> dds <- estimateDispersions(dds)

> sessionInfo()

R version 3.6.3 (2020-02-29)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.2 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8       
 [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] ggplot2_3.3.3               pasilla_1.14.0              DESeq2_1.26.0              
 [4] SummarizedExperiment_1.16.1 DelayedArray_0.12.3         BiocParallel_1.20.1        
 [7] matrixStats_0.58.0          Biobase_2.46.0              GenomicRanges_1.38.0       
[10] GenomeInfoDb_1.22.1         IRanges_2.20.2              S4Vectors_0.24.4           
[13] BiocGenerics_0.32.0         ImpulseDE2_1.10.0          

loaded via a namespace (and not attached):
 [1] bitops_1.0-7           bit64_4.0.5            RColorBrewer_1.1-2     tools_3.6.3           
 [5] backports_1.2.1        utf8_1.2.1             R6_2.5.0               rpart_4.1-15          
 [9] Hmisc_4.5-0            DBI_1.1.1              colorspace_2.0-1       nnet_7.3-13           
[13] GetoptLong_1.0.5       withr_2.4.2            tidyselect_1.1.1       gridExtra_2.3         
[17] bit_4.0.4              compiler_3.6.3         htmlTable_2.1.0        labeling_0.4.2        
[21] scales_1.1.1           checkmate_2.0.0        genefilter_1.68.0      stringr_1.4.0         
[25] digest_0.6.27          foreign_0.8-75         XVector_0.26.0         base64enc_0.1-3       
[29] jpeg_0.1-8.1           pkgconfig_2.0.3        htmltools_0.5.1.1      fastmap_1.1.0         
[33] htmlwidgets_1.5.3      rlang_0.4.11           GlobalOptions_0.1.2    rstudioapi_0.13       
[37] RSQLite_2.2.7          shape_1.4.5            generics_0.1.0         farver_2.1.0          
[41] dplyr_1.0.6            RCurl_1.98-1.3         magrittr_2.0.1         GenomeInfoDbData_1.2.2
[45] Formula_1.2-4          Matrix_1.2-18          Rcpp_1.0.6             munsell_0.5.0         
[49] fansi_0.4.2            lifecycle_1.0.0        stringi_1.5.3          zlibbioc_1.32.0       
[53] grid_3.6.3             blob_1.2.1             crayon_1.4.1           lattice_0.20-40       
[57] cowplot_1.1.1          splines_3.6.3          annotate_1.64.0        circlize_0.4.12       
[61] locfit_1.5-9.4         knitr_1.33             ComplexHeatmap_2.2.0   pillar_1.6.0          
[65] rjson_0.2.20           geneplotter_1.64.0     XML_3.99-0.3           glue_1.4.2            
[69] latticeExtra_0.6-29    data.table_1.14.0      png_0.1-7              vctrs_0.3.8           
[73] gtable_0.3.0           purrr_0.3.4            clue_0.3-59            cachem_1.0.4          
[77] xfun_0.22              xtable_1.8-4           survival_3.1-8         tibble_3.1.1          
[81] AnnotationDbi_1.48.0   memoise_2.0.0          cluster_2.1.0          ellipsis_0.3.2

For context, I am using these dispersion estimates as input for running ImpulseDE2 to identify time-dependent genes using time series RNA-seq data with only one sample per time point. I've emailed the people who wrote ImpulseDE2 and they told me this was possible by first running DESeq2 with a reduced model.

Are some genes excluded when the dispersion estimates are calculated? If so, is there a way for me to identify which genes these are?

Thanks!

@mikelove

See the vignette section "Access to all calculated values". There are cleaner ways to get at the internal calculated values than using environment().

I think looking at the mcols DataFrame, mcols(dds), will give you a better idea.
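
A minimal sketch of that inspection, assuming simulated counts from makeExampleDESeqDataSet in place of the real data and an intercept-only design (~ 1) standing in for the reduced model. The columns used (allZero, dispGeneEst, dispFit, dispersion) are the ones DESeq2 stores in mcols after estimateDispersions. The 100 * 1e-8 cutoff used to flag genes that are probably left out of the trend fit comes from my reading of the DESeq2 source, so treat it as an assumption and check it against the installed version.

```r
library(DESeq2)

# Simulated counts stand in for the real dataset (assumption).
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 6)
design(dds) <- ~ 1   # reduced model: intercept only, no covariates

dds <- estimateSizeFactors(dds)
dds <- estimateDispersions(dds)

# mcols(dds) has one row per gene, in the same order as rownames(dds).
disp <- mcols(dds)
head(as.data.frame(disp[, c("baseMean", "allZero", "dispGeneEst",
                            "dispFit", "dispersion")]))

# Genes with zero counts in every sample have NA dispersion values.
all_zero_genes <- rownames(dds)[disp$allZero]

# Genes whose gene-wise estimate sits at the lower bound; these are
# likely excluded from the trend fit (cutoff is an assumption, see above).
min_disp <- 1e-8
low_disp_genes <- rownames(dds)[!disp$allZero &
                                disp$dispGeneEst <= 100 * min_disp]

length(all_zero_genes)
length(low_disp_genes)
```

Because the trend is fitted on a subset of the genes, the number of fitted values stored with the dispersion function need not equal the number of genes. mcols(dds), by contrast, always has one row per gene, so per-gene values can be matched to gene names directly.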
