Question: DESeq2 Error in `rownames<-`(`*tmp*`, value = names(x))
ashley.doane (United States) wrote 10 weeks ago:

Hi,

I'm getting an unexpected error with DESeq2:

> dds = DESeq(dds)
using pre-existing normalization factors
estimating dispersions
gene-wise dispersion estimates
mean-dispersion relationship
final dispersion estimates
fitting model and testing
-- replacing outliers and refitting for 792 genes
-- DESeq argument 'minReplicatesForReplace' = 7 
-- original counts are preserved in counts(dds)
estimating dispersions
Error in `rownames<-`(`*tmp*`, value = names(x)) : 
  duplicate rownames not allowed

Of course, I checked that the rownames are unique:

> rn = rownames(dds.ed)
> rn[duplicated(rn)]
character(0)

I also tried setting new rownames with rownames(dds) = 1:length(dds), but I still get the same error.

I've tried both installing the OS X binary and compiling from source, with the same result.

I must be missing something obvious. Any ideas?

Thanks,

Ashley

> sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS  10.14

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
 [1] splines   parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] cqn_1.26.0                  quantreg_5.36               SparseM_1.77                preprocessCore_1.42.0       nor1mix_1.2-3              
 [6] mclust_5.4.1                edgeR_3.22.5                limma_3.36.5                DESeq2_1.20.0               SummarizedExperiment_1.10.1
[11] DelayedArray_0.6.6          BiocParallel_1.14.2         matrixStats_0.54.0          Biobase_2.40.0              forcats_0.3.0              
[16] dplyr_0.7.6                 purrr_0.2.5                 tidyr_0.8.1                 tibble_1.4.2                ggplot2_3.0.0.9000         
[21] tidyverse_1.2.1             readr_1.1.1                 stringr_1.3.1               rtracklayer_1.40.6          GenomicRanges_1.32.7       
[26] GenomeInfoDb_1.16.0         IRanges_2.14.12             S4Vectors_0.18.3            BiocGenerics_0.26.0         BiocInstaller_1.30.0       

loaded via a namespace (and not attached):
 [1] colorspace_1.3-2         htmlTable_1.12           XVector_0.20.0           base64enc_0.1-3          rstudioapi_0.8           MatrixModels_0.4-1      
 [7] bit64_0.9-7              AnnotationDbi_1.42.1     lubridate_1.7.4          xml2_1.2.0               geneplotter_1.58.0       knitr_1.20              
[13] Formula_1.2-3            jsonlite_1.5             Rsamtools_1.32.3         broom_0.5.0              annotate_1.58.0          cluster_2.0.7-1         
[19] compiler_3.5.1           httr_1.3.1               backports_1.1.2          assertthat_0.2.0         Matrix_1.2-14            lazyeval_0.2.1          
[25] cli_1.0.1                acepack_1.4.1            htmltools_0.3.6          tools_3.5.1              bindrcpp_0.2.2           gtable_0.2.0            
[31] glue_1.3.0               GenomeInfoDbData_1.1.0   Rcpp_0.12.19             cellranger_1.1.0         Biostrings_2.48.0        nlme_3.1-137            
[37] rvest_0.3.2              XML_3.98-1.16            zlibbioc_1.26.0          scales_1.0.0             hms_0.4.2                RColorBrewer_1.1-2      
[43] yaml_2.2.0               memoise_1.1.0            gridExtra_2.3            rpart_4.1-13             latticeExtra_0.6-28      stringi_1.2.4           
[49] RSQLite_2.1.1            genefilter_1.62.0        checkmate_1.8.5          rlang_0.2.2              pkgconfig_2.0.2          bitops_1.0-6            
[55] lattice_0.20-35          bindr_0.1.1              GenomicAlignments_1.16.0 htmlwidgets_1.3          bit_1.1-14               tidyselect_0.2.4        
[61] plyr_1.8.4               magrittr_1.5             R6_2.3.0                 Hmisc_4.1-1              DBI_1.0.0                pillar_1.3.0            
[67] haven_1.1.2              foreign_0.8-71           withr_2.1.2              survival_2.42-6          RCurl_1.95-4.11          nnet_7.3-12             
[73] modelr_0.1.2             crayon_1.3.4             locfit_1.5-9.1           grid_3.5.1               readxl_1.1.0             data.table_1.11.8       
[79] blob_1.1.1               digest_0.6.17            xtable_1.8-3             munsell_0.5.0
modified 9 weeks ago by Michael Love • written 10 weeks ago by ashley.doane
Michael Love (United States) wrote 10 weeks ago:

What happens with DESeq(dds, minRep=Inf)?

written 10 weeks ago by Michael Love
Thanks, this solves the immediate issue. I don't think it matters, but I failed to mention that the counts matrix is large by number of rows: there are 103,000 "genes" (ATAC-seq peaks). Please let me know if I can provide additional information. And thanks so much for DESeq2 and for continuing its development. Best, Ashley
written 10 weeks ago by ashley.doane

I’m not sure if I can figure out what’s going on because it doesn’t throw this error in our tests. I’ll take a look at the code, but may not find the issue.

You can also just assess outliers by eye, looking at a few example peaks with large values of maxCooks in mcols(dds), rather than using the outlier replacement heuristic.
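
As a rough illustration of the "by eye" approach, the sketch below ranks peaks by maxCooks so that the largest ones can be inspected first. It uses a made-up plain data.frame (`peak_stats`) as a stand-in for mcols(dds) after DESeq() has run; the peak names and values are invented, and the variable names are assumptions rather than anything from the thread.

```r
# Made-up stand-in for as.data.frame(mcols(dds)) after DESeq() has run.
# Only the maxCooks column name comes from the thread; the rest is invented.
peak_stats <- data.frame(
  peak = c("peak1", "peak2", "peak3", "peak4"),
  maxCooks = c(0.4, 12.5, NA, 3.1),
  stringsAsFactors = FALSE
)

# Sort peaks by decreasing maxCooks; rows with NA are placed last.
ranked <- peak_stats[order(peak_stats$maxCooks, decreasing = TRUE), ]

# Keep the top few peaks for manual inspection.
top_peaks <- head(ranked$peak, 2)
print(top_peaks)
```

With a real DESeqDataSet, the selected peaks could then be plotted one at a time, for example with DESeq2's plotCounts(), to judge whether a single sample is driving the large Cook's distance.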

written 10 weeks ago by Michael Love

Can you show mcols(dds) before you run DESeq()? Are there any additional columns there?

written 10 weeks ago by Michael Love

Hello,

Just wanted to add that I had the same issue: https://www.biostars.org/p/343037/

Setting minRep=Inf also fixed the problem for me, and it does look like I had a few outliers in the post-DESeq dds.

When I tried to graph outliers in the pre-DESeq dds using the method described in that post (bottom), I got this error:

Error in apply(assays(dds_kal_agg)[["cooks"]], 1, max) : 
  dim(X) must have a positive length

Hope this helps.

Kristin

(Edit: I realize the error message appears because Cook's distances have not been calculated for dds_kal_agg, since it is pre-DESeq. Is there another feature of mcols I should check? It looks pretty empty:)
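
For reference, the apply() error quoted above can be reproduced in base R without DESeq2: apply() requires an object with dimensions, and a missing "cooks" assay presumably comes back as NULL. The sketch below shows the failure and a hypothetical guard (`max_cooks_per_row` is an illustrative name, and the small matrix is invented data).

```r
# apply() needs an object with dimensions; NULL has none, so this fails
# with "dim(X) must have a positive length". The message is captured here.
msg <- tryCatch(
  apply(NULL, 1, max),
  error = function(e) conditionMessage(e)
)
print(msg)

# Hypothetical helper: per-row maxima of a Cook's distance matrix,
# or NULL when the matrix has not been computed yet.
max_cooks_per_row <- function(cooks) {
  if (is.null(cooks)) {
    return(NULL)
  }
  apply(cooks, 1, max, na.rm = TRUE)
}

# Invented 2 x 3 matrix standing in for assays(dds)[["cooks"]].
cooks <- matrix(c(0.1, 0.2, 5.0, 0.3, 0.4, 0.1), nrow = 2)
print(max_cooks_per_row(cooks))
print(max_cooks_per_row(NULL))
```

The guard only avoids the error; the Cook's distances themselves are not available until DESeq() has been run on the object.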

modified 9 weeks ago • written 9 weeks ago by muench.kristin

Can you send the dds to the address given by maintainer("DESeq2")? I'll try to hunt down the bug.

written 9 weeks ago by Michael Love

Thank you, I was able to reproduce with v1.20.

The problem is that you have duplicate column names in colData(dds), which breaks the code where replaceOutliers adds a column to colData(dds) and attaches some metadata about that column.

sum(duplicated(colnames(colData(dds))))

"Line" and "DESeqAnalysisID" columns both have duplicates.

So a solution is to only have unique column names for colData(dds), which is probably a good idea anyway.
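
A minimal base R sketch of that fix, assuming a plain data.frame (`col_data`, with invented values) as a stand-in for colData(dds). It shows two options: dropping the repeated columns, or keeping them under unique names. Whether the repeated columns are true copies that are safe to drop is something to check on the real object.

```r
# Stand-in for colData(dds): a data.frame with repeated column names.
# The column names come from the thread; the values are made up.
col_data <- data.frame(
  Line = c("A", "B"),
  DESeqAnalysisID = c(1, 2),
  Line = c("A", "B"),
  DESeqAnalysisID = c(1, 2),
  check.names = FALSE,      # keep the duplicated names as-is
  stringsAsFactors = FALSE
)

# Count the duplicated column names (same check as in the post above).
n_dup <- sum(duplicated(colnames(col_data)))

# Option 1: drop the repeated columns, keeping the first of each name.
deduped <- col_data[, !duplicated(colnames(col_data)), drop = FALSE]

# Option 2: keep every column but make the names unique.
renamed <- col_data
colnames(renamed) <- make.unique(colnames(renamed))

print(n_dup)
print(colnames(deduped))
print(colnames(renamed))

# Assumption, not run against the poster's object: on a DESeqDataSet the
# same idea would be applied to colnames(colData(dds)) before DESeq().
```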

I noticed that the error isn't thrown anyway in the development version, which will be released in a few weeks as v1.22.

 

written 9 weeks ago by Michael Love