Question

Running the DESeq2 "DESeqDataSetFromMatrix" function returns a dimname error

0

Entering edit mode

RDoc ▴ 10

@800f5cc3

Last seen 2.8 years ago

Sweden

I am trying to follow a vignette for finding enriched pathways. As part of that pipeline, I would like to calculate the differentially expressed genes using DESeq2. Initially, I am working with a Seurat Object, as such I run the following code to achieve what I would like to do:

exp_mtx <- seurat_exp_mtx@assays$RNA@counts # To generate the raw counts from the Seurat Object

exp_design <- as.data.frame(matrix(NA,length(colnames(exp_mtx)),1))
names(exp_design) <- c("cluster")
row.names(exp_design) <- seurat_exp_mtx@meta.data$cells. 
exp_design$cluster <- seurat_exp_mtx@meta.data$clusters # To generate the colData

dds_cortex <- DESeqDataSetFromMatrix(countData = as.matrix(exp_mtx), 
                     colData = exp_design, design = ~ cluster)

However, when trying to do so, I run into the following error:

"Error in `assays<-`(`*tmp*`, withDimnames = withDimnames, ..., value = `*vtmp*`) : please use 'assay(x, withDimnames=FALSE)) <- value' or 'assays(x, withDimnames=FALSE)) <- value' when the dimnames on the supplied assay(s) are not identical to the dimnames on RangedSummarizedExperiment object 'x'"

I checked the dimnames for both the exp_mtx and exp_design and noticed that one was named only 1 and 2, whereas the other was named cells and features and as such, I ran the following:

names(dimnames(exp_design)) <- c("cells", "cluster")

Unfortunately, it had no effect. I have checked and the CellIDs are the same. This is also to be expected considering that they were both extracted from the same initial file. I have also checked the documentation for the function to check that I've made the colData correctly, and it states "for matrix input: a DataFrame or data.frame with at least a single column. Rows of colData correspond to columns of countData" (https://www.rdocumentation.org/packages/DESeq2/versions/1.12.3/topics/DESeqDataSet-class) which is what I have.

I really have no idea about what it is that is causing the issue. Any suggestions would be greatly appreciated.

> sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
 [1] grid      parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] fgsea_1.16.0                VennDiagram_1.6.20          futile.logger_1.4.3         tibble_3.1.2                dplyr_1.0.7                 DESeq2_1.30.1              
 [7] SummarizedExperiment_1.20.0 Biobase_2.50.0              MatrixGenerics_1.2.1        matrixStats_0.59.0          GenomicRanges_1.42.0        GenomeInfoDb_1.26.7        
[13] IRanges_2.24.1              S4Vectors_0.28.1            BiocGenerics_0.36.1         SeuratObject_4.0.2         

loaded via a namespace (and not attached):
 [1] httr_1.4.2             pkgload_1.2.1          bit64_4.0.5            splines_4.0.3          assertthat_0.2.1       BiocManager_1.30.16    blob_1.2.1             GenomeInfoDbData_1.2.4
 [9] yaml_2.2.1             remotes_2.4.0          sessioninfo_1.1.1      pillar_1.6.1           RSQLite_2.2.7          lattice_0.20-44        glue_1.4.2             RColorBrewer_1.1-2    
[17] XVector_0.30.0         colorspace_2.0-2       Matrix_1.3-4           XML_3.99-0.6           pkgconfig_2.0.3        devtools_2.4.2         genefilter_1.72.1      zlibbioc_1.36.0       
[25] purrr_0.3.4            xtable_1.8-4           scales_1.1.1           processx_3.5.2         BiocParallel_1.24.1    annotate_1.68.0        generics_0.1.0         ggplot2_3.3.5         
[33] usethis_2.0.1          ellipsis_0.3.2         cachem_1.0.5           withr_2.4.2            cli_3.0.1              survival_3.2-11        magrittr_2.0.1         crayon_1.4.1          
[41] memoise_2.0.0          ps_1.6.0               fs_1.5.0               fansi_0.5.0            pkgbuild_1.2.0         data.table_1.14.0      tools_4.0.3            prettyunits_1.1.1     
[49] formatR_1.11           lifecycle_1.0.0        munsell_0.5.0          locfit_1.5-9.4         DelayedArray_0.16.3    lambda.r_1.2.4         AnnotationDbi_1.52.0   callr_3.7.0           
[57] compiler_4.0.3         rlang_0.4.11           RCurl_1.98-1.3         bitops_1.0-7           testthat_3.0.4         gtable_0.3.0           DBI_1.1.1              R6_2.5.0              
[65] gridExtra_2.3          fastmap_1.1.0          bit_4.0.4              utf8_1.2.1             fastmatch_1.1-0        rprojroot_2.0.2        futile.options_1.0.1   desc_1.3.0            
[73] Rcpp_1.0.7             vctrs_0.3.8            geneplotter_1.68.0     tidyselect_1.1.1

DESeq2 Scrna scRNAseq • 2.3k views

ADD COMMENT • link 2.8 years ago RDoc ▴ 10

0

Entering edit mode

Why don't you show the heads of your counts and coldata?

ADD REPLY • link 2.8 years ago swbarnes2 ★ 1.3k

0

Entering edit mode

I could do that, my dataset is quite large so it omits alot.

head(exp_design) yields:

                         cluster
10X01_1_AAACTTGACGTTAG-1      19
10X01_1_AAACTTGACTCCAC-1      19
10X01_1_AACAAACTAAGAGT-1      19
10X01_1_AACAGAGAATGTCG-1       2
10X01_1_AACAGAGACCTCGT-1      19
10X01_1_AACAGCACCCTCGT-1      19

Whereas as.matrix(exp_mtx[1:5, 1:5] yields:

         cells
features  10X01_1_AAACTTGACGTTAG-1 10X01_1_AAACTTGACTCCAC-1 10X01_1_AACAAACTAAGAGT-1 10X01_1_AACAGAGAATGTCG-1 10X01_1_AACAGAGACCTCGT-1
  Xkr4                           1                        0                        1                        2                        0
  Gm37381                        0                        0                        0                        0                        0
  Rp1                            0                        0                        0                        0                        0
  Sox17                          0                        0                        0                        0                        0
  Gm37323                        0                        0                        0                        0                        0
`

I can also add that I've tried identical(row.names(exp_design), colnames(cortex_mtx))and that returned

[1] TRUE

ADD REPLY • link 2.8 years ago RDoc ▴ 10

1

Entering edit mode

Do you have a plan for size normalization? DEseq's method is not going to be happy with so many genes containing a zero. Also, your exp_mtx doesn't look right. "features" and "cells" should not be in there. Also, sometimes R has a problem with things that start with numbers, you might try prepending an "X" to your cell IDs

ADD REPLY • link 2.8 years ago swbarnes2 ★ 1.3k

0

Entering edit mode

See the 2nd, 3rd and 4th bullets here

https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#recommendations-for-single-cell-analysis

The 1st bullet concerns ZI modeling which is not always needed, I think with UMI it’s not often performed now.

ADD REPLY • link 2.8 years ago Michael Love 41k

0

Entering edit mode

Thanks for pointing out that there is something wrong with the appearance of the exp_mtx, it seemed to be the root cause of my issue. I'm well aware of the problem with the geometric means. In my case, I changed the sftype to 'poscount' to test it. I'm mainly interested in getting everything to work in an exploratory fashion so it's acceptable if everything isn't optimized from the start.

ADD REPLY • link 2.8 years ago RDoc ▴ 10

score 0 · Answer 1 · 2021-07-20

0

Entering edit mode

Michael Love 41k

@mikelove

Last seen 7 hours ago

United States

I think this happens with a version mix of Bioc packages.

Try BiocManager::valid()

ADD COMMENT • link 2.8 years ago Michael Love 41k

0

Entering edit mode

I have tried that as well, it only returned "TRUE":

BiocManager::valid()
'getOption("repos")' replaces Bioconductor standard repositories, see '?repositories' for details

replacement repositories:
    CRAN: https://cran.rstudio.com/

[1] TRUE

ADD REPLY • link 2.8 years ago RDoc ▴ 10

0

Entering edit mode

Can you test this:

devtools::install_github("mikelove/DESeq2")

this will mix up your versions, but I want to see if the latest version resolves something.

ADD REPLY • link 2.8 years ago Michael Love 41k

0

Entering edit mode

I tried this and was upgraded from 1.30.1 to 1.33.1, however the error was unchanged.

ADD REPLY • link 2.8 years ago RDoc ▴ 10

0

Entering edit mode

Ah ok, you can reverse with BiocManager::install()

Did you happen to update packages to release in this session? Sometimes restarting R can resolve.

ADD REPLY • link 2.8 years ago Michael Love 41k

0

Entering edit mode

Yes, I restarted R following the updates just in case, then I ran the script. Still the same outcome.

ADD REPLY • link 2.8 years ago RDoc ▴ 10

0

Entering edit mode

I’m out of ideas. The only times I’ve seen this were with some mixed versions or when the names actually didn’t match.

ADD REPLY • link 2.8 years ago Michael Love 41k

1

Entering edit mode

I managed to fix it now. Trying what I wrote above with changing the names didn't work, but instead writing names(dimnames(exp_mtx)) <- NULL actually fixed the problem. Sorry for wasting your time when the fix was seemingly so trivial.

Thank you very much for taking the time.

ADD REPLY • link 2.8 years ago RDoc ▴ 10