Running the DESeq2 "DESeqDataSetFromMatrix" function returns a dimname error
1
0
Entering edit mode
RDoc ▴ 10
@800f5cc3
Last seen 3.3 years ago
Sweden

I am trying to follow a vignette for finding enriched pathways. As part of that pipeline, I would like to calculate the differentially expressed genes using DESeq2. Initially, I am working with a Seurat Object, as such I run the following code to achieve what I would like to do:

exp_mtx <- seurat_exp_mtx@assays$RNA@counts # To generate the raw counts from the Seurat Object

exp_design <- as.data.frame(matrix(NA,length(colnames(exp_mtx)),1))
names(exp_design) <- c("cluster")
row.names(exp_design) <- seurat_exp_mtx@meta.data$cells. 
exp_design$cluster <- seurat_exp_mtx@meta.data$clusters # To generate the colData

dds_cortex <- DESeqDataSetFromMatrix(countData = as.matrix(exp_mtx), 
                     colData = exp_design, design = ~ cluster)

However, when trying to do so, I run into the following error:

"Error in `assays<-`(`*tmp*`, withDimnames = withDimnames, ..., value = `*vtmp*`) : please use 'assay(x, withDimnames=FALSE)) <- value' or 'assays(x, withDimnames=FALSE)) <- value' when the dimnames on the supplied assay(s) are not identical to the dimnames on RangedSummarizedExperiment object 'x'"

I checked the dimnames for both the exp_mtx and exp_design and noticed that one was named only 1 and 2, whereas the other was named cells and features and as such, I ran the following:

names(dimnames(exp_design)) <- c("cells", "cluster")

Unfortunately, it had no effect. I have checked and the CellIDs are the same. This is also to be expected considering that they were both extracted from the same initial file. I have also checked the documentation for the function to check that I've made the colData correctly, and it states "for matrix input: a DataFrame or data.frame with at least a single column. Rows of colData correspond to columns of countData" (https://www.rdocumentation.org/packages/DESeq2/versions/1.12.3/topics/DESeqDataSet-class) which is what I have.

I really have no idea about what it is that is causing the issue. Any suggestions would be greatly appreciated.

> sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
 [1] grid      parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] fgsea_1.16.0                VennDiagram_1.6.20          futile.logger_1.4.3         tibble_3.1.2                dplyr_1.0.7                 DESeq2_1.30.1              
 [7] SummarizedExperiment_1.20.0 Biobase_2.50.0              MatrixGenerics_1.2.1        matrixStats_0.59.0          GenomicRanges_1.42.0        GenomeInfoDb_1.26.7        
[13] IRanges_2.24.1              S4Vectors_0.28.1            BiocGenerics_0.36.1         SeuratObject_4.0.2         

loaded via a namespace (and not attached):
 [1] httr_1.4.2             pkgload_1.2.1          bit64_4.0.5            splines_4.0.3          assertthat_0.2.1       BiocManager_1.30.16    blob_1.2.1             GenomeInfoDbData_1.2.4
 [9] yaml_2.2.1             remotes_2.4.0          sessioninfo_1.1.1      pillar_1.6.1           RSQLite_2.2.7          lattice_0.20-44        glue_1.4.2             RColorBrewer_1.1-2    
[17] XVector_0.30.0         colorspace_2.0-2       Matrix_1.3-4           XML_3.99-0.6           pkgconfig_2.0.3        devtools_2.4.2         genefilter_1.72.1      zlibbioc_1.36.0       
[25] purrr_0.3.4            xtable_1.8-4           scales_1.1.1           processx_3.5.2         BiocParallel_1.24.1    annotate_1.68.0        generics_0.1.0         ggplot2_3.3.5         
[33] usethis_2.0.1          ellipsis_0.3.2         cachem_1.0.5           withr_2.4.2            cli_3.0.1              survival_3.2-11        magrittr_2.0.1         crayon_1.4.1          
[41] memoise_2.0.0          ps_1.6.0               fs_1.5.0               fansi_0.5.0            pkgbuild_1.2.0         data.table_1.14.0      tools_4.0.3            prettyunits_1.1.1     
[49] formatR_1.11           lifecycle_1.0.0        munsell_0.5.0          locfit_1.5-9.4         DelayedArray_0.16.3    lambda.r_1.2.4         AnnotationDbi_1.52.0   callr_3.7.0           
[57] compiler_4.0.3         rlang_0.4.11           RCurl_1.98-1.3         bitops_1.0-7           testthat_3.0.4         gtable_0.3.0           DBI_1.1.1              R6_2.5.0              
[65] gridExtra_2.3          fastmap_1.1.0          bit_4.0.4              utf8_1.2.1             fastmatch_1.1-0        rprojroot_2.0.2        futile.options_1.0.1   desc_1.3.0            
[73] Rcpp_1.0.7             vctrs_0.3.8            geneplotter_1.68.0     tidyselect_1.1.1
DESeq2 Scrna scRNAseq • 2.8k views
ADD COMMENT
0
Entering edit mode

Why don't you show the heads of your counts and coldata?

ADD REPLY
0
Entering edit mode

I could do that, my dataset is quite large so it omits alot.

head(exp_design) yields:

                         cluster
10X01_1_AAACTTGACGTTAG-1      19
10X01_1_AAACTTGACTCCAC-1      19
10X01_1_AACAAACTAAGAGT-1      19
10X01_1_AACAGAGAATGTCG-1       2
10X01_1_AACAGAGACCTCGT-1      19
10X01_1_AACAGCACCCTCGT-1      19

Whereas as.matrix(exp_mtx[1:5, 1:5] yields:

         cells
features  10X01_1_AAACTTGACGTTAG-1 10X01_1_AAACTTGACTCCAC-1 10X01_1_AACAAACTAAGAGT-1 10X01_1_AACAGAGAATGTCG-1 10X01_1_AACAGAGACCTCGT-1
  Xkr4                           1                        0                        1                        2                        0
  Gm37381                        0                        0                        0                        0                        0
  Rp1                            0                        0                        0                        0                        0
  Sox17                          0                        0                        0                        0                        0
  Gm37323                        0                        0                        0                        0                        0
`

I can also add that I've tried identical(row.names(exp_design), colnames(cortex_mtx))and that returned

[1] TRUE
ADD REPLY
1
Entering edit mode

Do you have a plan for size normalization? DEseq's method is not going to be happy with so many genes containing a zero. Also, your exp_mtx doesn't look right. "features" and "cells" should not be in there. Also, sometimes R has a problem with things that start with numbers, you might try prepending an "X" to your cell IDs

ADD REPLY
0
Entering edit mode

See the 2nd, 3rd and 4th bullets here

https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#recommendations-for-single-cell-analysis

The 1st bullet concerns ZI modeling which is not always needed, I think with UMI it’s not often performed now.

ADD REPLY
0
Entering edit mode

Thanks for pointing out that there is something wrong with the appearance of the exp_mtx, it seemed to be the root cause of my issue. I'm well aware of the problem with the geometric means. In my case, I changed the sftype to 'poscount' to test it. I'm mainly interested in getting everything to work in an exploratory fashion so it's acceptable if everything isn't optimized from the start.

ADD REPLY
0
Entering edit mode
@mikelove
Last seen 4 hours ago
United States

I think this happens with a version mix of Bioc packages.

Try BiocManager::valid()

ADD COMMENT
0
Entering edit mode

I have tried that as well, it only returned "TRUE":

BiocManager::valid()
'getOption("repos")' replaces Bioconductor standard repositories, see '?repositories' for details

replacement repositories:
    CRAN: https://cran.rstudio.com/

[1] TRUE
ADD REPLY
0
Entering edit mode

Can you test this:

devtools::install_github("mikelove/DESeq2")

this will mix up your versions, but I want to see if the latest version resolves something.

ADD REPLY
0
Entering edit mode

I tried this and was upgraded from 1.30.1 to 1.33.1, however the error was unchanged.

ADD REPLY
0
Entering edit mode

Ah ok, you can reverse with BiocManager::install()

Did you happen to update packages to release in this session? Sometimes restarting R can resolve.

ADD REPLY
0
Entering edit mode

Yes, I restarted R following the updates just in case, then I ran the script. Still the same outcome.

ADD REPLY
0
Entering edit mode

I’m out of ideas. The only times I’ve seen this were with some mixed versions or when the names actually didn’t match.

ADD REPLY
1
Entering edit mode

I managed to fix it now. Trying what I wrote above with changing the names didn't work, but instead writing names(dimnames(exp_mtx)) <- NULL actually fixed the problem. Sorry for wasting your time when the fix was seemingly so trivial.

Thank you very much for taking the time.

ADD REPLY

Login before adding your answer.

Traffic: 448 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6