rowSums() and rowMeans() don't work on sparse matrix "dgCMatrix"
@mengchun-tseng-14141

Hi all,

I have recently been analyzing 10X single-cell RNA-seq data, following the workflow posted at: https://master.bioconductor.org/packages/release/workflows/vignettes/simpleSingleCell/inst/doc/work-3-tenx.html

I got an error when using makeTechTrend(), and I have figured out that the problem is that this function calls rowMeans() internally, but for some reason rowMeans() no longer works on sparse matrices. However, when I run makeTechTrend() on the PBMC 4K data, it works without a problem, even though rowMeans() still fails when I call it directly on the sparse matrix "counts(sce)" outside the function. I've put together a reproducible example to demonstrate the problem:

#download the PBMC file
download.file("http://cf.10xgenomics.com/samples/cell-exp/2.1.0/pbmc4k/pbmc4k_raw_gene_bc_matrices.tar.gz","pbmc4k_raw_gene_bc_matrices.tar.gz")
untar("pbmc4k_raw_gene_bc_matrices.tar.gz", exdir="pbmc4k")

#make sce object
library(DropletUtils)
fname <- "pbmc4k/raw_gene_bc_matrices/GRCh38"
sce <- read10xCounts(fname, col.names=TRUE)
sce

# class: SingleCellExperiment 
# dim: 33694 737280 
# metadata(0):
#   assays(1): counts
# rownames(33694): ENSG00000243485 ENSG00000237613 ... ENSG00000277475 ENSG00000268674
# rowData names(2): ID Symbol
# colnames(737280): AAACCTGAGAAACCAT-1 AAACCTGAGAAACCGC-1 ... TTTGTCATCTTTAGTC-1 TTTGTCATCTTTCCTC-1
# colData names(2): Sample Barcode
# reducedDimNames(0):
#   spikeNames(0):

class(counts(sce))
# [1] "dgCMatrix"
# attr(,"package")
# [1] "Matrix"

methods(class=class(counts(sce)))
# the output includes colMeans, colSums, rowMeans and rowSums

# this step raises an error
colSums(counts(sce))[1:5]
# Error in colSums(counts(sce)) : 
#   'x' must be an array of at least two dimensions

Can anyone give me some hints on it? Thanks in advance!

Meng-Chun

sessionInfo()

R version 3.5.0 (2018-04-23)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    
attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     
other attached packages:
 [1] scran_1.8.0                 EnsDb.Hsapiens.v86_2.99.0   ensembldb_2.4.0             AnnotationFilter_1.4.0     
 [5] GenomicFeatures_1.32.0      AnnotationDbi_1.42.0        scater_1.8.0                ggplot2_2.2.1              
 [9] DropletUtils_1.0.0          SingleCellExperiment_1.2.0  SummarizedExperiment_1.10.0 DelayedArray_0.6.0         
[13] matrixStats_0.53.1          Biobase_2.40.0              GenomicRanges_1.32.0        GenomeInfoDb_1.16.0        
[17] IRanges_2.14.0              S4Vectors_0.18.0            BiocGenerics_0.26.0         BiocParallel_1.13.1        
loaded via a namespace (and not attached):
 [1] ProtGenerics_1.12.0      bitops_1.0-6             bit64_0.9-7              progress_1.1.2          
 [5] httr_1.3.1               dynamicTreeCut_1.63-1    tools_3.5.0              irlba_2.3.2             
 [9] R6_2.2.2                 DT_0.4                   vipor_0.4.5              DBI_1.0.0               
[13] lazyeval_0.2.1           colorspace_1.3-2         gridExtra_2.3            prettyunits_1.0.2       
[17] bit_1.1-12               curl_3.2                 compiler_3.5.0           rtracklayer_1.40.0      
[21] scales_0.5.0             stringr_1.3.0            digest_0.6.15            Rsamtools_1.32.0        
[25] XVector_0.20.0           pkgconfig_2.0.1          htmltools_0.3.6          limma_3.36.0            
[29] htmlwidgets_1.2          rlang_0.2.0              RSQLite_2.1.0            FNN_1.1                 
[33] shiny_1.0.5              DelayedMatrixStats_1.2.0 bindr_0.1.1              dplyr_0.7.4             
[37] RCurl_1.95-4.10          magrittr_1.5             GenomeInfoDbData_1.1.0   Matrix_1.2-14           
[41] Rcpp_0.12.16             ggbeeswarm_0.6.0         munsell_0.4.3            Rhdf5lib_1.2.0          
[45] viridis_0.5.1            stringi_1.1.7            yaml_2.1.19              edgeR_3.22.0            
[49] zlibbioc_1.26.0          rhdf5_2.24.0             plyr_1.8.4               grid_3.5.0              
[53] blob_1.1.1               promises_1.0.1           shinydashboard_0.7.0     lattice_0.20-35         
[57] Biostrings_2.48.0        locfit_1.5-9.1           pillar_1.2.2             igraph_1.2.1            
[61] rjson_0.2.15             reshape2_1.4.3           biomaRt_2.36.0           XML_3.98-1.11           
[65] glue_1.2.0               data.table_1.11.0        httpuv_1.4.1             gtable_0.2.0            
[69] assertthat_0.2.0         mime_0.5                 xtable_1.8-2             later_0.7.1             
[73] viridisLite_0.3.0        tibble_1.4.2             GenomicAlignments_1.16.0 beeswarm_0.2.3          
[77] memoise_1.1.0            tximport_1.8.0           bindrcpp_0.2.2           statmod_1.4.30          
Tags: scater, scran, single-cell
@martin-morgan-1513

Use Matrix::rowSums() to be sure to get the generic for dgCMatrix.
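
As a minimal sketch of this suggestion, the snippet below builds a small toy sparse matrix (a hypothetical stand-in for counts(sce), not the PBMC data) and calls the Matrix versions of the summary functions with an explicit namespace prefix. The object names `m`, `row_totals`, `col_totals` and `row_means` are made up for illustration.

```r
library(Matrix)

# Toy 3 x 3 sparse matrix standing in for counts(sce) (hypothetical data)
m <- sparseMatrix(
  i = c(1, 2, 3, 3),
  j = c(1, 2, 1, 3),
  x = c(5, 2, 1, 4),
  dims = c(3, 3)
)
class(m)  # a "dgCMatrix" from the Matrix package

# The explicit Matrix:: prefix makes sure the sparse-aware functions are used,
# regardless of which rowSums()/colSums() happens to be first on the search path
row_totals <- Matrix::rowSums(m)
col_totals <- Matrix::colSums(m)
row_means  <- Matrix::rowMeans(m)
```

The same prefix should apply to the original example, for instance `Matrix::colSums(counts(sce))[1:5]`.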

Good call. I ran into the same issue, and after trying `base::rowSums()` with no success, I was left clueless. I wonder if Bioconductor should be updated so as to better detect sparse matrices and call the appropriate function. The current error message is not very informative.
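
For anyone hitting the same wall, here is a rough sketch of why `base::rowSums()` cannot help. It assumes a small toy matrix (hypothetical, not the PBMC data): the base function only accepts ordinary arrays or data frames, and a dgCMatrix is an S4 object rather than a base array, so the call stops with an error while `Matrix::rowSums()` succeeds.

```r
library(Matrix)

# Toy sparse matrix (hypothetical data, not counts(sce))
m <- sparseMatrix(i = c(1, 2), j = c(1, 2), x = c(3, 7), dims = c(2, 2))

# A dgCMatrix is not a base R array
is.array(m)

# base::rowSums() therefore refuses the object; capture the error instead of stopping
base_result <- tryCatch(base::rowSums(m), error = function(e) e)
inherits(base_result, "error")

# The Matrix version handles the sparse class directly
sparse_result <- Matrix::rowSums(m)

# Converting to a dense matrix also works with base::rowSums(), but for a matrix
# as large as the raw 10X counts this may need far more memory than is available
dense_result <- base::rowSums(as.matrix(m))
```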

This is under active consideration:

https://github.com/Bioconductor/MatrixGenerics/issues/2

It is not straightforward as it requires coordination with Matrix and associated packages.
