problems when working with SingleCellExperiment object in scater
chriad ▴ 10


I have a SingleCellExperiment object:

class: SingleCellExperiment
dim: 27998 3265
assays(1): counts
rownames(27998): ENSMUSG00000051951 ENSMUSG00000089699 ... ENSMUSG00000096730 ENSMUSG00000095742
rowData names(2): id symbol
colnames: NULL
colData names(2): dataset barcode

and I would like to use the scater package for quality control.

When I try to use e.g. the calculateCPM function according to this tutorial:

I get the following error:

> exprs(sce10x) <- log2(
+   calculateCPM(sce10x, use.size.factors = FALSE) + 1)
Error in colSums(counts_mat) :
  'x' must be an array of at least two dimensions

Other errors also turn up, e.g. with runTSNE:

> runTSNE(object = sce10x, exprs_values = "counts")
Error in matrixStats::rowVars(exprs_mat) :
  Argument 'x' must be a matrix or a vector.

The count matrix is saved as a sparse matrix:

> class(counts(sce10x))
[1] "dgTMatrix"
attr(,"package")
[1] "Matrix"

My question now is: can the scater package not yet handle this data structure, or do I have outdated or incompatible packages installed? In the latter case, how can I find out which packages I need to upgrade or downgrade? I installed some packages with devtools::install_github and some with useDevel (i.e. development versions of Bioconductor packages). I am not experienced with resolving package conflicts and would be thankful if someone could clarify.

> sessionInfo()
R version 3.4.1 (2017-06-30)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux Workstation release 6.9 (Santiago)

Matrix products: default
BLAS/LAPACK: /usr/prog/OpenBLAS/0.2.8-gompi-1.5.14-NX-LAPACK-3.5.0/lib/

 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] profvis_0.3.3               purrr_0.2.3                 stringr_1.2.0               biomaRt_2.33.4             
 [5] igraph_1.1.2                Ckmeans.1d.dp_4.2.1         topGO_2.29.0                SparseM_1.77               
 [9] GO.db_3.4.1                 AnnotationDbi_1.38.1        graph_1.55.0                statmod_1.4.30             
[13] edgeR_3.19.6                limma_3.32.7                cellrangerRkit_1.1.0        Rmisc_1.5                  
[17] plyr_1.8.4                  lattice_0.20-35             bit64_0.9-7                 bit_1.1-12                 
[21] RColorBrewer_1.1-2          Matrix_1.2-11               scater_1.5.12               ggplot2_2.2.1              
[25] SingleCellExperiment_0.99.4 SummarizedExperiment_1.7.9  DelayedArray_0.3.20         matrixStats_0.52.2         
[29] Biobase_2.36.2              GenomicRanges_1.29.14       GenomeInfoDb_1.13.4         IRanges_2.11.17            
[33] S4Vectors_0.15.8            BiocGenerics_0.22.0        

loaded via a namespace (and not attached):
 [1] viridis_0.4.0           viridisLite_0.2.0       shiny_1.0.5             assertthat_0.2.0        blob_1.1.0             
 [6] GenomeInfoDbData_0.99.0 vipor_0.4.5             yaml_2.1.14             progress_1.1.2          RSQLite_2.0            
[11] glue_1.1.1              digest_0.6.12           XVector_0.17.1          colorspace_1.3-2        htmltools_0.3.6        
[16] httpuv_1.3.5            devtools_1.13.3         XML_3.98-1.9            pkgconfig_2.0.1         pheatmap_1.0.8         
[21] zlibbioc_1.22.0         xtable_1.8-2            scales_0.5.0            Rtsne_0.13              tibble_1.3.4           
[26] withr_2.0.0             lazyeval_0.2.0          magrittr_1.5            mime_0.5                memoise_1.1.0          
[31] beeswarm_0.2.3          shinydashboard_0.6.1    tools_3.4.1             data.table_1.10.4       prettyunits_1.0.2      
[36] munsell_0.4.3           locfit_1.5-9.1          irlba_2.2.1             bindrcpp_0.2            compiler_3.4.1         
[41] rlang_0.1.2             rhdf5_2.21.4            grid_3.4.1              RCurl_1.95-4.8          tximport_1.5.0         
[46] htmlwidgets_0.9         rjson_0.2.15            bitops_1.0-6            gtable_0.2.0            DBI_0.7                
[51] reshape2_1.4.2          R6_2.2.2                gridExtra_2.3           dplyr_0.7.3             bindr_0.1              
[56] stringi_1.1.5           ggbeeswarm_0.6.0        Rcpp_0.12.13    
scater SingleCellExperiment • 2.5k views
Aaron Lun ★ 27k

There's no problem with your installation. The issue is that the low-level methods in matrixStats do not support sparse inputs. I thought I had caught and replaced most of these calls when I refactored scater earlier in the year, but apparently not. I will purge the remainders soon. The colSums case is probably just because scater hasn't imported the colSums method from the Matrix package; this is easily fixed.
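
Until the fix lands, one possible workaround is to compute the log-CPM values directly with the methods from the Matrix package, which do dispatch on sparse classes. The following is only a sketch on a toy matrix: `counts_mat`, `lib_sizes`, `cpm` and `log_cpm` are illustrative names, and this is not necessarily how calculateCPM is implemented internally.

```r
library(Matrix)

# Toy count matrix in triplet (dgTMatrix) form: 3 genes x 2 cells.
# It stands in for counts(sce10x); the values are made up.
counts_mat <- as(
  sparseMatrix(i = c(1, 2, 3, 1), j = c(1, 1, 2, 2),
               x = c(10, 30, 5, 15), dims = c(3, 2)),
  "TsparseMatrix"
)

# Matrix::colSums has methods for sparse classes, unlike base::colSums.
lib_sizes <- Matrix::colSums(counts_mat)

# Scale every column to counts per million. Multiplying by a diagonal
# matrix keeps the result sparse.
cpm <- counts_mat %*% Diagonal(x = 1e6 / lib_sizes)

# log2(x + 1) maps zero to zero, so transforming only the stored
# entries is sufficient and the matrix stays sparse.
log_cpm <- as(cpm, "CsparseMatrix")
log_cpm@x <- log2(log_cpm@x + 1)
```

For a dataset small enough to fit in memory, converting with as.matrix() before calling the scater functions is a simpler, if less economical, alternative.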

FYI, most functions prefer to work with dgCMatrix objects, due to the more structured format of the data. I am a bit bemused about why readMM returns a dgTMatrix when all the other documentation in the Matrix package indicates a preference towards dgCMatrix objects. I guess we should also modify read10XResults to coerce the 10X input data to the dgCMatrix format automatically.
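
The coercion itself is a one-liner. A minimal sketch on a toy matrix (`triplet` and `csc` are illustrative names; the commented-out last line assumes the `counts<-` setter accepts the coerced matrix):

```r
library(Matrix)

# Toy triplet-format matrix standing in for counts(sce10x).
triplet <- as(
  sparseMatrix(i = c(1, 3, 2), j = c(1, 1, 2),
               x = c(4, 7, 1), dims = c(3, 2)),
  "TsparseMatrix"
)
class(triplet)

# Coerce to the compressed column-oriented representation.
# Going through the virtual class "CsparseMatrix" yields a dgCMatrix
# for numeric data.
csc <- as(triplet, "CsparseMatrix")
class(csc)

# On the real object this would presumably be:
# counts(sce10x) <- as(counts(sce10x), "CsparseMatrix")
```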

davis ▴ 90

Thanks for the bug report! I've just committed (to Bioc devel) fixes for the `colSums` issues and adjusted `read10xResults` so that it coerces 10x data to a `dgCMatrix` automatically.

We're still working through all of the other possibilities and adding tests, so the `rowVars` issue you experienced with `runTSNE` should be resolved in the next few days too. 
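
In the meantime, row variances can be obtained from a sparse matrix without densifying it. This is a rough sketch, not the fix that went into scater: `sparse_row_vars` is a hypothetical helper, and the sum-of-squares formula it uses can lose precision when the variance is tiny relative to the mean.

```r
library(Matrix)

# Hypothetical helper: per-row sample variance of a sparse matrix,
# using var = (sum(x^2) - n * mean^2) / (n - 1).
sparse_row_vars <- function(x) {
  n <- ncol(x)
  mu <- Matrix::rowMeans(x)
  # Squaring a sparse matrix keeps it sparse, since 0^2 is 0.
  ss <- Matrix::rowSums(x^2)
  (ss - n * mu^2) / (n - 1)
}

# Toy example: 3 genes x 3 cells.
toy <- sparseMatrix(i = c(1, 1, 2, 3), j = c(1, 3, 2, 3),
                    x = c(2, 4, 6, 1), dims = c(3, 3))
sparse_row_vars(toy)
```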

