Question: problems when working with SingleCellExperiment object in scater
chriad wrote (7 weeks ago):

Hi,

I have a SingleCellExperiment object:

class: SingleCellExperiment
dim: 27998 3265
metadata(0):
assays(1): counts
rownames(27998): ENSMUSG00000051951 ENSMUSG00000089699 ... ENSMUSG00000096730 ENSMUSG00000095742
rowData names(2): id symbol
colnames: NULL
colData names(2): dataset barcode
reducedDimNames(0):
spikeNames(0):

and I would like to use the scater package for quality control.

When I try to use e.g. the calculateCPM function according to this tutorial: https://bioconductor.org/packages/devel/bioc/vignettes/scater/inst/doc/vignette.html

I get the following error:

> exprs(sce10x) <- log2(
+   calculateCPM(sce10x, use.size.factors = FALSE) + 1)
Error in colSums(counts_mat) :
  'x' must be an array of at least two dimensions

Other errors also turn up, e.g. with runTSNE:

> runTSNE(object = sce10x, exprs_values = "counts")
Error in matrixStats::rowVars(exprs_mat) :
  Argument 'x' must be a matrix or a vector.

The count matrix is saved as a sparse matrix:

> class(counts(sce10x))
[1] "dgTMatrix"
attr(,"package")
[1] "Matrix"

My question now is: can the scater package not yet handle this data structure, or do I have outdated/incompatible packages installed? In the latter case, how can I find out which packages I have to upgrade or downgrade? I have installed some packages with devtools::install_github and some with useDevel (i.e. development versions of Bioconductor packages). I am not experienced with managing package conflicts and would be thankful if someone could clarify.

> sessionInfo()
R version 3.4.1 (2017-06-30)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux Workstation release 6.9 (Santiago)

Matrix products: default
BLAS/LAPACK: /usr/prog/OpenBLAS/0.2.8-gompi-1.5.14-NX-LAPACK-3.5.0/lib/libopenblas_nehalemp-r0.2.8.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] profvis_0.3.3               purrr_0.2.3                 stringr_1.2.0               biomaRt_2.33.4             
 [5] igraph_1.1.2                Ckmeans.1d.dp_4.2.1         topGO_2.29.0                SparseM_1.77               
 [9] GO.db_3.4.1                 AnnotationDbi_1.38.1        graph_1.55.0                statmod_1.4.30             
[13] edgeR_3.19.6                limma_3.32.7                cellrangerRkit_1.1.0        Rmisc_1.5                  
[17] plyr_1.8.4                  lattice_0.20-35             bit64_0.9-7                 bit_1.1-12                 
[21] RColorBrewer_1.1-2          Matrix_1.2-11               scater_1.5.12               ggplot2_2.2.1              
[25] SingleCellExperiment_0.99.4 SummarizedExperiment_1.7.9  DelayedArray_0.3.20         matrixStats_0.52.2         
[29] Biobase_2.36.2              GenomicRanges_1.29.14       GenomeInfoDb_1.13.4         IRanges_2.11.17            
[33] S4Vectors_0.15.8            BiocGenerics_0.22.0        

loaded via a namespace (and not attached):
 [1] viridis_0.4.0           viridisLite_0.2.0       shiny_1.0.5             assertthat_0.2.0        blob_1.1.0             
 [6] GenomeInfoDbData_0.99.0 vipor_0.4.5             yaml_2.1.14             progress_1.1.2          RSQLite_2.0            
[11] glue_1.1.1              digest_0.6.12           XVector_0.17.1          colorspace_1.3-2        htmltools_0.3.6        
[16] httpuv_1.3.5            devtools_1.13.3         XML_3.98-1.9            pkgconfig_2.0.1         pheatmap_1.0.8         
[21] zlibbioc_1.22.0         xtable_1.8-2            scales_0.5.0            Rtsne_0.13              tibble_1.3.4           
[26] withr_2.0.0             lazyeval_0.2.0          magrittr_1.5            mime_0.5                memoise_1.1.0          
[31] beeswarm_0.2.3          shinydashboard_0.6.1    tools_3.4.1             data.table_1.10.4       prettyunits_1.0.2      
[36] munsell_0.4.3           locfit_1.5-9.1          irlba_2.2.1             bindrcpp_0.2            compiler_3.4.1         
[41] rlang_0.1.2             rhdf5_2.21.4            grid_3.4.1              RCurl_1.95-4.8          tximport_1.5.0         
[46] htmlwidgets_0.9         rjson_0.2.15            bitops_1.0-6            gtable_0.2.0            DBI_0.7                
[51] reshape2_1.4.2          R6_2.2.2                gridExtra_2.3           dplyr_0.7.3             bindr_0.1              
[56] stringi_1.1.5           ggbeeswarm_0.6.0        Rcpp_0.12.13    

Aaron Lun (Cambridge, United Kingdom) wrote (7 weeks ago):

There's no problem with your installation. The issue is that the low-level methods in matrixStats do not support sparse inputs. I thought I had caught and replaced most of these calls when I refactored scater earlier in the year, but apparently not. I will purge the remainders soon. The colSums case is probably just because scater hasn't imported the colSums method from the Matrix package; this is easily fixed.
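
Until the fix lands, the CPM values can be computed directly with the sparse-aware `colSums` method from Matrix. The following is only a minimal sketch, assuming plain library-size scaling (no size factors, as with `use.size.factors = FALSE`) and that every cell has at least one count; `cpm_sparse` is a hypothetical helper, not a scater function, and the toy matrix stands in for `counts(sce10x)`.

```r
library(Matrix)
library(methods)

# Hypothetical helper (not part of scater): counts-per-million from a
# sparse count matrix, scaled by raw library sizes.
cpm_sparse <- function(counts_mat) {
  lib_sizes <- Matrix::colSums(counts_mat)  # sparse-aware column sums
  # Right-multiplying by a diagonal matrix rescales each column
  # without converting the counts to a dense matrix.
  counts_mat %*% Matrix::Diagonal(x = 1e6 / lib_sizes)
}

# Toy 3-gene x 2-cell matrix in triplet form, like the dgTMatrix in the question.
toy <- as(sparseMatrix(i = c(1, 3, 2), j = c(1, 1, 2), x = c(1, 3, 5),
                       dims = c(3, 2)),
          "TsparseMatrix")

cpm <- cpm_sparse(toy)

# log2(CPM + 1) written via log1p, which maps zero to zero.
log_cpm <- log1p(cpm) / log(2)
```

Writing the log transform as `log1p(cpm) / log(2)` rather than `log2(cpm + 1)` avoids adding 1 to every entry, which would otherwise fill in all the zeros.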

FYI, most functions prefer to work with dgCMatrix objects, due to the more structured format of the data. I am a bit bemused about why readMM returns a dgTMatrix when all the other documentation in the Matrix package indicates a preference towards dgCMatrix objects. I guess we should also modify read10XResults to coerce the 10X input data to the dgCMatrix format automatically.
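
In the meantime the coercion can be done by hand. A minimal sketch, using a toy matrix in place of the real data; the commented last line assumes the `sce10x` object from the question.

```r
library(Matrix)
library(methods)

# Toy matrix in triplet (dgTMatrix) form, the representation readMM() returns.
trip <- as(sparseMatrix(i = c(1, 3, 2), j = c(1, 1, 2), x = c(1, 3, 5),
                        dims = c(3, 2)),
           "TsparseMatrix")

# Coerce to compressed sparse column form; for a numeric general matrix
# this gives a dgCMatrix with the same values.
csc <- as(trip, "CsparseMatrix")

# Applied to the object from the question (assumes sce10x exists):
# counts(sce10x) <- as(counts(sce10x), "CsparseMatrix")
```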

davis (United Kingdom) wrote (7 weeks ago):

Thanks for the bug report! I've just committed (to Bioc devel) fixes for the `colSums` issues and adjusted `read10xResults` so that it coerces 10x data to a `dgCMatrix` automatically.

We're still working through all of the other possibilities and adding tests, so the `rowVars` issue you experienced with `runTSNE` should be resolved in the next few days too. 
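
For anyone who needs per-gene variances before then, they can be computed from sparse-aware row sums instead of `matrixStats::rowVars`. This is only a sketch of one possible workaround, not the fix going into scater; `row_vars_sparse` is a hypothetical helper, and the one-pass formula it uses can lose precision when means are large relative to the spread.

```r
library(Matrix)

# Hypothetical helper (not part of scater or matrixStats): sample variance
# of each row of a sparse matrix, without converting it to a dense matrix.
row_vars_sparse <- function(x) {
  n <- ncol(x)
  mu <- Matrix::rowMeans(x)
  # Squaring maps zero to zero, so only the stored entries are touched.
  sum_sq <- Matrix::rowSums(x^2)
  (sum_sq - n * mu^2) / (n - 1)
}

# Toy 3-gene x 3-cell count matrix.
toy <- sparseMatrix(i = c(1, 1, 2, 3), j = c(1, 2, 3, 1), x = c(2, 4, 6, 1),
                    dims = c(3, 3))

v <- row_vars_sparse(toy)
```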
