Question: problems when working with SingleCellExperiment object in scater
chriad wrote (3 months ago):

Hi,

I have a SingleCellExperiment object:

class: SingleCellExperiment
dim: 27998 3265
metadata(0):
assays(1): counts
rownames(27998): ENSMUSG00000051951 ENSMUSG00000089699 ... ENSMUSG00000096730 ENSMUSG00000095742
rowData names(2): id symbol
colnames: NULL
colData names(2): dataset barcode
reducedDimNames(0):
spikeNames(0):

and I would like to use the scater package for quality control.

When I try to use e.g. the calculateCPM function according to this tutorial: https://bioconductor.org/packages/devel/bioc/vignettes/scater/inst/doc/vignette.html

I get the following error:

> exprs(sce10x) <- log2(
+   calculateCPM(sce10x, use.size.factors = FALSE) + 1)
Error in colSums(counts_mat) :
  'x' must be an array of at least two dimensions

Other errors also turn up, e.g. with runTSNE:

> runTSNE(object = sce10x, exprs_values = "counts")
Error in matrixStats::rowVars(exprs_mat) :
  Argument 'x' must be a matrix or a vector.

The count matrix is saved as a sparse matrix:

> class(counts(sce10x))
[1] "dgTMatrix"
attr(,"package")
[1] "Matrix"

My question now is: can the scater package not yet handle this data structure, or do I have outdated or incompatible packages installed? In the latter case, how can I find out which packages I have to upgrade or downgrade? I installed some packages with devtools::install_github and some with useDevel (i.e. development versions of Bioconductor packages). I am not experienced with resolving package conflicts and would be thankful if someone could clarify.

> sessionInfo()
R version 3.4.1 (2017-06-30)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux Workstation release 6.9 (Santiago)

Matrix products: default
BLAS/LAPACK: /usr/prog/OpenBLAS/0.2.8-gompi-1.5.14-NX-LAPACK-3.5.0/lib/libopenblas_nehalemp-r0.2.8.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] profvis_0.3.3               purrr_0.2.3                 stringr_1.2.0               biomaRt_2.33.4             
 [5] igraph_1.1.2                Ckmeans.1d.dp_4.2.1         topGO_2.29.0                SparseM_1.77               
 [9] GO.db_3.4.1                 AnnotationDbi_1.38.1        graph_1.55.0                statmod_1.4.30             
[13] edgeR_3.19.6                limma_3.32.7                cellrangerRkit_1.1.0        Rmisc_1.5                  
[17] plyr_1.8.4                  lattice_0.20-35             bit64_0.9-7                 bit_1.1-12                 
[21] RColorBrewer_1.1-2          Matrix_1.2-11               scater_1.5.12               ggplot2_2.2.1              
[25] SingleCellExperiment_0.99.4 SummarizedExperiment_1.7.9  DelayedArray_0.3.20         matrixStats_0.52.2         
[29] Biobase_2.36.2              GenomicRanges_1.29.14       GenomeInfoDb_1.13.4         IRanges_2.11.17            
[33] S4Vectors_0.15.8            BiocGenerics_0.22.0        

loaded via a namespace (and not attached):
 [1] viridis_0.4.0           viridisLite_0.2.0       shiny_1.0.5             assertthat_0.2.0        blob_1.1.0             
 [6] GenomeInfoDbData_0.99.0 vipor_0.4.5             yaml_2.1.14             progress_1.1.2          RSQLite_2.0            
[11] glue_1.1.1              digest_0.6.12           XVector_0.17.1          colorspace_1.3-2        htmltools_0.3.6        
[16] httpuv_1.3.5            devtools_1.13.3         XML_3.98-1.9            pkgconfig_2.0.1         pheatmap_1.0.8         
[21] zlibbioc_1.22.0         xtable_1.8-2            scales_0.5.0            Rtsne_0.13              tibble_1.3.4           
[26] withr_2.0.0             lazyeval_0.2.0          magrittr_1.5            mime_0.5                memoise_1.1.0          
[31] beeswarm_0.2.3          shinydashboard_0.6.1    tools_3.4.1             data.table_1.10.4       prettyunits_1.0.2      
[36] munsell_0.4.3           locfit_1.5-9.1          irlba_2.2.1             bindrcpp_0.2            compiler_3.4.1         
[41] rlang_0.1.2             rhdf5_2.21.4            grid_3.4.1              RCurl_1.95-4.8          tximport_1.5.0         
[46] htmlwidgets_0.9         rjson_0.2.15            bitops_1.0-6            gtable_0.2.0            DBI_0.7                
[51] reshape2_1.4.2          R6_2.2.2                gridExtra_2.3           dplyr_0.7.3             bindr_0.1              
[56] stringi_1.1.5           ggbeeswarm_0.6.0        Rcpp_0.12.13    

Aaron Lun (Cambridge, United Kingdom) wrote (3 months ago):

There's no problem with your installation. The issue is that the low-level methods in matrixStats do not support sparse inputs. I thought I had caught and replaced most of these calls when I refactored scater earlier in the year, but apparently not. I will purge the remainders soon. The colSums case is probably just because scater hasn't imported the colSums method from the Matrix package; this is easily fixed.
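
To illustrate the `colSums` point, here is a minimal sketch on a toy matrix. It assumes only that the Matrix package is installed; the names `counts_mat`, `lib_sizes` and `cpm_mat` are made up for the example, and the CPM arithmetic is a hand-rolled stand-in, not scater's own `calculateCPM` code.

```r
library(Matrix)

# Toy 3-gene x 2-cell count matrix stored as a sparse S4 object (made-up data).
counts_mat <- Matrix(c(5, 0, 2, 0, 7, 0), nrow = 3, sparse = TRUE)

# base::colSums only accepts ordinary arrays, so it rejects the sparse matrix.
# The error message is captured as a string instead of stopping the script.
base_result <- tryCatch(
  base::colSums(counts_mat),
  error = function(e) conditionMessage(e)
)

# Matrix::colSums has methods for sparse classes and returns a numeric vector.
lib_sizes <- Matrix::colSums(counts_mat)

# Counts per million: divide each column (cell) by its library size.
# t(x) / v recycles v down the columns of t(x), i.e. one value per cell.
cpm_mat <- t(t(counts_mat) / lib_sizes) * 1e6
```

Calling `Matrix::colSums` explicitly (or importing it in a package namespace) is what avoids falling through to the base function.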

FYI, most functions prefer to work with dgCMatrix objects, due to the more structured format of the data. I am a bit bemused about why readMM returns a dgTMatrix when all the other documentation in the Matrix package indicates a preference towards dgCMatrix objects. I guess we should also modify read10XResults to coerce the 10X input data to the dgCMatrix format automatically.
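
Until that coercion happens automatically, it can be done by hand. The sketch below uses a toy matrix and assumes only the Matrix package; `dense`, `triplet` and `compressed` are illustrative names.

```r
library(Matrix)

# Small dense matrix used to build a triplet-format sparse matrix (made-up data).
dense <- matrix(c(5, 0, 2, 0, 7, 0), nrow = 3)

# Triplet representation, the same class that readMM returned in the question.
triplet <- as(Matrix(dense, sparse = TRUE), "TsparseMatrix")

# Coerce to the compressed column-oriented representation.
compressed <- as(triplet, "CsparseMatrix")

class(triplet)
class(compressed)
```

For the object in the question, the equivalent would presumably be `counts(sce10x) <- as(counts(sce10x), "CsparseMatrix")`, although I have not run that against this exact combination of package versions.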

davis (United Kingdom) wrote (3 months ago):

Thanks for the bug report! I've just committed (to Bioc devel) fixes for the `colSums` issues and adjusted `read10xResults` so that it coerces 10x data to a `dgCMatrix` automatically.

We're still working through all of the other possibilities and adding tests, so the `rowVars` issue you experienced with `runTSNE` should be resolved in the next few days too. 
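
In the meantime, row variances can be computed without leaving the sparse representation. This is only a rough sketch of one possible workaround, not what scater does internally: `sparse_row_vars` is a hypothetical helper written for this example, and it assumes only the Matrix package.

```r
library(Matrix)

# Hypothetical helper (not part of scater or matrixStats): per-row sample
# variance using only operations that Matrix supports on sparse input.
sparse_row_vars <- function(x) {
  n <- ncol(x)
  row_means <- Matrix::rowMeans(x)
  row_means_sq <- Matrix::rowMeans(x^2)  # squaring keeps zeros as zeros
  # var = (E[x^2] - E[x]^2) * n / (n - 1), i.e. the unbiased estimator
  (row_means_sq - row_means^2) * n / (n - 1)
}

# Toy 3-gene x 4-cell sparse matrix (made-up data).
toy <- Matrix(c(5, 0, 2, 0, 7, 0, 1, 0, 3, 0, 0, 4), nrow = 3, sparse = TRUE)

row_vars <- sparse_row_vars(toy)

# Reference values from the dense matrix, for comparison.
dense_vars <- apply(as.matrix(toy), 1, var)
```

The one-pass formula can lose precision when the mean is large relative to the spread, so treat it as a stopgap. For a modest number of genes, converting that subset to a dense matrix with `as.matrix` before calling the affected function is the simpler option.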
