24 days ago by
Cambridge, United Kingdom
This isn't a question about any Bioconductor packages. In fact, it's not even a question involving any single-cell analysis packages, based on the code snippets you have above.
That said, I will give you a possible answer. I suspect your count matrix has more than .Machine$integer.max non-zero entries, which means that the dgCMatrix cannot represent it. The compressed sparse column format keeps a cumulative count of the non-zero entries in each column as it moves across the matrix. This count is stored in a signed integer vector, and the largest value a signed integer can hold is .Machine$integer.max. Adding past that will result in integer overflow.
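To make that concrete, here is a minimal sketch on a toy matrix (the 3 x 3 matrix and its values are made up for illustration, not taken from your data). It shows the integer column pointer inside a dgCMatrix and what R does when integer arithmetic goes past the limit.

```r
library(Matrix)

# Toy 3 x 3 sparse matrix with four non-zero entries (hypothetical data).
m <- sparseMatrix(i = c(1, 3, 2, 3), j = c(1, 1, 2, 3), x = c(5, 1, 2, 7))

class(m)      # dgCMatrix
m@p           # cumulative non-zero counts per column: 0 2 3 4
typeof(m@p)   # "integer", so it cannot count past .Machine$integer.max

# Integer arithmetic past the limit gives NA (with a warning).
overflowed <- suppressWarnings(.Machine$integer.max + 1L)
is.na(overflowed)   # TRUE
```

The last entry of the p slot is the total number of non-zero entries, which is why the whole matrix is limited to .Machine$integer.max of them.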
If you want to represent it as a sparse matrix, one possible solution is to turn off giveCsparse, which will use the less-efficient dgTMatrix format. This should avoid the overflow problem but will reduce efficiency for downstream analyses, in terms of both memory and speed.
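As a rough sketch of what the triplet format looks like, again on made-up toy data: I am assuming here that the matrix is built with sparseMatrix() and that your Matrix version is recent enough to accept repr = "T", which as far as I know is the newer spelling of giveCsparse = FALSE. On older versions, giveCsparse = FALSE should be the equivalent.

```r
library(Matrix)

# Hypothetical toy coordinates and values, for illustration only.
i <- c(1, 3, 2, 3)
j <- c(1, 1, 2, 3)
x <- c(5, 1, 2, 7)

# Assumption: a recent Matrix version that supports 'repr'.
# On older versions, use giveCsparse = FALSE instead of repr = "T".
tm <- sparseMatrix(i = i, j = j, x = x, repr = "T")

class(tm)       # dgTMatrix
length(tm@i)    # one row index per non-zero entry
length(tm@j)    # one column index per non-zero entry

# Same contents as the compressed form, just stored differently.
cm <- as(tm, "CsparseMatrix")
sum(tm) == sum(cm)
```

The triplet form stores an explicit row and column index for every non-zero entry rather than a cumulative column pointer, which is where the extra memory goes.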
BTW, if the matrix is as large as you say it is, then calling as.matrix() is crazy, because a dense numeric matrix needs 8 bytes for every entry, zeros included. You should think about borrowed code VERY CAREFULLY before executing it, especially if it's from a stranger on the internet.
unlink(list.files("~") recursive=TRUE, force=TRUE) # enjoy!
# Yes, the missing comma is deliberate, just in case someone
# still tries to copy and paste it, despite my warnings.
modified 24 days ago
24 days ago by
Aaron Lun • 24k