5 months ago by
Cambridge, United Kingdom
There are a number of aspects of your post that need addressing, so let's do it one at a time.
The first is the switch to SingleCellExperiment. This happened a while ago, motivated by the superiority of the SummarizedExperiment class as a general data container in terms of stability, flexibility and usability. From a user perspective, this simply involves changing the constructor call (to SingleCellExperiment()) and the various accessors (e.g., to colData()). This is not particularly hard, and it also allows you to interface with any SummarizedExperiment-compatible packages, e.g., iSEE, DESeq2.
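To make the constructor and accessor change concrete, here is a minimal sketch of building a SingleCellExperiment and reading per-cell metadata back out. The count matrix, the gene and cell names, and the "batch" column are all simulated placeholders for illustration, not anything from your data.

```r
library(SingleCellExperiment)

# Toy count matrix: 100 genes x 10 cells (simulated, for illustration only).
set.seed(1)
counts <- matrix(rpois(1000, lambda = 5), nrow = 100, ncol = 10,
                 dimnames = list(paste0("gene", 1:100), paste0("cell", 1:10)))

# Per-cell metadata; the "batch" column is a made-up annotation.
cell_info <- DataFrame(batch = rep(c("A", "B"), each = 5),
                       row.names = colnames(counts))

# Constructor call: assays hold the matrices, colData holds the cell annotation.
sce <- SingleCellExperiment(assays = list(counts = counts),
                            colData = cell_info)

# Accessors: colData() for per-cell metadata, counts() for the count matrix.
colData(sce)$batch
counts(sce)[1:3, 1:3]
```

Because the object is a SummarizedExperiment underneath, the same colData() and assay() accessors work in any package that accepts that class.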
As for TMM normalization: we've known for a while that this is a poor choice of normalization method for single-cell RNA-seq data with lots of zeroes; see https://doi.org/10.1186/s13059-016-0947-7 for a study of this. (Similar criticisms apply to DESeq's default normalization.) Thus, we no longer recommend using TMM normalization and have removed all functions that do so. I would suggest using alternatives like scran::computeSumFactors(); see the simpleSingleCell workflow for how it's done. That said, if you insist on using TMM, you can simply call edgeR::calcNormFactors() directly on your count matrix and multiply the result by the library sizes to get the "TMM size factors". The multiplication is important: calcNormFactors() alone only yields the normalization factors, which need to be scaled by the library sizes to obtain the size factors (yes, there's a difference between these two terms!).
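If you do want TMM size factors, the calculation described above might look roughly like this. The count matrix is simulated purely for illustration, and the variable names are my own; substitute your real count matrix.

```r
library(edgeR)

# Simulated count matrix: 200 genes x 10 cells (placeholder data).
set.seed(1)
counts <- matrix(rnbinom(2000, mu = 10, size = 1), nrow = 200, ncol = 10)

# Library size = total count per cell.
lib_sizes <- colSums(counts)

# Normalization factors only; these are NOT size factors yet.
norm_factors <- calcNormFactors(counts, method = "TMM")

# Size factors = normalization factors scaled by the library sizes.
tmm_size_factors <- norm_factors * lib_sizes
```

The normalization factors on their own only capture composition bias relative to the library size, which is why using them directly as size factors would ignore differences in sequencing depth between cells.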
The situation with normalizeExprs() is a bit more complicated because it tries to do three things at once: TMM normalization, log-transformation and batch correction. I didn't write this function, but I hated it. It doesn't have a single purpose; it's just cobbled together from three separate functions that might as well be called separately. Separate calls would require a bit more writing, but at least the user (and reader of the code) understands what is happening. A reader seeing a call to normalizeExprs() would find it hard to figure out what the function does. If we had to use a single function, it should instead be called:
... which we can all agree is a stupid name. I deprecated normalizeExprs() because it was better for users to be explicit about what they wanted to do and call the relevant functions directly.
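As a rough sketch of what "explicit" could look like, the three steps can be written as three separate, readable calls. This is an illustration under my own assumptions, not a drop-in replacement for normalizeExprs(): the data and batch labels are simulated, the size factors are simple library-size factors standing in for whatever method you prefer (e.g., scran::computeSumFactors()), and limma::removeBatchEffect() is just one possible choice for the batch correction step.

```r
library(limma)

# Simulated counts and a made-up two-level batch label (placeholder data).
set.seed(1)
counts <- matrix(rnbinom(2000, mu = 10, size = 1), nrow = 200, ncol = 10)
batch <- rep(c("A", "B"), each = 5)

# Step 1: size factors. Library-size factors are used here only as a
# stand-in; swap in your preferred normalization method.
size_factors <- colSums(counts) / mean(colSums(counts))

# Step 2: log-transformation of the normalized counts (pseudo-count of 1).
log_expr <- log2(sweep(counts, 2, size_factors, "/") + 1)

# Step 3: batch correction on the log-expression values.
corrected <- removeBatchEffect(log_expr, batch = batch)
```

Each step is now visible in the script, so anyone reading it can see which normalization, which transformation and which correction were applied.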
modified 5 months ago
5 months ago by
Aaron Lun • 23k