Question: Normalizing vs RNA content in scRNAseq data
ankur.chakravarthy.10 (United Kingdom) wrote, 12 months ago:

Hi there, 

I'm dealing with two scenarios here: one uses Census from Monocle2 to convert scRNAseq data with no spike-ins, and the other uses spike-in normalisation with scran. My downstream workflow uses Seurat for clustering, population discovery and so on, so I want to know which values are best for downstream processing: the normalised values *not* corrected for total RNA content in the cell, or, particularly in the case of Census estimates, values that have first been divided by the total estimated mRNA content of the cell in question (as opposed to dividing read counts by the total number of reads, or using scran size factors estimated by deconvolution).


Aaron Lun (Cambridge, United Kingdom) wrote, 12 months ago:

Your question boils down to "should I preserve the effects of total RNA content or not?" If total RNA content is of interest to you, then you should use spike-in normalization, as this will not normalize out changes in content. If not, then you should use methods based on the assumption that the majority of genes are not differentially expressed (non-DE), such as the deconvolution method in scran.
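
To make the distinction concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the counts are invented, the helper names are hypothetical, and simple library-size scaling stands in for non-DE-based methods (scran's deconvolution is more robust and is not reproduced here). Both cells are sequenced to the same depth, but cell B contains twice as much endogenous RNA relative to the same amount of added spike-in.

```python
# Toy illustration only: invented counts, hypothetical helpers,
# not the scran implementation.

def size_factors(totals):
    """Scale per-cell totals so that the size factors average to 1."""
    mean_total = sum(totals) / len(totals)
    return [t / mean_total for t in totals]

def normalize(counts, factors):
    """Divide each cell's counts by that cell's size factor."""
    return [[c / f for c in cell] for cell, f in zip(counts, factors)]

# Rows are cells (A, B); both cells have 1400 reads in total.
endogenous = [[350, 350, 350], [400, 400, 400]]
spike_ins = [[175, 175], [100, 100]]

# Spike-in normalization: size factors come from the spike-in totals only.
sf_spike = size_factors([sum(cell) for cell in spike_ins])
norm_spike = normalize(endogenous, sf_spike)

# Library-size scaling on endogenous genes (stand-in for non-DE methods).
sf_lib = size_factors([sum(cell) for cell in endogenous])
norm_lib = normalize(endogenous, sf_lib)

# Ratio of total normalized expression, cell B over cell A.
ratio_spike = sum(norm_spike[1]) / sum(norm_spike[0])
ratio_lib = sum(norm_lib[1]) / sum(norm_lib[0])

print(f"spike-in factors keep the content difference: B/A = {ratio_spike:.2f}")
print(f"library-size factors remove it:               B/A = {ratio_lib:.2f}")
```

With spike-in size factors the twofold difference in RNA content survives normalization; with factors computed from the endogenous genes it is scaled away.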

The choice ultimately depends on your biological question and system, but as a rule of thumb: can you easily relate changes in total RNA content to biological function or causes? For example, T cells get bigger when they get activated, so there's a clear cause-and-effect relationship between total RNA content and the biology. In this case, I might want to preserve the total RNA content as its biological interpretation is obvious.

In contrast, if I have two distinct cell types (with no information about their lineage), the nature of any differences in total RNA content between those two types is less clear. Of greater interest is the identity of the genes that are upregulated or downregulated in each cell type, conditional on the changes in total RNA content. In other words, we want to remove changes in total RNA content here, to offset differences in the overall transcriptional activity of each cell type when determining if a gene is turned on or off. In such cases, I would be inclined to use non-DE methods.
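
As a rough sketch of what "conditional on the changes in total RNA content" means, the toy example below uses a median-of-ratios scaling as a simple stand-in for a non-DE-majority method (it is not scran's deconvolution). The gene names and counts are invented for illustration: type B has about twice the RNA content of type A, and one gene changes beyond that global shift.

```python
from statistics import median

# Hypothetical average counts per gene in two cell types.
# Most genes double in type B (global content shift); g5 goes up eightfold.
type_a = {"g1": 100, "g2": 50, "g3": 200, "g4": 80, "g5": 10}
type_b = {"g1": 200, "g2": 100, "g3": 400, "g4": 160, "g5": 80}

# Raw fold change of each gene, type B over type A.
ratios = {gene: type_b[gene] / type_a[gene] for gene in type_a}

# If most genes are not DE, the median ratio estimates the global shift.
scale = median(ratios.values())

# Fold changes after removing the global shift.
relative_fold_change = {gene: r / scale for gene, r in ratios.items()}

for gene, fc in relative_fold_change.items():
    print(f"{gene}: raw {ratios[gene]:.1f}x, after scaling {fc:.1f}x")
```

After scaling, the genes that merely followed the overall increase show no change, and only the gene that moved beyond the global shift stands out.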
