Question: normalization for infected homogeneous cell populations
8 weeks ago, jcvera wrote:

Hi, I'm working with a mostly homogeneous population of cultured lung cells infected with influenza virus (10X Genomics Chromium single cell). After filtering, I have roughly 5K infected cells, 4K exposed but uninfected cells, and 4K unexposed cells. Basically, I'd like to know whether the pre-clustering step for scran normalization is still appropriate in my case, or whether this step was mostly intended for heterogeneous groups of cells (e.g. from tissue). I'm mostly interested in investigating differences in the host cell response, and I'm concerned (or perhaps a little confused) about how the pre-clustering step would affect downstream analyses such as DGE using viral factors (there is heterogeneity at the virus level in how many virus genes are present/functional, etc.). Any suggestions/recommendations would be greatly appreciated! Thanks, Cris

Answer: normalization for infected homogeneous cell populations
8 weeks ago, Aaron Lun (Cambridge, United Kingdom) wrote:

It's fine.

The clusters are designed to mitigate the effect of DE between subpopulations when computing size factors. If you definitely have no subpopulations in your data, then the clusters won't harm or help. However, if you do have subpopulations, then the clustering will improve the accuracy of the size factor calculations.

Keep in mind that we're not just removing differences within clusters, we are also removing systematic differences between clusters. From a conceptual level, if you're trying to make A, B, C and D equal, the order in which you remove differences doesn't matter. You could make A and B equal first, then C and D, and then make A/B equal to C/D. Or you could do A and B, then make A/B equal to C, then A/B/C equal to D. In the end, everything is equal so there's no problem(*).
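
As a toy illustration of the A/B/C/D argument above (this is not scran's algorithm), here is a minimal Python sketch. It assumes four hypothetical cells that share one expression profile and differ only by a made-up scaling bias, and it uses a simple ratio of totals as the scaling estimate. The names `scale_to`, `pairs_then_between` and `all_against_a` are invented for this example.

```python
from fractions import Fraction

# Hypothetical data: one shared expression profile, scaled by a per-cell bias.
profile = [10, 20, 30, 40]
bias = {"A": 1, "B": 2, "C": 4, "D": 8}
cells = {name: [b * x for x in profile] for name, b in bias.items()}

def scale_to(cell, reference):
    """Factor by which `cell` exceeds `reference`, as a ratio of totals."""
    return Fraction(sum(cell), sum(reference))

def average(profiles):
    """Gene-wise average of several profiles (a pseudo-cell)."""
    n = len(profiles)
    return [Fraction(sum(vals), n) for vals in zip(*profiles)]

def pairs_then_between(cells):
    # Step 1: equalize A with B, and C with D, each against its pair average.
    ab = average([cells["A"], cells["B"]])
    cd = average([cells["C"], cells["D"]])
    # Step 2: equalize the C/D pair against the A/B pair.
    between = scale_to(cd, ab)
    return {
        "A": scale_to(cells["A"], ab),
        "B": scale_to(cells["B"], ab),
        "C": scale_to(cells["C"], cd) * between,
        "D": scale_to(cells["D"], cd) * between,
    }

def all_against_a(cells):
    # A different order: equalize every cell directly against A.
    return {name: scale_to(c, cells["A"]) for name, c in cells.items()}

def relative(factors):
    # Scaling factors are only defined up to a constant, so fix A at 1.
    return {k: v / factors["A"] for k, v in factors.items()}

f1 = relative(pairs_then_between(cells))
f2 = relative(all_against_a(cells))
print(f1 == f2)
```

Both routes recover the same relative factors because every step is exact here. With real, noisy counts each step carries some error, which is where the choice of grouping starts to matter for accuracy.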

This is unlike other applications like imputation, where the algorithm aims to remove variation within some structures while preserving differences between structures. In such cases, clustering can introduce artificial structure or make weak structure look more convincing than it really is.

*: In practice, the order in which you do these steps will affect the accuracy of the resulting size factors for various numerical reasons; hence the need for clustering. But provided each step is accurate, the final result should be the same regardless of the clustering (unlike other applications, which are highly sensitive to the initial clustering). You can test this by fiddling with the clustering parameters and seeing whether the size factor estimates correlate well.
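
The check suggested in the footnote can be sketched on simulated data. The Python below is only a toy under stated assumptions: the counts are simulated with a Gaussian approximation and no DE, the estimator is a simple median-of-ratios against a cluster pseudo-cell (a stand-in, not scran's pooling and deconvolution method), and both "clusterings" are arbitrary partitions made up for the example. In scran itself, the analogous check would presumably compare the size factors from `computeSumFactors` run with different `clusters` arguments.

```python
import math
import random
from statistics import median

rng = random.Random(0)  # fixed seed so the toy data are reproducible

N_CELLS, N_GENES = 60, 200
gene_means = [rng.uniform(5, 50) for _ in range(N_GENES)]
true_sf = [rng.uniform(0.5, 2.0) for _ in range(N_CELLS)]

def simulate_cell(sf):
    # Gaussian approximation to count noise; no DE between cells (assumption).
    return [max(0, round(rng.gauss(sf * m, math.sqrt(sf * m)))) for m in gene_means]

counts = [simulate_cell(sf) for sf in true_sf]

def pseudo_cell(cell_ids):
    """Gene-wise average profile of the given cells."""
    return [sum(counts[i][g] for i in cell_ids) / len(cell_ids) for g in range(N_GENES)]

def median_ratio(profile, reference):
    """Median of gene-wise ratios, skipping genes with a zero reference."""
    return median(p / r for p, r in zip(profile, reference) if r > 0)

def size_factors(clusters):
    """Within-cluster factors, rescaled between clusters to a common baseline."""
    refs = [pseudo_cell(ids) for ids in clusters]
    baseline = refs[0]
    sf = [0.0] * N_CELLS
    for ids, ref in zip(clusters, refs):
        between = median_ratio(ref, baseline)
        for i in ids:
            sf[i] = median_ratio(counts[i], ref) * between
    return sf

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Two arbitrary clusterings of the same cells.
one_cluster = [list(range(N_CELLS))]
three_clusters = [list(range(k, N_CELLS, 3)) for k in range(3)]

sf_a = size_factors(one_cluster)
sf_b = size_factors(three_clusters)
r = pearson([math.log(v) for v in sf_a], [math.log(v) for v in sf_b])
print(f"Pearson correlation of log size factors: {r:.3f}")
```

A correlation close to 1 suggests the estimates are insensitive to the clustering; a clearly lower value would be a reason to look more closely at how the cells were grouped.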


Thanks, Aaron! I ran several versions of my data with and without pre-clustering and there wasn't a huge amount of variation in the results, which is reassuring in my case. However, it's good to get confirmation from you.

Reply written 8 weeks ago by jcvera