What is the most useful distance measure to use for cell clustering based on MAST deviance residuals?
@alexanderaivazidis-14116

Hi,

I have used MAST to obtain deviance residuals after accounting for the count detection rate in my cell samples, and I found that this improves clustering and subtype identification. Do you have an opinion on which distance measure between cells works best when I cluster on deviance residuals rather than on log2-transformed expression data?

I am asking because I do not get the best results with cosine distance, which is the distance measure I usually choose. The L1, L2 and L3 norms all seem to work better, but I am not sure which of them is best, and I have not yet found a theoretical justification for any of them that follows from the underlying data being deviance residuals rather than log2 expression data.
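
For concreteness, here is a minimal sketch of the four distance measures being compared. It assumes the residuals are arranged as a cells x genes matrix; the matrix below is simulated standard-normal noise standing in for real MAST output, and the function names are hypothetical helpers, not part of MAST.

```python
import numpy as np

def minkowski_distances(residuals, p):
    """Pairwise Lp distances between rows (cells) of a cells x genes matrix."""
    diff = np.abs(residuals[:, None, :] - residuals[None, :, :])
    return (diff ** p).sum(axis=-1) ** (1.0 / p)

def cosine_distances(residuals):
    """Pairwise cosine distances (1 - cosine similarity) between rows."""
    norms = np.linalg.norm(residuals, axis=1, keepdims=True)
    unit = residuals / norms
    return 1.0 - unit @ unit.T

rng = np.random.default_rng(0)
# Simulated stand-in for deviance residuals (assumption, not real data)
residuals = rng.standard_normal((20, 50))

distance_matrices = {
    "cosine": cosine_distances(residuals),
    "L1": minkowski_distances(residuals, 1),
    "L2": minkowski_distances(residuals, 2),
    "L3": minkowski_distances(residuals, 3),
}
```

Each matrix can then be passed to the same clustering routine, so that only the distance measure changes between runs.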

Thanks for your advice!

Alexander

MAST • 1.6k views
@andrew_mcdavid-11488
United States

The L2 norm makes some sense to me, since the residuals may be approximately N(0,1) distributed under a null distribution of exchangeable cells, given the covariates included in the model. Clustering in this case could be seen as searching for latent Gaussian structure among the residuals. The optimal metric probably depends on the type of structure present, so I doubt there's anything that can be said in general about this.
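
To illustrate the null argument, here is a rough simulation sketch. It assumes, more strongly than stated above, that residuals are independent N(0,1) across both cells and genes; in that case the per-gene difference between two cells is N(0,2), and the squared L2 distance over G genes is 2 times a chi-square variable with G degrees of freedom (mean 2G, variance 8G). The sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_cells, n_genes = 200, 100  # hypothetical sizes
# Simulated null residuals: independent N(0,1) (assumption)
residuals = rng.standard_normal((n_cells, n_genes))

# Squared Euclidean distances between all pairs of cells via the Gram matrix
sq_norms = (residuals ** 2).sum(axis=1)
sq_dist = sq_norms[:, None] + sq_norms[None, :] - 2.0 * residuals @ residuals.T
sq_dist = np.maximum(sq_dist, 0.0)  # guard against tiny negative round-off

# Keep each unordered pair once
upper = sq_dist[np.triu_indices(n_cells, k=1)]

null_mean = 2 * n_genes          # E[2 * chi2_G] = 2G
null_sd = np.sqrt(8 * n_genes)   # Var[2 * chi2_G] = 8G
print("observed mean:", upper.mean(), "null mean:", null_mean)
print("observed sd:", upper.std(), "null sd:", null_sd)
```

On real residuals, pairwise squared distances that depart clearly from this reference would hint at structure beyond the fitted covariates, though correlation between genes would also shift the distribution.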

@alexanderaivazidis-14116

Thanks! That makes sense. The major cell types are already identified in the data set, and I included them as covariates when fitting the model. I then look for subtypes within those major cell types, based on the deviance residuals. So the remaining deviance residuals do indeed look approximately N(0,1) distributed.
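
One possible way to make the "approximately N(0,1)" check explicit within each major cell type is sketched below. The residual matrix and cell-type labels are simulated placeholders, and `normality_summary` is a hypothetical helper; it only compares the mean, standard deviation and a few quantiles against the standard normal, which is a rough diagnostic rather than a formal test.

```python
import numpy as np
from statistics import NormalDist

def normality_summary(values, probs=(0.05, 0.25, 0.5, 0.75, 0.95)):
    """Mean, sd, and largest gap between empirical and N(0,1) quantiles."""
    values = np.asarray(values, dtype=float).ravel()
    empirical = np.quantile(values, probs)
    theoretical = np.array([NormalDist().inv_cdf(p) for p in probs])
    return {
        "mean": values.mean(),
        "sd": values.std(ddof=1),
        "max_quantile_gap": np.abs(empirical - theoretical).max(),
    }

rng = np.random.default_rng(2)
# Simulated stand-ins (assumptions): cells x genes residuals and type labels
residuals = rng.standard_normal((300, 40))
cell_type = rng.choice(["A", "B", "C"], size=300)

# Summarise the residuals separately for each major cell type
for label in np.unique(cell_type):
    summary = normality_summary(residuals[cell_type == label])
    print(label, summary)
```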

 
