Question: Interpretation of cngeneson when doing differential expression analysis in MAST

Hi all,

I am running MAST for Single-Cell differential gene expression analysis. I followed the vignette on https://github.com/RGLab/MAST/blob/master/vignettes/MAITAnalysis.Rmd .

The code I'm using is the following:

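library(MAST)

# Build the SingleCellAssay from the expression matrix plus cell and gene metadata
sca <- FromMatrix(as.matrix(df), cData = cData, fData = fData)
# Number of genes detected (expression > 0) in each cell, then centered and scaled
cdr2 <- colSums(assay(sca) > 0)
colData(sca)$cngeneson <- scale(cdr2)
# used type2 as reference level
cond <- factor(colData(sca)$type)
cond <- relevel(cond, 'type2')
colData(sca)$type <- cond
# Fit the hurdle model, then run a likelihood ratio test on the type1 coefficient
zlmCond <- zlm(~ type + cngeneson, sca, parallel = TRUE)
summary <- summary(zlmCond, doLRT = 'type1')
print(summary, n = 4)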

The result I got is:

Fitted zlm with top 4 genes per contrast:
( log fold change Z-score )
 primerid type1    cngeneson
 Gene1      63.1*    0.3   
 Gene2      70.7*    5.8   
 Gene3     -23.9    87.8*  
 Gene4     -30.1    87.2*  
 Gene5     -17.1    96.8*  
 Gene6     -20.9    93.6*  
 Gene7      64.8*   10.5   
 Gene8      65.0*    9.2   

If I understood correctly, type1 cells are differentially upregulated in Gene{1,2,7,8}. The * should represent significance (p < 0.01). However, how to interpret the cngeneson differentially expressed genes? If I recall correctly, cngeneson is the number of genes detected in each cell. But I am not able to understand which additional insight can provide this contrast.

Thanks,

Francesco

modified 6 months ago by Andrew_McDavid • written 6 months ago by francesco.brundu@gmail.com
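
For readers unfamiliar with the covariate in the question: cngeneson is built there as the centered and scaled number of genes detected per cell. The base-R sketch below reproduces that calculation on a small made-up matrix; the values and the names `expr` and `n_detected` are illustrative assumptions, not data from the thread.

```r
# Toy expression matrix (made-up values): rows are genes, columns are cells.
expr <- matrix(
  c(0, 2, 5,
    1, 0, 3,
    0, 0, 4,
    2, 1, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("Gene", 1:4), paste0("Cell", 1:3))
)

# Number of genes with non-zero expression in each cell.
n_detected <- colSums(expr > 0)

# Centered and scaled version, analogous to cngeneson in the question.
cngeneson <- as.numeric(scale(n_detected))
names(cngeneson) <- colnames(expr)
```

After scaling, the covariate has mean 0 and standard deviation 1 across cells, so its coefficient describes the change per standard deviation of detected genes.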
Andrew_McDavid wrote:

> If I understood correctly, type1 cells are differentially upregulated in Gene{1,2,7,8}. The * should represent significance (p < 0.01).

For both the type and cngeneson covariates, as the message at the top of the output states, the top 4 genes by Z-score are shown. The * indicates the contrast for which the gene is in the top 4 list. All are extremely significant, with p-values far below 0.01.

> However, how to interpret the cngeneson differentially expressed genes? If I recall correctly, cngeneson is the number of genes detected in each cell. But I am not able to understand which additional insight can provide this contrast.

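This generally isn't of direct interest: cngeneson is included to adjust for the cellular detection rate, so it acts as a nuisance covariate rather than a contrast you would usually report.
In any case, the print method is mainly there so you can check that you coded your covariates correctly and get a quick look at the signal in your data.
Use `summary$datatable` for any downstream analysis.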

written 6 months ago by Andrew_McDavid
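
As an illustration of working from `summary$datatable`, here is a minimal base-R sketch that pairs the hurdle p-values with the logFC estimates for one contrast. It assumes the table has the columns primerid, component, contrast, Pr(>Chisq), coef, ci.hi and ci.lo, with component "H" holding the combined hurdle test and "logFC" holding the effect size. The helper name `hurdle_table` and the toy values are hypothetical; on real output you would pass `summary$datatable` instead of `toy`.

```r
# Combine hurdle p-values and logFC estimates for one contrast.
# `dt` is assumed to look like summary$datatable (see column list above).
hurdle_table <- function(dt, contrast_name) {
  dt <- as.data.frame(dt)  # also accepts a data.table
  keep <- dt$contrast == contrast_name
  h <- dt[keep & dt$component == "H", ]
  lfc <- dt[keep & dt$component == "logFC",
            c("primerid", "coef", "ci.hi", "ci.lo")]
  pvals <- data.frame(primerid = h$primerid, p_value = h[["Pr(>Chisq)"]])
  out <- merge(pvals, lfc, by = "primerid")
  # Benjamini-Hochberg adjustment across the genes in this contrast.
  out$fdr <- p.adjust(out$p_value, method = "fdr")
  out[order(out$fdr), ]
}

# Made-up table in the assumed layout, for demonstration only.
toy <- data.frame(
  primerid     = rep(c("GeneA", "GeneB", "GeneC"), times = 2),
  component    = rep(c("H", "logFC"), each = 3),
  contrast     = "type1",
  "Pr(>Chisq)" = c(1e-8, 0.04, 0.5, NA, NA, NA),
  coef         = c(NA, NA, NA, 1.2, -0.3, 0.05),
  ci.hi        = c(NA, NA, NA, 1.5, 0.1, 0.4),
  ci.lo        = c(NA, NA, NA, 0.9, -0.7, -0.3),
  check.names  = FALSE
)

res <- hurdle_table(toy, "type1")
print(res)
```

The result has one row per gene with its p-value, adjusted p-value, logFC and confidence interval, which is usually a more convenient starting point than the printed Z-score table.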

Thanks Andrew. If I only want the first 10 genes per contrast, can I safely take them directly from the output of print? Or is there any caveat?

written 6 months ago by francesco.brundu@gmail.com
Well, if all you care about is the top 10 genes, sure. But you will probably want to know p-values and effect sizes too, which are all in the `datatable`. You can get the top 10 genes per contrast by `order`ing it by contrast and then by p-value.
modified 6 months ago • written 6 months ago by Andrew_McDavid
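
The ordering suggested in the reply above could be sketched in base R as follows. This assumes the same column layout as `summary$datatable` (primerid, component, contrast, Pr(>Chisq)) and uses the hurdle rows (component "H"). The function name `top_genes_by_contrast` and the toy p-values are made up for illustration. Rows without a p-value are dropped, so only contrasts that were actually tested will appear.

```r
# Top-n genes per contrast, ordered by contrast and then by p-value.
top_genes_by_contrast <- function(dt, n = 10) {
  dt <- as.data.frame(dt)  # also accepts a data.table
  h <- dt[dt$component == "H" & !is.na(dt[["Pr(>Chisq)"]]), ]
  h <- h[order(h$contrast, h[["Pr(>Chisq)"]]), ]
  # Split the ordered gene names by contrast and keep the first n of each.
  lapply(split(as.character(h$primerid), h$contrast, drop = TRUE),
         head, n = n)
}

# Made-up hurdle rows for two contrasts, for demonstration only.
toy <- data.frame(
  primerid     = rep(c("G1", "G2", "G3"), times = 2),
  component    = "H",
  contrast     = rep(c("type1", "cngeneson"), each = 3),
  "Pr(>Chisq)" = c(0.03, 1e-6, 0.2, 1e-9, 0.5, 1e-4),
  check.names  = FALSE
)

top <- top_genes_by_contrast(toy, n = 2)
print(top)
```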

Thanks. I was asking because ordering by 'coef' of logFC gives me a set of DE genes with minimal overlap with the genes printed by summary (considering 10 DE genes). It is surely because one ordering is done using z score (print) and the other using effect size (logFC coef). I didn't fully understand which one I should use (I'd go for the coef but it is unclear to me why z score is displayed instead in the summary), can you explain this?

modified 6 months ago • written 6 months ago by francesco.brundu@gmail.com
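
To see why the two orderings in the last comment can disagree, consider a toy example in which the Z-score is taken to be the estimate divided by its standard error. That relationship is an assumption made here for illustration; the thread does not say how the printed Z-score is computed, and all numbers below are made up.

```r
# Made-up logFC estimates and standard errors for five genes.
toy <- data.frame(
  primerid = paste0("Gene", LETTERS[1:5]),
  coef     = c(3.0, 2.5, 0.8, 0.6, 0.1),
  se       = c(2.0, 1.5, 0.05, 0.04, 0.5)
)

# Assumed relationship: Z-score = estimate / standard error.
toy$z <- toy$coef / toy$se

# Top 2 genes by absolute effect size and by absolute Z-score.
top_by_coef <- toy$primerid[order(-abs(toy$coef))][1:2]
top_by_z    <- toy$primerid[order(-abs(toy$z))][1:2]

print(top_by_coef)
print(top_by_z)
print(intersect(top_by_coef, top_by_z))
```

Genes with a large but noisy estimate rank highly by coef, while genes with a modest but precisely estimated effect rank highly by Z-score, so the two top lists need not overlap.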