Question: edgeR: Strange volcano plot
cronanz0 wrote:

Dear all,

I have a plate-based scRNA-seq dataset, and I used edgeR to identify differentially expressed genes between clusters. The input was expected counts from RSEM, and an example of the workflow is as follows:

all_edger <- DGEList(counts=all_expc,group=groups)
all_edger <- calcNormFactors(all_edger,method="TMMwzp")
all_design <- model.matrix(~0+groups)
all_edger <- estimateDisp(all_edger,design=all_design)
all_fit <- glmFit(all_edger,all_design)
all_lrt <- glmLRT(all_fit,contrast=c(-1,0,0,0,0,1,0,0))

The resulting volcano plot from the above comparison has a pattern that I'm not familiar with. For certain genes there appears to be a tight relationship between logFC and -log10(FDR), which produces a line of genes on each side of the plot. I don't know how to interpret this pattern. Is this to be expected? Am I doing something out of the ordinary that causes it? Thank you very much.

Volcano plot: https://ibb.co/JcMnK7r

edger dgea scrna-seq • 204 views
modified 11 weeks ago by Aaron Lun • written 11 weeks ago by cronanz

Generally one plots the negative log10 of the nominal p-value
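
A minimal sketch of that suggestion, in base R only. The helper `volcano_coords` and the toy table are hypothetical; the column names `logFC`, `PValue` and `FDR` are assumed to match the table returned by `topTags(all_lrt, n = Inf)$table` in the question's workflow. The nominal p-value goes on the y-axis and the FDR is used only to colour the points.

```r
# Hypothetical helper: takes a table with edgeR-style columns
# logFC, PValue and FDR and returns volcano plot coordinates.
volcano_coords <- function(tab, fdr_cutoff = 0.05) {
  data.frame(
    logFC       = tab$logFC,
    neglog10_p  = -log10(tab$PValue),   # nominal p-value on the y-axis
    significant = tab$FDR <= fdr_cutoff # FDR used only for highlighting
  )
}

# Toy table standing in for the real results (assumption, not real data).
# With the question's objects one would instead use something like:
#   tab <- topTags(all_lrt, n = Inf)$table
set.seed(1)
toy <- data.frame(logFC = rnorm(1000), PValue = runif(1000))
toy$FDR <- p.adjust(toy$PValue, method = "BH")

vc <- volcano_coords(toy)
plot(vc$logFC, vc$neglog10_p,
     col = ifelse(vc$significant, "red", "grey50"),
     pch = 16, cex = 0.5,
     xlab = "logFC", ylab = "-log10(p-value)")
```

BH-adjusted values are forced to be monotone in the ranked p-values, so different p-values can share one FDR value, which is one reason a -log10(FDR) axis can look banded.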

written 11 weeks ago by Kevin Blighe

Here are a couple of posts explaining why -log10(p) is better than -log10(FDR) for the volcano plot (as noted by Kevin Blighe):

modified 10 weeks ago • written 10 weeks ago by Gordon Smyth
Answer: edgeR: Strange volcano plot
Cambridge, United Kingdom
Aaron Lun24k wrote:

It's hard to say for sure, but I would guess that you have a few genes that are all-zero in one group and with some non-zero counts in the other group. If you hold the dispersion constant (e.g., if all of the genes have very similar abundances), the p-value will be a monotonic function of the log-fold change, resulting in the lines that you've observed. It may even be that the non-zero counts in each group come from the same cells - or even just a single cell - which contributes to the clear definition of the pattern on the volcano plot.
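
To illustrate the argument, here is a rough sketch in base R. It is not edgeR's implementation: it is a hand-rolled negative binomial likelihood ratio test with a fixed dispersion, equal library sizes, and a pseudo-count of 0.125 as a stand-in for log-fold change shrinkage (all of these are assumptions made for the illustration). Each simulated gene is all-zero in group A and has a single non-zero cell in group B.

```r
n_cells <- 20   # cells per group (assumed)
size    <- 1    # NB size = 1 / dispersion, held constant across genes
k       <- 1:50 # count in the single non-zero cell of group B, one gene per value

# NB log-likelihood at a given mean and the fixed size
nb_loglik <- function(y, mu, size) {
  sum(dnbinom(y, size = size, mu = mu, log = TRUE))
}

# LRT p-value for one gene: group A all zero, group B has one cell with count k
lrt_pvalue <- function(k) {
  y_a <- rep(0, n_cells)
  y_b <- c(k, rep(0, n_cells - 1))
  # Alternative: separate group means (MLE = sample mean when size is known).
  # Group A is all zero with fitted mean 0, so it contributes log(1) = 0.
  ll_alt  <- nb_loglik(y_b, mean(y_b), size)
  # Null: one common mean for all cells
  ll_null <- nb_loglik(c(y_a, y_b), mean(c(y_a, y_b)), size)
  pchisq(2 * (ll_alt - ll_null), df = 1, lower.tail = FALSE)
}

pval  <- vapply(k, lrt_pvalue, numeric(1))
logfc <- log2((k / n_cells + 0.125) / (0 + 0.125)) # illustrative pseudo-count

# The points fall on a single smooth curve rather than a cloud
plot(logfc, -log10(pval), xlab = "logFC", ylab = "-log10(p-value)")
```

Because every gene in this toy setup differs only in `k`, both the log-fold change and the test statistic grow with `k`, so the points trace out one line. The same happens on the other side of the plot if the roles of the groups are swapped.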

I would suggest having a closer look at a few of those genes (in terms of their expression profiles across groups, e.g., with scater::plotExpression) for further diagnostics. Such patterns are not necessarily a problem - the counts are low, after all - though you are correct in that they do warrant some level of concern and investigation.
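
As a quick first check before plotting, one could count how many cells have a non-zero count in each group for every gene. This is a base R sketch; `nonzero_by_group` is a hypothetical helper and the toy matrix stands in for `all_expc` and `groups` from the question.

```r
# Hypothetical helper: number of cells with a non-zero count,
# per gene (rows) and per group (columns).
nonzero_by_group <- function(counts, groups) {
  detected <- (counts > 0) + 0        # logical -> numeric, keeps dimnames
  t(rowsum(t(detected), group = groups))
}

# Toy data (assumption): 5 genes, 10 cells in each of two groups
set.seed(42)
groups <- factor(rep(c("A", "B"), each = 10))
counts <- matrix(rpois(5 * 20, lambda = 2), nrow = 5,
                 dimnames = list(paste0("gene", 1:5), paste0("cell", 1:20)))
counts["gene1", groups == "A"] <- 0              # all-zero in group A
counts["gene1", groups == "B"] <- c(7, rep(0, 9)) # a single expressing cell in B

nz <- nonzero_by_group(counts, groups)

# Genes that are undetected in at least one group are candidates for the "lines"
suspect <- rownames(nz)[apply(nz, 1, min) == 0]
print(nz[suspect, , drop = FALSE])
```

With the real data, the call would be along the lines of `nonzero_by_group(all_expc, groups)`, and the flagged genes can then be inspected individually.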

modified 11 weeks ago • written 11 weeks ago by Aaron Lun