Question: piano runGSA with input from DESeq2
simonp.snoeck wrote (12 months ago):

Hi,

To perform a gene-set enrichment analysis, we used the following settings for the R function runGSA (piano package):

gsaRes_xxx<-runGSA(pval_xxx, geneSetStat="fisher", directions=fc_xxx, signifMethod="nullDist", adjMethod="BH", gsc=gsc, gsSizeLim=c(5,Inf))

with:

fc_xxx = log2 fold changes of the genes (output of DESeq2)

pval_xxx = the p-values (output of DESeq2). Or should we use the adjusted p-values from DESeq2 instead?

This seemed to work. Can anyone confirm our settings?

Kind regards,

Simon

Leif Väremo (Sweden) wrote (12 months ago):

Note that Fisher's (combined probability) test tends to give low p-values to a large number of gene sets. There is also a tendency for this method to return gene-set p-values that correlate with gene-set size (see e.g. Fig 3B in Väremo et al. (2013)).
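To make the size effect concrete, here is a minimal base-R sketch of Fisher's combined probability test. It is not piano's code; `fisher_combined` is a hypothetical helper, and the gene-level p-value of 0.2 is made up. Every gene carries the same unremarkable p-value, yet the combined p-value keeps dropping as the set grows.

```r
# Fisher's combined probability test for a vector of gene-level p-values:
# the statistic -2 * sum(log(p)) is compared to a chi-square distribution
# with 2 * (number of genes) degrees of freedom.
fisher_combined <- function(p) {
  stat <- -2 * sum(log(p))
  pchisq(stat, df = 2 * length(p), lower.tail = FALSE)
}

# Hypothetical gene sets of increasing size in which every gene has p = 0.2.
set_sizes <- c(5, 20, 80)
p_sets <- sapply(set_sizes, function(k) fisher_combined(rep(0.2, k)))

print(data.frame(set_size = set_sizes, gene_set_p = p_sets))
```

None of the individual genes would pass a 0.05 cutoff here, so a small gene-set p-value from this method can reflect many weak signals rather than a few strong ones.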

Raw (unadjusted) p-values sometimes have a higher resolution (more unique values) than adjusted p-values, so in that sense they could be a good choice as input. The gene-set p-values should, however, be adjusted for multiple testing. One could also use the adjusted p-values as input. Maybe someone with a more solid statistical background could add a comment on this?
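The resolution point can be seen with base R's `p.adjust`. The p-values below are made up for illustration; the Benjamini-Hochberg step-up procedure forces the adjusted values to be monotone, which can collapse neighbouring values into ties.

```r
# Hypothetical raw gene-level p-values, all distinct.
p_raw <- c(0.001, 0.02, 0.021, 0.04, 0.5)

# Benjamini-Hochberg adjustment, the same method as adjMethod = "BH".
p_adj <- p.adjust(p_raw, method = "BH")

print(p_adj)
cat("unique raw values:     ", length(unique(p_raw)), "\n")
cat("unique adjusted values:", length(unique(p_adj)), "\n")
```

With real data the effect is usually stronger, because many genes end up sharing the same adjusted value.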

Apart from those notes, the syntax of your command looks correct to me.
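For completeness, a hedged sketch of how the two input vectors might be prepared. The data frame below is a made-up stand-in for a DESeq2 results table (it only borrows the column names `log2FoldChange`, `pvalue` and `padj`), and the NA filtering is my assumption about sensible preprocessing, since DESeq2 can report NA p-values for some genes. As far as I know, runGSA matches genes by name, so both vectors should carry the same gene names.

```r
# Mock table using the column names of a DESeq2 results table (values are invented).
res <- data.frame(
  log2FoldChange = c(2.1, -1.3, 0.4, NA, -0.2),
  pvalue         = c(1e-6, 0.003, 0.40, NA, 0.85),
  padj           = c(5e-6, 0.0075, 0.50, NA, NA),
  row.names      = c("geneA", "geneB", "geneC", "geneD", "geneE")
)

# Keep only genes that have both a p-value and a fold change.
keep <- !is.na(res$pvalue) & !is.na(res$log2FoldChange)

# Named vectors, so that genes can be matched to the gene-set collection by name.
pval_xxx <- setNames(res$pvalue[keep], rownames(res)[keep])
fc_xxx   <- setNames(res$log2FoldChange[keep], rownames(res)[keep])

# Both vectors must describe the same genes in the same order.
stopifnot(identical(names(pval_xxx), names(fc_xxx)))

# The call from the question would follow (not run here; it needs piano and a gsc object):
# gsaRes_xxx <- runGSA(pval_xxx, geneSetStat = "fisher", directions = fc_xxx,
#                      signifMethod = "nullDist", adjMethod = "BH",
#                      gsc = gsc, gsSizeLim = c(5, Inf))
```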

And a recommendation: once you have your gene-set results and conclusions, go back to the gene-level data for the specific gene sets and spot-check that your results are sensible given the input data.

Kind regards

Leif

simonp.snoeck wrote (12 months ago):

Thanks Leif,

About those low p-values, how should we interpret the following case:

Genes (up)  Stat (mix.dir.up)  p (mix.dir.up)  p adj (mix.dir.up)  Genes (down)  Stat (mix.dir.dn)  p (mix.dir.dn)  p adj (mix.dir.dn)
13          1714.4             0               0                   1             16.757             0.00022976      0.00022976
13          1714.4             0               0                   1             16.757             0.00022976      0.00022976

In both cases only one gene is down-regulated, compared with 13 that are up-regulated. Even so, the statistic for the single down-regulated gene still gives a p-value < 0.05, which would mean a significant down-regulation of the GO term in question driven by one gene. Or are we interpreting this the wrong way?

Kind regards,

Simon


Yes, that looks a bit weird of course. Note that the mixed-directional score is calculated by essentially splitting the gene set into two parts, one with the up-regulated genes and one with the down-regulated genes. The two parts are "unaware" of each other. In this case it means that a gene set of 1 (down-regulated) gene came out fairly significant, probably because the single gene itself was quite significant.
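The numbers in the table are consistent with this. Assuming the theoretical null distribution for Fisher's statistic is a chi-square with 2 degrees of freedom per gene (my assumption about what signifMethod="nullDist" does here), the reported statistic for the one-gene "down" part reproduces the reported p-value, and for a single gene that p-value is just the gene's own p-value:

```r
# Reported Fisher statistic for the down-regulated part (1 gene), taken from the table.
fisher_stat_down <- 16.757

# Chi-square upper tail with 2 * 1 degrees of freedom.
p_gene_set <- pchisq(fisher_stat_down, df = 2 * 1, lower.tail = FALSE)

# With one gene the statistic is -2 * log(p), so inverting it recovers that gene's p-value.
p_gene <- exp(-fisher_stat_down / 2)

print(c(gene_set_p = p_gene_set, gene_p = p_gene))  # both about 0.00023

# Same check for the up-regulated part (13 genes): far too small to tell apart from 0.
p_up <- pchisq(1714.4, df = 2 * 13, lower.tail = FALSE)
print(p_up)
```

So the "significant" down-regulation is just the single gene's own significance passed through unchanged.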

I would take the number of genes into account (as you do) when you interpret these results. 

An alternative would be to choose a method that would also return the distinct directional score, which for your example gene-set would definitely mark it as affected by up-regulation, but not down-regulation (since it does not do the subsetting in that case).
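A rough sketch of why the distinct-directional view behaves differently. This is not piano's implementation. The gene-level values are invented (13 strongly up-regulated genes, 1 down-regulated gene), and the conversion of two-sided to one-sided p-values is an assumption I use to illustrate the idea of testing the whole gene set in one direction.

```r
# Fisher's combined probability test (hypothetical helper, as in any textbook).
fisher_combined <- function(p) {
  stat <- -2 * sum(log(p))
  pchisq(stat, df = 2 * length(p), lower.tail = FALSE)
}

# Invented gene-level data for one gene set: 13 genes up, 1 gene down.
fc <- c(rep(1.5, 13), -1.2)
p  <- c(rep(1e-10, 13), 0.00022976)

# Assumed one-sided conversion: a gene changing in the tested direction keeps
# a small p-value (p / 2); a gene changing the other way gets 1 - p / 2.
p_one_sided_up   <- ifelse(fc > 0, p / 2, 1 - p / 2)
p_one_sided_down <- ifelse(fc < 0, p / 2, 1 - p / 2)

# All 14 genes enter both tests, so the single down gene is no longer on its own.
p_distinct_up   <- fisher_combined(p_one_sided_up)
p_distinct_down <- fisher_combined(p_one_sided_down)

print(c(distinct_up = p_distinct_up, distinct_down = p_distinct_down))
```

Under these assumptions the set is clearly significant for up-regulation, while the down-regulation test is nowhere near 0.05, because the 13 up-regulated genes count as evidence against it.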

(Reply by Leif Väremo, 12 months ago)