Question: EGSEA: how does the program generate the value for the up or down regulation of gene sets or pathways
24 months ago by
yingchen0
yingchen0 wrote:

Hi guys,

I tried the EGSEA package, hoping to find a better way to do GSEA. The output column Direction under gsa$mylabel$test.results$mycontrast takes the values 1, 0, and -1. I assume that -1 means the gene set/pathway is down-regulated and 1 means it is up-regulated. My problem is that most of the gene sets/pathways in my test study are -1, and thus down-regulated, which is not consistent with the results I got from other programs such as GAGE (which is one of the tests in EGSEA). I read the pre-print and found no clue. Any suggestions?

Thanks a lot,

Ying

24 months ago by
Monther Alhamdoosh (Australia/Melbourne/CSL Limited) wrote:

Hi Ying,

Thanks for trying out our package! The Direction column in EGSEA is based on the logFC values, which are calculated using limma::topTable if they were not provided. We simply count the number of genes that are up- and down-regulated in the gene set and assign the direction of the majority. Note that the argument "logFC.cutoff", which is 0 by default, is used in this calculation. To look closely at the logFC values used in the calculation, click on the "Interpret Results" link in the EGSEA report and open the CSV files of the gene sets of interest.

Hope this helps.

Best,

Monther

24 months ago by
yingchen0 wrote:

Hi Monther,

Thanks a lot for the explanation! I have another question: what is the recommended cut-off for significant DE gene sets and pathways? When I used GAGE on the same data set, one contrast had only 13 KEGG pathways with q.val <= 0.01. When I did the analysis with EGSEA, the q.val values were much smaller (~10E-7) and there were far more than 13 DE pathways.

Another thing: is it possible to integrate GOexpress into your ensemble?

Thanks a lot,

Ying

Hi Ying,

No worries! That's right.
EGSEA produces p-values that are much smaller than those of the individual methods, particularly for gene sets that are significant in the majority of base methods. Another factor that affects the scale of EGSEA p-values is the choice of individual methods, as some of them produce very small p-values. You can set print.base=TRUE and then look into gsa@results$gslabel$base.results$contrast$base_method to see which method produces very small p-values. You can also try different p-value combining methods by setting the argument "combineMethod" (see egsea.combine()). However, rather than using a p-value cut-off, we recommend looking at the top N (N = 10-30) gene sets ranked by the different EGSEA scores, and then using the p-values to support your findings.
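
To illustrate why a combined p-value can be much smaller than any of its inputs, here is a minimal sketch of Fisher's method, one familiar way of combining p-values. It is written in Python purely as a language-neutral illustration and is not EGSEA code. Whether Fisher's method is what a given EGSEA run uses depends on the "combineMethod" setting, which this thread does not spell out. The function name and the three input p-values are hypothetical.

```python
import math

def fisher_combined_p(p_values):
    """Combine p-values with Fisher's method (illustrative, not EGSEA code).

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null, assuming independent tests.
    For even degrees of freedom the upper tail has a closed form, so no
    external library is needed.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(p <= 0.0 or p > 1.0 for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    k = len(p_values)
    half_x = -sum(math.log(p) for p in p_values)  # this is X / 2
    # P(chi2_{2k} > X) = exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return math.exp(-half_x) * total

# Hypothetical p-values for one gene set from three base methods.
base_p = [0.01, 0.01, 0.01]
combined = fisher_combined_p(base_p)
print(combined)  # about 1.1e-4, well below each input of 0.01
```

Fisher's method assumes the inputs are independent. Base methods run on the same data are generally correlated, which is one more reason to treat very small combined p-values as supporting evidence rather than as a hard cut-off.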

Thanks for mentioning GOExpress. We will look into the possibility of integrating it with our package.

Best,

Monther

P.S. I recommend using the development version of EGSEA, as it has been significantly improved since our first release.
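
The counting rule described in the first answer for the Direction column can be sketched as follows. This is a hedged illustration in Python, not EGSEA's actual implementation. The function name, the treatment of the cut-off as an absolute threshold on logFC, and returning 0 on a tie are all assumptions made for the example, and the logFC values are made up.

```python
def majority_direction(logfc_values, logfc_cutoff=0.0):
    """Return 1, -1, or 0 by majority vote over a gene set's logFC values.

    Assumptions (not confirmed by the thread): a gene counts as up if
    logFC > cutoff, as down if logFC < -cutoff, and a tie yields 0.
    """
    up = sum(1 for v in logfc_values if v > logfc_cutoff)
    down = sum(1 for v in logfc_values if v < -logfc_cutoff)
    if up > down:
        return 1
    if down > up:
        return -1
    return 0

# Hypothetical gene set: three genes barely below zero, two strongly up.
gene_set_logfc = [-0.05, -0.10, -0.02, 1.8, 2.1]

print(majority_direction(gene_set_logfc))                    # -1
print(majority_direction(gene_set_logfc, logfc_cutoff=0.5))  # 1
```

With a cut-off of 0, genes with negligible negative logFC still count as votes, so a set can be labelled -1 even when its strongest changes are positive. If most gene sets come out as -1, inspecting the per-gene logFC values and trying a non-zero "logFC.cutoff" is a reasonable first check.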