I tried the EGSEA package, hoping to find a better way to do GSEA. The Direction column under gsa$mylabel$test.results$mycontrast takes the values 1, 0 and -1. I assume that -1 means the gene set/pathway is down-regulated and 1 means it is up-regulated. My problem is that most of the gene sets/pathways in my test study have Direction -1, i.e. down-regulated, which is not consistent with the results I got from other programs such as GAGE (which is one of the tests in EGSEA). I read the pre-print but found no explanation there.
Thanks a lot,
Thanks a lot for the explanation!
I have another question: what is the recommended cut-off for significant DE gene sets and pathways? When I used GAGE on the same data set, one contrast had only 13 KEGG pathways with q.val <= 0.01. When I did the analysis with EGSEA, the q.val values were much smaller (~10E-7) and there were far more than 13 DE pathways.
Another thing: is it possible to integrate GOexpress into your ensemble?
Thanks a lot,
No worries! That's right. EGSEA produces p-values that are much smaller than those of the individual methods, particularly for gene sets that are significant in the majority of base methods. Another factor that affects the scale of EGSEA p-values is the choice of individual methods, as some of them produce very small p-values. You can set print.base=TRUE and then look into gsa@results$gslabel$base.results$contrast$base_method to see which method produces very small p-values. You can also try different p-value combining methods by setting the argument "combineMethod" (see egsea.combine()). However, rather than using a p-value cut-off, we recommend looking at the top N (N=10-30) gene sets ranked by the different EGSEA scores, and then using the p-values to support your findings.
Thanks for mentioning GOexpress. We will look into the possibility of integrating it with our package.
P.S. I recommend using the development version of EGSEA, as it has been significantly improved since our first release.
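To illustrate why a combined p-value can end up much smaller than any single base-method p-value, here is a minimal sketch of Fisher's method written in plain Python for illustration only. It is not EGSEA code (EGSEA is an R package), and I am not claiming Fisher's method is the combining method used in your run. The function name fisher_combined_p and the four p-values are hypothetical. Fisher's method also assumes independent tests, which does not strictly hold for base methods run on the same data, so treat the numbers as a rough picture of the effect.

```python
import math


def fisher_combined_p(p_values):
    """Combine p-values with Fisher's method (assumes independent tests)."""
    k = len(p_values)
    if k == 0:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    # Fisher's statistic is X = -2 * sum(ln p_i); under the null it follows a
    # chi-square distribution with 2k degrees of freedom. We keep X / 2.
    half_x = -sum(math.log(p) for p in p_values)
    # For even degrees of freedom (2k) the chi-square upper tail has a closed form:
    # P(X > x) = exp(-x/2) * sum_{j=0}^{k-1} (x/2)^j / j!
    term = 1.0
    total = 1.0
    for j in range(1, k):
        term *= half_x / j
        total += term
    return math.exp(-half_x) * total


# Hypothetical p-values for one gene set from four base methods (made-up numbers).
base_p = [0.01, 0.02, 0.005, 0.03]
combined = fisher_combined_p(base_p)
print(f"smallest base p-value:   {min(base_p):.3g}")
print(f"Fisher combined p-value: {combined:.3g}")
```

When a gene set is moderately significant in most base methods, the combined value falls well below the smallest individual p-value, which is why a cut-off tuned for a single method such as GAGE does not transfer directly to the ensemble.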
Hi Monther, It looks like the data files generated at the level of individual genes (e.g. the heatmap.csv files / "Interpret Results" downloads) are missing the actual Entrez gene ID column in the output. Could you please check on that? I am using EGSEA 1.10.1 on R 3.5.1. Thank you!