EGSEA: how does the program generate the value for the up or down regulation of gene sets or pathways
yingchen • 0
@yingchen-11543

Hi guys,

I tried the EGSEA package, hoping to find a better way to do GSEA. The output column Direction under gsa$mylabel$test.results$mycontrast takes the values 1, 0, and -1. I assume that -1 means the gene set/pathway is down-regulated and 1 means it is up-regulated. My problem is that most of the gene sets/pathways in my test study are -1, and thus down-regulated, which is not consistent with the results I got from other programs such as GAGE (which is one of the tests in EGSEA). I read the pre-print but found no explanation.

Any suggestions?

Thanks a lot,

Ying 

pathways gsea egsea gage
@monther-alhamdoosh-10001
Australia/Melbourne/CSL Limited

Hi Ying, 

Thanks for trying out our package! The Direction column in EGSEA is derived from the logFC values, which are computed with limma::topTable if they were not provided. We simply count the number of up-regulated and down-regulated genes in the gene set and assign the direction of the majority. Note that the argument "logFC.cutoff", which is 0 by default, is used in this calculation.
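The counting rule described above can be sketched in a few lines. This is only an illustration in Python, not EGSEA's actual code: the function name `gene_set_direction` is hypothetical, and the details of how the cutoff is applied (to the absolute logFC, with strict inequalities) and how ties are resolved (reported as 0) are assumptions.

```python
def gene_set_direction(log_fcs, logfc_cutoff=0.0):
    """Majority vote over the genes of one gene set.

    Assumptions (not confirmed by the thread): a gene counts as up if
    logFC > cutoff, as down if logFC < -cutoff, and a tie gives 0.
    """
    values = list(log_fcs)
    up = sum(1 for fc in values if fc > logfc_cutoff)
    down = sum(1 for fc in values if fc < -logfc_cutoff)
    if up > down:
        return 1
    if down > up:
        return -1
    return 0


# Three mildly down-regulated genes outvote one strongly up-regulated gene.
example = [-0.1, -0.2, -0.05, 2.5]
direction_default = gene_set_direction(example)        # cutoff 0
direction_strict = gene_set_direction(example, 1.0)    # only |logFC| > 1 counts
```

Because a vote ignores the size of each change, a set with many slightly negative logFC values is labelled -1 at the default cutoff of 0, which may differ from the direction reported by a method that weighs effect sizes. Raising the cutoff restricts the vote to genes with larger changes.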

To look closely at the logFC values used in the calculation, click on the "Interpret Results" link in the EGSEA report and open the CSV files of the gene sets of interest.

Hope this helps. 

Best,

Monther 


Hi Monther,

Thanks a lot for the explanation!

I have another question: what is the recommended cut-off for significant DE gene sets and pathways? When I used GAGE on the same data set, one contrast had only 13 KEGG pathways with q.val <= 0.01. When I did the analysis with EGSEA, the q.val values were much smaller (~10E-7) and there were far more than 13 DE pathways.

Another thing: is it possible to integrate GOexpress into your ensemble?

Thanks a lot,

Ying

 

 

 


Hi Ying, 

No worries! That's right. EGSEA produces p-values that are much smaller than those of the individual methods, particularly for gene sets that are significant in the majority of base methods. Another factor that affects the scale of EGSEA p-values is the choice of individual methods, as some of them produce very small p-values. You can set print.base=TRUE and then look into gsa@results$gslabel$base.results$contrast$base_method to see which method produces very small p-values. You can also try different p-value combining methods by setting the argument "combineMethod" (see egsea.combine()). However, rather than using a p-value cut-off, we recommend looking at the top N (N=10-30) gene sets ranked by the different EGSEA scores and then using the p-values to support your findings.
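To see why a combined p-value can be far smaller than any single one, here is a minimal sketch of Fisher's method in Python. Fisher's method is used purely as a well-known illustration; whether it matches the combining method in your run depends on "combineMethod", and the function name `fisher_combine` is hypothetical. The method also assumes independent tests, which base methods run on the same data do not strictly satisfy.

```python
import math


def fisher_combine(p_values):
    """Combine p-values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null (k independent tests).
    For even degrees of freedom the upper tail has a closed form:
        P(chi2_{2k} > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    """
    p_values = list(p_values)
    if not p_values or any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must be non-empty and lie in (0, 1]")
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # this is X / 2
    term = 1.0   # running value of (X/2)^i / i!
    total = 1.0  # i = 0 term
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


# Five base methods that each report p = 0.01 for the same gene set.
combined = fisher_combine([0.01] * 5)
```

With five moderately significant inputs of 0.01, the combined value is on the order of 1e-6, so a cut-off such as q.val <= 0.01 admits many more gene sets than it would for any single method. This is one reason ranking and inspecting the top N gene sets is more informative than a fixed threshold.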

Thanks for mentioning GOexpress. We will look into the possibility of integrating it with our package.

Best,

Monther 

P.S. I recommend using the development version of EGSEA, as it has been significantly improved since our first release.


Hi Monther, it looks like the data files generated at the level of individual genes (e.g. the heatmap.csv files / "Interpret Results" downloads) are missing the actual Entrez gene ID column in the output. Could you please check on that? I am using EGSEA 1.10.1 on R 3.5.1. Thank you!
