Question: EGSEA usage with microarray data (with vooma)?
Pekka Kohonen (Sweden) wrote, 10 months ago:

Hi,

The EGSEA paper says that it only works with RNA-seq data. But it takes the EList objects produced by voom as input, and these can now be produced from microarray data as well, using vooma or other functions in limma. So does EGSEA now also work with microarray data? I will try this out; I just wanted to put the question out there.

Best, Pekka

Monther Alhamdoosh (Australia/Melbourne/CSL Limited) wrote, 10 months ago:

Hi Pekka,

Thanks for your question. We do not mention that EGSEA works with microarray datasets because some of the base methods' parameters need to be tuned to suit microarray data, and we have not tested it with this type of data. We will work on this soon. Let us know if things work well with the current release!

Cheers,

Monther


Hi Monther,

I have done some testing with EGSEA using microarray data (a couple of thousand analyses), and as far as I can tell it is working fine! A few remarks:

1. symbolsMap = y$genes[, c(1, 3)] needs to be changed to something like symbolsMap = row.names(featureData(eSet)@data). Apparently the object returned by vooma does not include the "genes" component (the data frame of gene annotation, which is present only if counts was a DGEList object).

2. Multi-threading does not seem to be very effective (processor activity remains at more or less the same level). But I was using the "custom" gene sets option (one gene set at a time), so maybe the parallelization is done differently there. At least in my case it might be better to give the function just 1 thread, split the data into lists, and do the parallelization with "bplapply" from the BiocParallel package.

3. The results object is very complicated. I quite like the "biobroom" Bioconductor package, which makes "tidy" data frames from limma results objects, and I wrote similar routines for my analysis. The EGSEA results are at gsa@results$custom$test.results[[c]], where c is a contrast (you need to lapply/do.call over all of the contrasts). The results of the individual methods are at gsa@results$custom$base.results[[c]]$ora (for the ora method) and need the same treatment.

4. The visualization routine takes an enormous amount of time to run if you have, e.g., 9 contrasts in your dataset, because that generates a huge number of combinations. But I suppose it is useful for smaller analyses.

5. I wonder whether some of the methods (like GSVA and ROAST) should be run in the "absolute" (i.e., mixed) or the "directional" mode. But I suppose that if one cares about that, it is possible to customize the "ensemble" method accordingly.

6. Some methods, like GSVA, require at least 10 samples to be effective (I don't know if this applies to the others).
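
To make remark 2 concrete, here is a minimal sketch of doing the parallelization outside EGSEA with bplapply from BiocParallel. The function run_one_gene_set and the gene_sets list are hypothetical placeholders, not part of EGSEA; in a real analysis the function body would run EGSEA on one gene set with its thread count set to 1.

```r
library(BiocParallel)

# Hypothetical stand-in for one single-threaded analysis of one gene set.
# It only returns the size of the set so that the sketch runs on its own.
run_one_gene_set <- function(gene_set) {
  length(gene_set)
}

# Hypothetical custom gene sets, one list element per set.
gene_sets <- list(setX = c("g1", "g2", "g3"),
                  setY = c("g2", "g4"))

# Distribute the gene sets over two worker processes. The result is a
# list with one element per gene set, in the same order as the input.
results <- bplapply(gene_sets, run_one_gene_set,
                    BPPARAM = SnowParam(workers = 2))
```

Swapping SnowParam(workers = 2) for SerialParam() runs the same code in a single process, which is handy for debugging before scaling up.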

All in all a very useful package! Both for automating the running of lots of methods at the same time and of course for the "ensemble" method. 
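
As an illustration of the tidying in remark 3, here is a rough sketch in base R. It assumes only the nesting described above (collection, then test.results, then one table per contrast). The mock object, its p.adj column and its values are invented for the example, and tidy_test_results is a hypothetical helper name, not an EGSEA function.

```r
# Mock object standing in for the list found at gsa@results.
# Only the nesting follows remark 3; the contents are made up.
mock_results <- list(
  custom = list(
    test.results = list(
      contrastA = data.frame(p.adj = c(0.01, 0.20),
                             row.names = c("setX", "setY")),
      contrastB = data.frame(p.adj = c(0.50, 0.03),
                             row.names = c("setX", "setY"))
    )
  )
)

# Stack the per-contrast tables into one long data frame, keeping the
# contrast name and the gene set name as ordinary columns.
tidy_test_results <- function(results, collection = "custom") {
  per_contrast <- results[[collection]]$test.results
  pieces <- lapply(names(per_contrast), function(cn) {
    tab <- per_contrast[[cn]]
    data.frame(contrast = cn,
               gene.set = rownames(tab),
               tab,
               row.names = NULL,
               stringsAsFactors = FALSE)
  })
  do.call(rbind, pieces)
}

tidy <- tidy_test_results(mock_results)
```

With a real results object the call would be something like tidy_test_results(gsa@results), and the same lapply/do.call pattern could be repeated over base.results for the individual methods.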

 

Reply written 9 months ago by Pekka Kohonen (modified 9 months ago)

Hi Pekka, 

Thank you very much for your valuable feedback! This should help many EGSEA users. I will revisit your suggestions soon and update the package accordingly. 

Cheers,

Monther 

Reply written 9 months ago by Monther Alhamdoosh