EGSEA usage with microarray data (with vooma)?
Pekka Kohonen ▴ 190
@pekka-kohonen-5862
Last seen 6.9 years ago
Sweden

Hi,

The EGSEA paper says that it only works with RNA-seq data. But it takes EList objects produced by voom as input, and these can now be produced from microarray data as well, using vooma or other functions in limma. So does EGSEA now also work with microarray data? I will try this out; I just wanted to put the question out there.

Best, Pekka

EGSEA gene set testing vooma limma voom gsea • 1.9k views
@monther-alhamdoosh-10001
Last seen 5.5 years ago
Australia/Melbourne/CSL Limited

Hi Pekka,

Thanks for your question. We do not state that EGSEA works with microarray datasets because some of the base methods' parameters need to be tuned to suit microarray data, and we have not yet tested it with this type of data. We will work on this soon. Let us know if things work well with the current release!

Cheers,

Monther


Hi Monther,

I have done some testing with EGSEA using microarray data (a couple of thousand analyses), and as far as I can tell it is working fine! A few remarks:

1. symbolsMap = y$genes[, c(1, 3)] needs to be changed to something like symbolsMap = row.names(featureData(eSet)@data). Apparently the object returned by vooma does not include the "genes" slot (the data frame of gene annotation, which is present only if counts was a DGEList object).

2. It does not seem to use multi-threading very effectively (processor activity stays at more or less the same level). I was using the "custom" gene sets option (one gene set at a time), though, so maybe the parallelization is done differently in that case. At least in my case it might be better to give the function just 1 thread, split the data into lists, and do the parallelization with bplapply from the BiocParallel package.

3. The results object is very complicated. I quite like the "biobroom" Bioconductor package, which makes "tidy" data frames from limma results objects, and I wrote similar routines for my analysis. The EGSEA results are at gsa@results$custom$test.results[[c]], where c is a contrast (you need to lapply/do.call over all of the contrasts). The same then has to be done for the individual methods, which are at gsa@results$custom$base.results[[c]]$ora (for the ora method).

4. The visualization routine takes an enormous amount of time to run if you have, e.g., 9 contrasts in your dataset, since that generates a huge number of combinations. But I suppose it is useful for smaller analyses.

5. I wonder whether some of the methods (like GSVA and ROAST) should be run in the "absolute" (i.e., mixed) or the "directional" mode. But I suppose that if one cares about that, it is possible to customize the "ensemble" method accordingly.

6. Some methods, like GSVA, require at least 10 samples to be effective (I don't know if this applies to the others).
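
To illustrate remark 3, the lapply/do.call flattening can be sketched in base R. This is only a sketch under assumptions: the nested list below is a hypothetical mock-up that mimics the layout of gsa@results$custom$test.results (one data frame per contrast, gene sets as row names), the column names are made up, and tidy_by_contrast is an illustrative helper, not an EGSEA function.

```r
# Hypothetical mock-up of per-contrast results. In a real analysis this list
# would be taken from gsa@results$custom$test.results; columns are made up.
test_results <- list(
  treatA_vs_ctrl = data.frame(p.adj = c(0.01, 0.20), avg.rank = c(1.5, 3.0),
                              row.names = c("setA", "setB")),
  treatB_vs_ctrl = data.frame(p.adj = c(0.03, 0.50), avg.rank = c(2.0, 4.5),
                              row.names = c("setA", "setB"))
)

# Illustrative helper (not part of EGSEA): stack the per-contrast tables into
# one long data frame with explicit contrast and gene-set columns.
tidy_by_contrast <- function(res_list) {
  pieces <- lapply(names(res_list), function(cn) {
    df <- res_list[[cn]]
    data.frame(contrast = cn, gene.set = rownames(df), df,
               row.names = NULL, stringsAsFactors = FALSE)
  })
  do.call(rbind, pieces)
}

tidy <- tidy_by_contrast(test_results)
print(tidy)
```

The same helper could presumably be applied to the per-method tables (for example, a list built from base.results[[c]]$ora over all contrasts c), provided those are also data frames with gene sets as row names.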

All in all, a very useful package, both for automating the running of many methods at the same time and, of course, for the "ensemble" method.

 


Hi Pekka, 

Thank you very much for your valuable feedback! This should help many EGSEA users. I will revisit your suggestions soon and update the package accordingly. 

Cheers,

Monther 


Hi Pekka, I also tried EGSEA with microarray data, but I got the error below. Would you have a suggestion?

gsa = egsea.ma(numeric_matrix, vector_group, probe_annotation, contrasts = contrast_matrix, gs.annots = gs.annots, baseGSEAs = baseMethods, sort.by = "avg.rank", num.threads = 4, report = FALSE)
Error in dimnames(x) <- dn : length of 'dimnames' [2] not equal to array extent
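
One possible cause of a dimnames error like this is that the inputs do not agree in size, although that is only a guess without seeing the data. Below is a minimal base R sketch of checks that could be run before calling egsea.ma. The function check_inputs and the toy objects are hypothetical, and the assumed layout (probes in rows, samples in columns, one group label per sample, one annotation row per probe, one contrast-matrix row per group) is an assumption that should be verified against the EGSEA documentation.

```r
# Hypothetical helper: returns a character vector describing size mismatches
# between the inputs (empty if none are found).
check_inputs <- function(expr, group, probe_annot, contrasts) {
  problems <- character(0)
  if (length(group) != ncol(expr))
    problems <- c(problems, "length(group) differs from ncol(expr)")
  if (nrow(probe_annot) != nrow(expr))
    problems <- c(problems, "nrow(probe_annot) differs from nrow(expr)")
  if (nrow(contrasts) != nlevels(factor(group)))
    problems <- c(problems, "nrow(contrasts) differs from the number of groups")
  problems
}

# Toy data with 6 probes and 4 samples, purely for illustration.
expr <- matrix(rnorm(24), nrow = 6, ncol = 4,
               dimnames = list(paste0("p", 1:6), paste0("s", 1:4)))
group <- c("ctrl", "ctrl", "trt", "trt")
probe_annot <- data.frame(ProbeID = rownames(expr), Symbol = paste0("G", 1:6))
contrasts <- matrix(c(-1, 1), nrow = 2,
                    dimnames = list(c("ctrl", "trt"), "trt_vs_ctrl"))

ok  <- check_inputs(expr, group, probe_annot, contrasts)      # consistent inputs
bad <- check_inputs(expr, group[-1], probe_annot, contrasts)  # one label missing
print(bad)
```

If all three checks pass on the real objects, the mismatch probably lies elsewhere, for example in the row or column names of the annotation or the gene set index.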