Question: Gene Set Enrichment Analysis with Limma
1
3.8 years ago by
jacorvar40
European Union
jacorvar40 wrote:

Dear BioC community,

I have an ExpressionSet object called eset and a list of DE genes called deGenes, for which I want to make a GSEA.

Then I create a list of gene sets according to the ontology "molecular function".

GOobj <- annFUN.org(whichOnto = 'MF', feasibleGenes = featureNames(eset), mapping = annotation(eset),ID = 'entrez')

​​Next I want to make a gene set enrichment analysis, but I do not really know if this is the proper way:

r <- romer(y = exprs(eset), index = ids2indices(GOobj, deGenes), design=design, contrast=2, nrot=99)

Is romer function usually employed when you have a list of DE genes or am I using a wrong method?

limma gene set analysis romer • 4.3k views
modified 3.8 years ago by Gordon Smyth38k • written 3.8 years ago by jacorvar40
Answer: Gene Set Enrichment Analysis with Limma
2
3.8 years ago by
Gordon Smyth38k
Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
Gordon Smyth38k wrote:

No, the idea of gene set analysis (e.g., as implemented in the functions romer, roast, fry and camera) is that you don't need a list of DE genes. Gene set analysis allows you to interpret your results in terms of gene sets or pathways instead of genes.

If you start with a list of DE genes, then an old-fashioned gene ontology analysis is more usual. Try this in limma:

fit <- lmFit(eset, design)
fit <- eBayes(fit)
go <- goana(fit, coef=2)
topGO(go)


This will work if you have Entrez Gene IDs set as the row names of your eset.

Hi Gordon,

It gives me a huge table... I assume I must adjust p-values given by goana, right? Do you think it make sense to limit the table to those terms with a size smaller than 1000? Otherwise I get very general GO terms...

Thanks

Yes, the p-values need adjustment in principle, but I prefer to do this informally -- I only take notice of p-values that are less than 1e-8 and preferably smaller.

I like to get all the significant GO terms of all sizes in the topGO table, but you can obviously give more attention to the more specific terms if you want when you interpret the table.

You will only get a huge number of terms with very small p-values if you have a huge number of DE genes. In that case you might use treat(fit, lfc=log2(1.3)) instead of eBayes(fit) in the above code to restrict to genes with large fold changes.

You could also try:

kegg <- kegga(fit, coef=2)
topKEGG(kegg)