Question

Gene Set Enrichment Analysis with Limma

1

Entering edit mode

jacorvar ▴ 40

@jacorvar-8972

Last seen 2.3 years ago

European Union

Dear BioC community,

I have an ExpressionSet object called eset and a list of DE genes called deGenes, for which I want to make a GSEA.

Then I create a list of gene sets according to the ontology "molecular function".

GOobj <- annFUN.org(whichOnto = 'MF', feasibleGenes = featureNames(eset), mapping = annotation(eset),ID = 'entrez')

Next I want to make a gene set enrichment analysis, but I do not really know if this is the proper way:

r <- romer(y = exprs(eset), index = ids2indices(GOobj, deGenes), design=design, contrast=2, nrot=99)

Is romer function usually employed when you have a list of DE genes or am I using a wrong method?

limma romer gene set analysis • 9.0k views

ADD COMMENT • link updated 10.2 years ago by Gordon Smyth 53k • written 10.2 years ago by jacorvar ▴ 40

score 2 · Answer 1 · 2015-11-20

2

Entering edit mode

Gordon Smyth 53k

@gordon-smyth

Last seen 8 hours ago

WEHI, Melbourne, Australia

No, the idea of gene set analysis (e.g., as implemented in the functions romer, roast, fry and camera) is that you don't need a list of DE genes. Gene set analysis allows you to interpret your results in terms of gene sets or pathways instead of genes.

If you start with a list of DE genes, then an old-fashioned gene ontology analysis is more usual. Try this in limma:

fit <- lmFit(eset, design)
fit <- eBayes(fit)
go <- goana(fit, coef=2)
topGO(go)

This will work if you have Entrez Gene IDs set as the row names of your eset.

ADD COMMENT • link 10.2 years ago Gordon Smyth 53k

0

Entering edit mode

Hi Gordon,

It gives me a huge table... I assume I must adjust p-values given by goana, right? Do you think it make sense to limit the table to those terms with a size smaller than 1000? Otherwise I get very general GO terms...

Thanks

ADD REPLY • link 10.2 years ago jacorvar ▴ 40

0

Entering edit mode

Yes, the p-values need adjustment in principle, but I prefer to do this informally -- I only take notice of p-values that are less than 1e-8 and preferably smaller.

I like to get all the significant GO terms of all sizes in the topGO table, but you can obviously give more attention to the more specific terms if you want when you interpret the table.

You will only get a huge number of terms with very small p-values if you have a huge number of DE genes. In that case you might use treat(fit, lfc=log2(1.3)) instead of eBayes(fit) in the above code to restrict to genes with large fold changes.

You could also try:

kegg <- kegga(fit, coef=2)
topKEGG(kegg)

ADD REPLY • link 10.2 years ago Gordon Smyth 53k