Gene Set Enrichment Analysis with Limma
1
1
Entering edit mode
jacorvar ▴ 40
@jacorvar-8972
Last seen 6 months ago
European Union

Dear BioC community,

I have an ExpressionSet object called eset and a list of DE genes called deGenes, for which I want to make a GSEA.

Then I create a list of gene sets according to the ontology "molecular function".

GOobj <- annFUN.org(whichOnto = 'MF', feasibleGenes = featureNames(eset), mapping = annotation(eset),ID = 'entrez')

​​Next I want to make a gene set enrichment analysis, but I do not really know if this is the proper way:

r <- romer(y = exprs(eset), index = ids2indices(GOobj, deGenes), design=design, contrast=2, nrot=99)

Is romer function usually employed when you have a list of DE genes or am I using a wrong method?

limma romer gene set analysis • 7.8k views
ADD COMMENT
2
Entering edit mode
@gordon-smyth
Last seen 5 hours ago
WEHI, Melbourne, Australia

No, the idea of gene set analysis (e.g., as implemented in the functions romer, roast, fry and camera) is that you don't need a list of DE genes. Gene set analysis allows you to interpret your results in terms of gene sets or pathways instead of genes.

If you start with a list of DE genes, then an old-fashioned gene ontology analysis is more usual. Try this in limma:

fit <- lmFit(eset, design)
fit <- eBayes(fit)
go <- goana(fit, coef=2)
topGO(go)

This will work if you have Entrez Gene IDs set as the row names of your eset.

ADD COMMENT
0
Entering edit mode

Hi Gordon,

It gives me a huge table... I assume I must adjust p-values given by goana, right? Do you think it make sense to limit the table to those terms with a size smaller than 1000? Otherwise I get very general GO terms...

Thanks

ADD REPLY
0
Entering edit mode

Yes, the p-values need adjustment in principle, but I prefer to do this informally -- I only take notice of p-values that are less than 1e-8 and preferably smaller.

I like to get all the significant GO terms of all sizes in the topGO table, but you can obviously give more attention to the more specific terms if you want when you interpret the table.

You will only get a huge number of terms with very small p-values if you have a huge number of DE genes. In that case you might use treat(fit, lfc=log2(1.3)) instead of eBayes(fit) in the above code to restrict to genes with large fold changes.

You could also try:

kegg <- kegga(fit, coef=2)
topKEGG(kegg)
ADD REPLY

Login before adding your answer.

Traffic: 774 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6