Question: Perform limma based gene-set testing for a two-group comparison in a microarray dataset regarding specific biological processes
svlachavas (Greece/Athens/National Hellenic Research Foundation) wrote, 11 months ago:

Dear Community,

Based on some initial in vitro experiments and a subsequent cancer microarray dataset analysis in R, I would like to perform some gene-set tests for specific pathways and ontologies related to my phenotype of interest. Briefly, for a two-group comparison, we are mostly interested in identifying biological processes related to neutrophils and, more generally, to inflammation. The two major approaches under consideration are:

A) I have identified, through the Gene Ontology Consortium, 7 GO biological processes that are related to neutrophils.

B) The C7 immunologic signatures from WEHI (rdata files).

My major questions are:

1) In the context of microarrays, and especially for the first part with the specific GO terms, would fry or mroast be more appropriate? Alternatively, would mroast be more suitable for the second part with the many immunologic gene sets?

2) My second issue is more specific to the microarray platform and annotation. The platform is the Agilent SurePrint G3 Human GE v2 8x60k Microarray (Array Design A-MEXP-2320), for which no R annotation package was available, so I downloaded the latest gene symbol annotation from

Since both of the above approaches need Entrez Gene IDs, how could I proceed, given that my expression matrix has unique gene symbols in the rows? Below is a small code chunk from the final limma part:


23339   119

IRX1                                      4.979257
SAA1                                      7.548621
H19                                      13.150892
MBP                                       8.240486
SAA2                                      6.692976
CHGA                                      7.527782.....

condition <- factor(final$targets$,
levels = c("LOW.UBE2D3","HIGH.UBE2D3"))

design <- model.matrix(~condition)

fit <- lmFit(final,design)...


Thank you in advance,



Answer: Perform limma based gene-set testing for a two-group comparison in a microarray
Gordon Smyth (Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia) wrote, 11 months ago:

1) With 7 particular GO terms, I would use mroast. Why not? roast is designed for focused gene set tests. fry is an approximation to mroast but, with only 7 terms, you may as well use roast itself.

For B) I would use camera.

2) Personally, I use alias2SymbolUsingNCBI() to convert gene symbols to Entrez Gene Ids and anything else I need. For example:

> Symbols <- c("IRX1","SAA1","H19","MBP","SAA2","CHGA")
> alias2SymbolUsingNCBI(Symbols, "Homo_sapiens.gene_info")
      GeneID Symbol                                    description
14710  79192   IRX1                            iroquois homeobox 1
5055    6288   SAA1                               serum amyloid A1
20753 283120    H19 H19, imprinted maternally expressed transcript
3388    4155    MBP                           myelin basic protein
5056    6289   SAA2                               serum amyloid A2
925     1113   CHGA                                 chromogranin A
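The test choices suggested in the answer (mroast for a handful of focused sets, camera for a large collection) can be sketched roughly as follows. This is only an illustration on simulated data: the matrix `y`, the group labels, and the gene sets `setA`, `setB` and `setC` are hypothetical stand-ins, not the poster's data or the WEHI gene sets. The limma functions used are `ids2indices()`, `mroast()` and `camera()`.

```r
library(limma)

set.seed(1)

# Simulated stand-in for the expression data: 200 genes x 10 arrays.
# Row names play the role of Entrez Gene IDs (hypothetical values).
y <- matrix(rnorm(200 * 10), nrow = 200, ncol = 10)
rownames(y) <- as.character(seq_len(200))

# Two-group design, reference level first (mirrors the ~condition model).
condition <- factor(rep(c("LOW", "HIGH"), each = 5), levels = c("LOW", "HIGH"))
design <- model.matrix(~condition)

# Hypothetical gene sets given as vectors of Entrez Gene IDs.
# "9999" is deliberately absent from the expression matrix.
gene.sets <- list(
  setA = as.character(1:20),
  setB = as.character(50:80),
  setC = c("9999", "5", "6", "7")
)

# Convert IDs to row indices of y; IDs not found in y are dropped.
idx <- ids2indices(gene.sets, identifiers = rownames(y))

# Rotation test for a small number of focused sets.
# contrast = 2 selects the second design column (HIGH vs LOW).
res.roast <- mroast(y, index = idx, design = design, contrast = 2, nrot = 999)

# Competitive test, better suited to a large collection of sets.
res.camera <- camera(y, index = idx, design = design, contrast = 2)

print(res.roast)
print(res.camera)
```

With real data, `y` would be the normalized expression object with Entrez Gene IDs as row names, and `gene.sets` would be the list loaded from the gene set files.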

Dear Gordon, thank you very much for the very useful comment. In the past I have used alias2SymbolTable, also based on your suggestion, but I had not noticed that alias2SymbolUsingNCBI() also returns GeneIDs.

Moreover, regarding my initial question about the type of gene set test: would you choose one type of test for each procedure, for example fry for the specific GOs and mroast for the large number of gene sets?

(comment written 11 months ago by svlachavas)

Dear Gordon, thank you for your updates on the first part of my question. However, I'm facing a specific downstream issue:

Symbols <- rownames(final)
dat <- alias2SymbolUsingNCBI(Symbols, "Homo_sapiens.gene_info")

      GeneID Symbol                                    description
14710  79192   IRX1                            iroquois homeobox 1
5055    6288   SAA1                               serum amyloid A1
20752 283120    H19 H19, imprinted maternally expressed transcript
3388    4155    MBP                           myelin basic protein
5056    6289   SAA2                               serum amyloid A2
925     1113   CHGA                                 chromogranin A

rownames(final) <- as.character(dat$GeneID) # have entrez gene ids
[1] "79192"  "6288"   "283120" "4155"   "6289"   "1113"  
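One thing to watch when replacing row names this way is that some symbols may not map to any Entrez Gene ID, and two symbols may map to the same ID. The sketch below shows one possible way to filter such rows before renaming. It assumes that `alias2SymbolUsingNCBI()` accepts a data.frame in place of the file name (as the limma documentation describes for recent versions); the small `gene.info` table, the matrix `y` and the symbol `NOTAGENE` are toy stand-ins, not the real NCBI file or the poster's data.

```r
library(limma)

# Toy stand-in for Homo_sapiens.gene_info (the real file has many more
# rows and columns). Synonyms are left as "-" for simplicity.
gene.info <- data.frame(
  GeneID      = c("79192", "6288", "4155"),
  Symbol      = c("IRX1", "SAA1", "MBP"),
  Synonyms    = c("-", "-", "-"),
  description = c("iroquois homeobox 1", "serum amyloid A1",
                  "myelin basic protein"),
  stringsAsFactors = FALSE
)

# Toy expression matrix with symbols as row names.
# "NOTAGENE" is a hypothetical symbol with no entry in gene.info.
y <- matrix(c(4.98, 7.55, 8.24, 6.10), ncol = 1,
            dimnames = list(c("IRX1", "SAA1", "MBP", "NOTAGENE"), "array1"))

# One annotation row per input symbol; unmatched symbols give NA.
ann <- alias2SymbolUsingNCBI(rownames(y), gene.info)

# Keep only rows with a mapped, non-duplicated Entrez Gene ID.
keep <- !is.na(ann$GeneID) & !duplicated(ann$GeneID)
y.entrez <- y[keep, , drop = FALSE]
rownames(y.entrez) <- as.character(ann$GeneID[keep])

print(rownames(y.entrez))
```

Keeping the first occurrence of a duplicated ID is an arbitrary choice made here for brevity; other rules (for example keeping the row with the highest average expression) may be preferable.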

But afterwards, when loading the GO gene sets (rdata file) from WEHI, I get:



 [1] "5153"  "4929"  "4129"  "1815"  "6870"  "5071"  "1312"  "3350" 
 [9] "2861"  "3251"  "1141"  "6622"  "6531"  "18"    "1812"  "25953"
[17] "11315"

 [1] "23539"  "9121"   "9122"   "159963" "133418" "6566"   "9194"  
 [8] "387700" "201232" "9120"   "9123"   "162515"

 [1] "5432"  "5439"  "9150"  "7936"  "25920" "51773" "5431"  "5433" 
 [9] "5436"  "5435"  "5430"  "22938" "1105"  "5440"  "1025"  "3725" 
[17] "5434"  "904"   "51176" "5437"  "2963"  "6829"  "3249"  "4851" 
[25] "2033"  "6827"  "5441"  "5438"  "6882"  "6598"  "5216"  "7469" 
[33] "51193" "6597"  "29969" "51497" "6667"  "2962"  "7023" 


However, how could I subset this list for the specific BP terms, given that my GO identifiers are in a different form, for example GO:0070488, which has the name neutrophil aggregation?

Or is my approach incorrect, because these GO gene sets may not contain the above specific GOs (being different, grouped together or omitted, based on the relative description), so that I should follow another approach?

(comment written 11 months ago by svlachavas)
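One possible way to work with GO identifiers directly, sketched here under stated assumptions, is to build the gene set list from a two-column table of GO IDs and Entrez Gene IDs and then select the terms of interest by ID. The table `go2gene` below is entirely hypothetical (placeholder gene IDs and a placeholder second GO ID); in practice such a table might come from an organism annotation package, which is an assumption and not something the thread confirms.

```r
library(limma)

# Hypothetical GO-to-gene table. The Entrez IDs are placeholders and
# "GO:9999991" is a made-up term used only for illustration.
go2gene <- data.frame(
  go_id  = c("GO:0070488", "GO:0070488", "GO:9999991", "GO:9999991"),
  entrez = c("101", "102", "102", "103"),
  stringsAsFactors = FALSE
)

# One character vector of Entrez IDs per GO term, named by GO ID.
all.sets <- split(go2gene$entrez, go2gene$go_id)

# GO terms of interest, selected by identifier rather than by name.
wanted <- c("GO:0070488")
my.sets <- all.sets[intersect(wanted, names(all.sets))]

# Hypothetical row identifiers of the expression matrix.
expr.ids <- c("101", "102", "103", "105")

# Row indices for each selected set, ready for mroast(), fry() or camera().
idx <- ids2indices(my.sets, identifiers = expr.ids)
print(idx)
```

Using `intersect()` means a requested GO ID that is missing from the table is silently skipped, so it is worth comparing `names(my.sets)` against `wanted` afterwards.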