I wanted to build a heatmap with different sets of genes as columns and GO BP terms as rows. The intensity of the colour of each cell in the heatmap would represent the p-value.

I started by using hyperGTest of package GOstats to calculate p-value for different GO terms. I selected p-value cutoff as 1 cause wanted to get p-values for all terms, even those that were not significant. I did this with the following code for two gene sets (A and B):

`A <- c("86", "51412", "192669", "8289", "57492", "83858", "10533", "10538", "54880", "1230", "947", "79577", "55038", "64866", "1017", "1026", "1063", "1105", "91851", "55118") `

B <- c("79624", "23192", "85003", "151887", "7555", "9267", "8663", "3692", "51359", "90625", "28982", "2597", "81502", "57470", "4223", "57380", "57552", "54539", "55655", "199713", "79031", "123", "5558", "10635", "81892")

library(org.Hs.eg.db)

library(Category)

library(GOstats)

library(GO.db)

genesGO <- list (A,B)

names(genesGO) <- c("A","B")

allgenesGO <- mappedkeys(org.Hs.egGO)

onto.overGO.bp<-list()

for (i in 1:length(genesGO)){

paramsGO <- new("GOHyperGParams", geneIds = genesGO[[i]][genesGO[[i]] %in% allgenesGO], universeGeneIds = allgenesGO, annotation = "org.Hs.eg.db", ontology = "BP", testDirection = "over", pvalueCutoff=1)

hypGO <- hyperGTest(paramsGO)

tgo <- summary(hypGO, pvalue = 1)

onto.overGO <- data.frame(tgo[, 1:2], FDR = p.adjust(tgo[, 2], method = "fdr"),tgo[, 3:7])

onto.overGO.bp[[i]]<-onto.overGO

}

names(onto.overGO.bp) <- c("A","B")

nrow(onto.overGO.bp[[1]])

nrow(onto.overGO.bp[[2]])

However, the number of GO terms obtained varies with the gene set even when selecting p-value as 1: `nrow(onto.overGO.bp[[1]])`

is different from `nrow(onto.overGO.bp[[2]])`

, which makes impossible to build a heatmap.

Is it something missing in the code?