Question

usage of topGO after limma

I.V. Lynn • 0

@iv-lynn-20357

Last seen 5.1 years ago

Russia/Irkutsk/SIPPB SB RAS

Hi!

I'm trying to carry out a GO enrichment analysis of microarray data, but I can't work out how to adapt the limma output for use as topGO input. For example, I have a data.frame that contains gene symbols, expression values, t, B and adjusted p-values; basically the results of:

# Read the targets file and the Agilent two-colour arrays
trgts <- readTargets("targets4.csv", sep = ";")
rough <- read.maimages(trgts, source = "agilent",
                       columns = list(R = "rDyeNormSignal", G = "gDyeNormSignal",
                                      rIsFeatNonUnifOL = "rIsFeatNonUnifOL", gIsFeatNonUnifOL = "gIsFeatNonUnifOL",
                                      rIsBGNonUnifOL = "rIsBGNonUnifOL", gIsBGNonUnifOL = "gIsBGNonUnifOL",
                                      rIsFeatPopnOL = "rIsFeatPopnOL", gIsFeatPopnOL = "gIsFeatPopnOL",
                                      rIsBGPopnOL = "rIsBGPopnOL", gIsBGPopnOL = "gIsBGPopnOL",
                                      rIsSaturated = "rIsSaturated", gIsSaturated = "gIsSaturated"),
                       other.columns = c("rIsFeatNonUnifOL", "gIsFeatNonUnifOL", "rIsBGNonUnifOL", "gIsBGNonUnifOL",
                                         "rIsFeatPopnOL", "gIsFeatPopnOL", "rIsBGPopnOL",
                                         "gIsBGPopnOL", "rIsSaturated", "gIsSaturated"),
                       annotation = c("accessions", "chr_coord", "Sequence",
                                      "ProbeUID", "ControlType", "ProbeName", "GeneName", "SystematicName",
                                      "Description"))
# Normalize between arrays, then average replicate probes
roughbet <- normalizeBetweenArrays(rough, method = "Aquantile")
roughave <- avereps(roughbet, ID = roughbet$genes$ProbeName)
design <- modelMatrix(trgts, ref = "Col0")

# Fit the linear model and compute moderated statistics
fitRC <- lmFit(roughave, design)
fitRC <- eBayes(fitRC)

# Keep significant probes only (|logFC| >= 1, BH-adjusted p < 0.05), then drop control probes
signifC <- topTable(fitRC, coef = "mut1", lfc = 1, p.value = 0.05, adjust.method = "BH", number = Inf)
signifCC <- signifC[signifC$ControlType == 0, ]

# The next function takes the annotation from the Agilent database (AGIDB2) and
# cbinds the probe information, including GO IDs, to the limma results.
agilentannC <- function(x) {
  idx <- match(x$ProbeName, AGIDB2$ID)  # row of AGIDB2 for each probe
  cbind(x, AGIDB2[idx, ])
}
CCC <- agilentannC(signifCC)
row.names(CCC) <- CCC$GENE_SYMBOL
mut1data <- CCC

So I have all this: a table of genes of interest, selected by lfc and p-value, with their p-values, logFC, AveExpr and even GO IDs. I really can't understand how to make a topGOdata object from it! Please help!

> sessionInfo()
R version 3.5.3 (2019-03-11)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

Matrix products: default

locale:
[1] LC_COLLATE=Russian_Russia.1251  LC_CTYPE=Russian_Russia.1251    LC_MONETARY=Russian_Russia.1251
[4] LC_NUMERIC=C                    LC_TIME=Russian_Russia.1251    

attached base packages:
 [1] grid      stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] limma_3.38.3         Rgraphviz_2.26.0     hgu95av2.db_3.2.3    org.Hs.eg.db_3.7.0   topGO_2.34.0        
 [6] SparseM_1.77         GO.db_3.7.0          AnnotationDbi_1.44.0 IRanges_2.16.0       S4Vectors_0.20.1    
[11] Biobase_2.42.0       graph_1.60.0         BiocGenerics_0.28.0 

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.0         bit_1.1-14         lattice_0.20-38    blob_1.1.1         tools_3.5.3        DBI_1.0.0         
 [7] matrixStats_0.54.0 yaml_2.2.0         bit64_0.9-7        digest_0.6.18      BiocManager_1.30.4 memoise_1.1.0     
[13] RSQLite_2.1.1      compiler_3.5.3     pkgconfig_2.0.2

limma topGO microarray • 1.3k views

ADD COMMENT • link updated 5.1 years ago by Gordon Smyth 50k • written 5.1 years ago by I.V. Lynn • 0

I go over a working example on Biostars, here: https://www.biostars.org/p/350710/

If your genes are not HGNC symbols, then you can use biomaRt to convert them. topGO also works with Ensembl gene IDs and Entrez identifiers.

ADD REPLY • link 5.1 years ago Kevin Blighe ★ 3.9k

Thank you for your reply. But I already have GO IDs from my annotation function, and my experiment uses an Arabidopsis thaliana Agilent microarray, so the gene identifiers are TAIR IDs such as AT5G15324. The problem is that I don't understand how to put my data into the topGOdata format.

GOdata <- new("topGOdata", ontology="BP", allGenes=???, annot = ???, GO2genes=???, geneSel=selection, nodeSize=10)

If I run it this way:

GOdata <- new("topGOdata", ontology="BP", allGenes=named vector of genes' p.values, annot = ???, genes2GO= data.frame contains genesymbol and goid columns, geneSel=I don't think I need It. my genes data is already a selection, nodeSize=10)

it doesn't work at all.

It would be best if there were some way to construct the topGOdata object manually.
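For reference, the two inputs discussed above can be shaped in base R alone. The sketch below is only an illustration under stated assumptions: the data frame `tt` and its values are made up, the column names `GENE_SYMBOL`, `adj.P.Val` and `GO_ID` stand in for whatever the annotated limma table actually contains, and the GO IDs are assumed to sit in one delimited string per gene. topGO expects the gene scores as a named numeric vector and the annotation as a named list of GO ID vectors, not as a data.frame.

```r
# Hypothetical stand-in for an annotated topTable() result; all values are invented.
tt <- data.frame(
  GENE_SYMBOL = c("AT1G01010", "AT1G01020", "AT1G01030", "AT1G01040"),
  adj.P.Val   = c(0.001, 0.20, 0.03, 0.75),
  GO_ID       = c("GO:0006355|GO:0009651", "GO:0006950", "GO:0009651", ""),
  stringsAsFactors = FALSE
)

# Named numeric vector: one score (here the adjusted p-value) per gene.
all_genes <- setNames(tt$adj.P.Val, tt$GENE_SYMBOL)

# Named list: gene ID -> character vector of GO IDs.
# The regular expression pulls out every "GO:" followed by seven digits,
# so the exact delimiter used in the GO_ID column does not matter.
gene2GO <- regmatches(tt$GO_ID, gregexpr("GO:[0-9]{7}", tt$GO_ID))
names(gene2GO) <- tt$GENE_SYMBOL

# Drop genes that have no GO annotation at all.
gene2GO <- gene2GO[lengths(gene2GO) > 0]

str(gene2GO)
```

If the annotation table instead holds one row per gene/GO pair, `split(df$GO_ID, df$GENE_SYMBOL)` would produce the same kind of list.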

ADD REPLY • link 5.1 years ago I.V. Lynn • 0

You may follow this previous example on Biostars: https://www.biostars.org/p/250927/#250936
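A minimal sketch of building the object with a custom annotation follows, using topGO's `annFUN.gene2GO` annotation function together with the `gene2GO` argument. The gene-to-GO assignments and p-values are invented for illustration, and `sel_fun` is a hypothetical name. Note that `allGenes` is meant to hold every gene that was tested (the gene universe), so the table would probably need to come from `topTable()` without the `lfc` and `p.value` cut-offs; `geneSel` then marks the significant genes inside that universe.

```r
# Toy inputs; the gene-to-GO assignments and p-values are invented.
all_genes <- c(AT1G01010 = 0.001, AT1G01020 = 0.20, AT1G01030 = 0.03,
               AT1G01040 = 0.75,  AT1G01050 = 0.004, AT1G01060 = 0.60)

gene2GO <- list(
  AT1G01010 = c("GO:0006355", "GO:0009651"),
  AT1G01020 = "GO:0006950",
  AT1G01030 = "GO:0009651",
  AT1G01040 = "GO:0015979",
  AT1G01050 = c("GO:0009651", "GO:0009409"),
  AT1G01060 = "GO:0015979"
)

# Selection function: TRUE for genes counted as significant.
sel_fun <- function(p) p < 0.05

# Run the topGO part only when the package is installed.
if (requireNamespace("topGO", quietly = TRUE)) {
  suppressPackageStartupMessages(library(topGO))

  GOdata <- new("topGOdata",
                ontology = "BP",
                allGenes = all_genes,       # named numeric vector, full universe
                geneSel  = sel_fun,         # picks significant genes from allGenes
                annot    = annFUN.gene2GO,  # use a custom gene -> GO mapping
                gene2GO  = gene2GO,         # named list of GO ID vectors
                nodeSize = 1)               # toy value; real data would use e.g. 10

  res <- runTest(GOdata, algorithm = "classic", statistic = "fisher")
  print(head(sort(score(res))))
}
```

With real data, the toy vectors would be replaced by the unfiltered limma p-values and the list built from the Agilent annotation.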

ADD REPLY • link 5.1 years ago Kevin Blighe ★ 3.9k