ISO help with goana
1
0
Entering edit mode
mat149 ▴ 80
@mat149-11450
Last seen 4 days ago
United States

Hello,

I am running a microarray analysis using a zebrafish platform pd.zebgene.1.1.st). After performing the moderated t-test in limma (lmFit/eBayes), I am using the resulting MArrayLM object to map the returned Entrez ID's to "dre" (KEGG) and "dr" (GO) databases.

I need help with:

1) Restricting only the probesets with: adjusted P.value < 0.01 and logFC > 1.5, < -1.5 from the fit object when passing the goana function. Can this be done using the topTable?

 

2) How to determine which genes are assigned to each enriched GO term?

For example I would like to describe which differentially expressed probesets are returned from (GO:0050877)

  Term Ont N Up Down P.Up P.Down
GO:0050877 neurological system process BP 258 65 47 1.57E-20 0.029498
GO:0008066 glutamate receptor activity MF 22 0 18 1 1.44E-12
GO:0005096 GTPase activator activity MF 114 2 41 0.997429 2.5E-09
GO:0016917 GABA receptor activity MF 16 0 13 1 2.56E-09

 

Thanks for any help you can provide,

Matt

 

library(limma)
design = model.matrix(~ 0 + f)
colnames(design)=c("control","morphant","rescue")
data.fit = lmFit(eset,design)
contrast.matrix <- makeContrasts(morphant-control,rescue-control,morphant-rescue,levels=design)
data.fit.con <- contrasts.fit(data.fit,contrast.matrix)
data.fit.eb <- eBayes(data.fit.con)

entzzvec<-as.vector(data.fit.eb$genes$ENTREZID)
MOkegg<-kegga(data.fit.eb,coef=1,geneid=entzzvec,FDR=0.01,species.KEGG="dre",convert=TRUE)
MOtop<-topKEGG(MOkegg, sort = NULL, number = 50, truncate.path = NULL)
limma • 1.5k views
ADD COMMENT
0
Entering edit mode

The code you give doesn't use the goana() function. There is also no need for the as.vector() step.

ADD REPLY
0
Entering edit mode

sorry, I was not paying attention and posted the kegg script by accident.

entzzvec<-as.vector(data.fit.eb$genes$ENTREZID)

MOgo <- goana(data.fit.eb, coef=1,FDR=0.01,geneid=data.fit.eb$genes$ENTREZID,species="Dr")
topMOgo<-topGO(MOgo,number=50)

I had to use the as.vector step to map gene symbols to Entrez ID's in org.Dr.eg.db:

eset<-rma(CELdat, background=TRUE, normalize=TRUE, subset=NULL, target="core")    

library(affycoretools)

eset <- annotateEset(eset, annotation(eset))

library(org.Dr.eg.db)
fd <- fData(eset)
fd$ENTREZID <- mapIds(org.Dr.eg.db, as.character(fd$SYMBOL), "ENTREZID","SYMBOL",multiVals="first")
fData(eset) <- fd

 

ADD REPLY
0
Entering edit mode

Actually you don't ever need as.vector(). You used as.character() for the mapping, not as.vector(). With goana() you could use:

goana(data.fit.eb, coef=1, FDR=0.01, geneid="ENTREZID", species="Dr")

which is a bit shorter.

ADD REPLY
0
Entering edit mode

Thank you, Gordon.  With your help, I figured out How to determine which genes are assigned to each enriched GO term.  I will do some reading regarding the comments/links you posted and try to determine which method might be best for my dataset

ADD REPLY
1
Entering edit mode
@gordon-smyth
Last seen 1 hour ago
WEHI, Melbourne, Australia

1) Restricting only the probesets with: adjusted P.value < 0.01 and logFC > 1.5, < -1.5 from the fit object when passing the goana function.

No, goana() doesn't allow you to apply a logFC cutoff as well as an FDR cutoff. The reason why it isn't allowed is that log-fold-change cutoffs interact badly with FDR cutoffs, making the FDR calculations no longer valid. See the Note on the help page for topTable() which recommends against this.

If you want to give preference to genes with larger fold changes, use treat() instead. This approach incorporates a fold change threshold into the statistical test. See https://f1000research.com/articles/5-1438 or https://f1000research.com/articles/5-1408 for example workflows.

2) How to determine which genes are assigned to each enriched GO term?

There are a few ways to do this. One way would be to create a data.frame of gene-GO links:

> library(org.Dr.eg.db)
> Gene.GO <- toTable(org.Dr.egGO2ALLEGS)

and then to proceed as you did for KEGG pathways: How to select the genes mapped to an enriched KEGG pathway (kegga) The data.frame Gene.GO is equivalent to the data.frame GK that we made for KEGG, although the column names are slightly different.

 

ADD COMMENT

Login before adding your answer.

Traffic: 550 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6