Question

ISO help with goana

0

Entering edit mode

mat149 ▴ 80

@mat149-11450

Last seen 4 months ago

United States

Hello,

I am running a microarray analysis using a zebrafish platform pd.zebgene.1.1.st). After performing the moderated t-test in limma (lmFit/eBayes), I am using the resulting MArrayLM object to map the returned Entrez ID's to "dre" (KEGG) and "dr" (GO) databases.

I need help with:

1) Restricting only the probesets with: adjusted P.value < 0.01 and logFC > 1.5, < -1.5 from the fit object when passing the goana function. Can this be done using the topTable?

2) How to determine which genes are assigned to each enriched GO term?

For example I would like to describe which differentially expressed probesets are returned from (GO:0050877)

	Term	Ont	N	Up	Down	P.Up	P.Down
GO:0050877	neurological system process	BP	258	65	47	1.57E-20	0.029498
GO:0008066	glutamate receptor activity	MF	22	0	18	1	1.44E-12
GO:0005096	GTPase activator activity	MF	114	2	41	0.997429	2.5E-09
GO:0016917	GABA receptor activity	MF	16	0	13	1	2.56E-09

Thanks for any help you can provide,

Matt

library(limma)
design = model.matrix(~ 0 + f)
colnames(design)=c("control","morphant","rescue")
data.fit = lmFit(eset,design)
contrast.matrix <- makeContrasts(morphant-control,rescue-control,morphant-rescue,levels=design)
data.fit.con <- contrasts.fit(data.fit,contrast.matrix)
data.fit.eb <- eBayes(data.fit.con)

entzzvec<-as.vector(data.fit.eb$genes$ENTREZID)
MOkegg<-kegga(data.fit.eb,coef=1,geneid=entzzvec,FDR=0.01,species.KEGG="dre",convert=TRUE)
MOtop<-topKEGG(MOkegg, sort = NULL, number = 50, truncate.path = NULL)

limma • 1.8k views

ADD COMMENT • link 9.2 years ago mat149 ▴ 80

0

Entering edit mode

The code you give doesn't use the goana() function. There is also no need for the as.vector() step.

ADD REPLY • link 9.2 years ago Gordon Smyth 53k

0

Entering edit mode

sorry, I was not paying attention and posted the kegg script by accident.

entzzvec<-as.vector(data.fit.eb$genes$ENTREZID)

MOgo <- goana(data.fit.eb, coef=1,FDR=0.01,geneid=data.fit.eb$genes$ENTREZID,species="Dr")
topMOgo<-topGO(MOgo,number=50)

I had to use the as.vector step to map gene symbols to Entrez ID's in org.Dr.eg.db:

eset<-rma(CELdat, background=TRUE, normalize=TRUE, subset=NULL, target="core")    

library(affycoretools)

eset <- annotateEset(eset, annotation(eset))

library(org.Dr.eg.db)
fd <- fData(eset)
fd$ENTREZID <- mapIds(org.Dr.eg.db, as.character(fd$SYMBOL), "ENTREZID","SYMBOL",multiVals="first")
fData(eset) <- fd

ADD REPLY • link 9.2 years ago mat149 ▴ 80

0

Entering edit mode

Actually you don't ever need as.vector(). You used as.character() for the mapping, not as.vector(). With goana() you could use:

goana(data.fit.eb, coef=1, FDR=0.01, geneid="ENTREZID", species="Dr")

which is a bit shorter.

ADD REPLY • link 9.2 years ago Gordon Smyth 53k

0

Entering edit mode

Thank you, Gordon. With your help, I figured out How to determine which genes are assigned to each enriched GO term. I will do some reading regarding the comments/links you posted and try to determine which method might be best for my dataset

ADD REPLY • link 9.2 years ago mat149 ▴ 80

score 1 · Answer 1 · 2016-12-20

1) Restricting only the probesets with: adjusted P.value < 0.01 and logFC > 1.5, < -1.5 from the fit object when passing the goana function.

No, goana() doesn't allow you to apply a logFC cutoff as well as an FDR cutoff. The reason why it isn't allowed is that log-fold-change cutoffs interact badly with FDR cutoffs, making the FDR calculations no longer valid. See the Note on the help page for topTable() which recommends against this.

If you want to give preference to genes with larger fold changes, use treat() instead. This approach incorporates a fold change threshold into the statistical test. See https://f1000research.com/articles/5-1438 or https://f1000research.com/articles/5-1408 for example workflows.

2) How to determine which genes are assigned to each enriched GO term?

There are a few ways to do this. One way would be to create a data.frame of gene-GO links:

> library(org.Dr.eg.db)
> Gene.GO <- toTable(org.Dr.egGO2ALLEGS)

and then to proceed as you did for KEGG pathways: How to select the genes mapped to an enriched KEGG pathway (kegga) The data.frame Gene.GO is equivalent to the data.frame GK that we made for KEGG, although the column names are slightly different.