How to get list of genes that are DE from the goseq analysis results
3
0
Entering edit mode
@darshinimamindla-8843
Last seen 8.7 years ago
United States

Hello All,

I have performed differential expression analysis of RNAseq data for Colon cancer dataset taken  from TCGA. I got a list of differentially expressed genes by using edgeR. I also performed the goseq analysis and got the results for it. Now, I am trying to extract only those genes that are DE from my list of DE list given to goseq after the enrichment analysis. 

I used the following code, but I get an error that is posted below:

#Goseq analysis
genes=as.integer(p.adjust(de.cmn$table$PValue[de.cmn$table$logFC!=0],method="BH")<.05)

names(genes)=row.names(de.cmn$table[de.cmn$table$logFC!=0,])
table(genes)

pwf=nullp(genes,"hg19","knownGene")

GO.wall=goseq(pwf,"hg19","knownGene")

enriched=GO.wall$category[p.adjust(GO.wall$over_represented_pvalue, method="BH")<.05]

biocLite("org.Hs.eg.db")
library(org.Hs.eg.db)
biocLite("GO.db")
library(GO.db)
biocLite("annotate")
library(annotate)
sigCats <- GO.wall[which(GO.wall[,2] < 0.05),]
cats <- sigCats$category
terms <- stack(lapply(mget(cats, GO:0005615, ifnotfound=NA), extracellular space))

ERROR: Error: unexpected symbol in "terms <- stack(lapply(mget(cats, GO:0005615, ifnotfound=NA), extracellular space"

Could someone tell me what is the error about. I would really appreciate your help and thanks in advance.

 

 

 

limma goseq mget edgeR • 2.7k views
ADD COMMENT
0
Entering edit mode

I don't quite follow what you are trying to do. You say that you want to "extract only those genes that are DE from my list of DE" genes, but that doesn't make sense because obviously all the genes in your list of DE genes are DE.

Do you perhaps want to know which of your DE genes belong to a particular GO category?

Are you using Entrez Gene IDs? If not, what type of IDs are you using?

ADD REPLY
0
Entering edit mode

Hello Smyth,

Thanks for the reply. Yes, I am interested in getting only those DE expressed present in that particular category. And, yes I am using Entrez Gene IDs. 

ADD REPLY
0
Entering edit mode
@steve-lianoglou-2771
Last seen 22 months ago
United States

This is not valid R syntax:

terms <- stack(lapply(mget(cats, GO:0005615, ifnotfound=NA), extracellular space))

You might have meant to wrap GO:0005615 and extracellular space in quotes, like so:

terms <- stack(lapply(mget(cats, "GO:0005615", ifnotfound=NA), "extracellular space"))

which will fix your syntax error, but once that's fixed, it's not clear that your mget call will work.

Are you trying to find the list of genes that you found to be DE that belong to a particular GO category that was also found significant via goseq?

 

ADD COMMENT
0
Entering edit mode
@darshinimamindla-8843
Last seen 8.7 years ago
United States

Hi @Steve Lianoglou,

Thanks for the reply. As you said I have put quotes for the GO:TERM and "extracellular space" and still got the error. Yes, I wanted to get the list of genes that are DE from that particular GO category which are found significant through goseq analysis. Could you please tell me how to proceed further. Thanks in advance.

 

ADD COMMENT
0
Entering edit mode
@gordon-smyth
Last seen 8 hours ago
WEHI, Melbourne, Australia

There are many ways to get the genes belonging to a particular GO term, but this is the method I find the quickest:

library(org.Hs.eg.db)
x <- org.Hs.egGO2ALLEGS
Rkeys(x) <- "GO:0005615"
EntrezIDs <- mappedLkeys(x)

Now EntrezIDs contains the Entrez Gene IDs of all the genes in this GO term.

To examine which of these genes are DE, you can subset your edgeR object:

i <- rownames(de.cmn) %in% EntrezIDs
de.cmn[i,]

 

ADD COMMENT

Login before adding your answer.

Traffic: 737 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6