Can we use Entrez IDs for GOSeq?
@mehmet-ilyas-cosacak-9020
Germany/Dresden/ CRTD - DZNE

Hi,

I have microarray data on which I would like to run a GO analysis using GOSeq. However, I only have Entrez gene IDs. When I try to convert the Entrez IDs to Ensembl IDs, I lose several genes from the universe, either because of 1:many mappings or because there is no matching Ensembl ID. Is there a way to use Entrez IDs as the names of the gene universe?

The data frame also has Ensembl transcript IDs, gene names, and RefSeq IDs, but no Ensembl gene IDs.

I tried all of the options below, but none of them worked:

pwf <- nullp(uniGenes, "mm10", "ensGene",bias.data = lengthData)
pwf <- nullp(uniGenes, "mm10", "ensGene")
pwf <- nullp(uniGenes, "mm10", "knownGene",bias.data = lengthData)
pwf <- nullp(uniGenes, "mm10", "knownGene")

Any comments or suggestions would be really helpful.

Best,

ilyas.

 

@james-w-macdonald-5106
United States

There is no length bias for microarray data, so there is no reason to use a package designed for RNA-Seq to do your GO analysis. You can use GOstats directly, as it requires Entrez Gene IDs anyway.
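As a rough illustration of the GOstats route, the sketch below runs a hypergeometric over-representation test directly on Entrez Gene IDs. The `universe` and `selected` vectors are placeholders (all GO-annotated mouse genes and a random sample of them), not the poster's data; in a real analysis they would be the Entrez IDs of the genes on the array and of the differentially expressed genes. The ontology and p-value cutoff are arbitrary choices for the example.

```r
# Sketch only: the gene lists are placeholders, not the poster's data.
library(GOstats)
library(org.Mm.eg.db)

# Universe: every mouse Entrez Gene ID with at least one GO annotation.
# For a real analysis, use the Entrez IDs of the genes on the array.
universe <- mappedkeys(org.Mm.egGO)

# Placeholder "selected" genes: a random sample standing in for the
# differentially expressed genes.
set.seed(1)
selected <- sample(universe, 200)

params <- new("GOHyperGParams",
              geneIds         = selected,
              universeGeneIds = universe,
              annotation      = "org.Mm.eg.db",
              ontology        = "BP",      # biological process (assumed choice)
              pvalueCutoff    = 0.05,
              conditional     = FALSE,
              testDirection   = "over")

res <- hyperGTest(params)
head(summary(res))
```

No conversion to Ensembl IDs is needed here, so no genes are lost to 1:many or missing mappings.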


Hi James,

Thank you very much! That is a good point that I should always keep in mind! Thanks again.

However, I am trying to compare a microarray data set with RNA-Seq data that I have, and I would like to use the same gene ontology package for both to find which GO terms are enriched or over-represented. Moreover, the microarray data are from mouse and the RNA-Seq data are from Danio rerio. I am trying to find common pathways and shared gene set enrichment. Do you have any suggestions for that, in particular a gene ontology package or a general strategy?

Best,

ilyas.

 


I can think of two ways you could do what you want. First is to do the GO analysis that you are proposing, and see what terms are over-represented in each experiment. Second would be to do some form of gene set testing on both experiments and look for consistent pathways. In other words, you could use something like romer from the limma package on both experiments (on a reasonable battery of gene sets - I am not sure all of the Broad sets are particularly useful) and then look for overlaps.
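The second approach could look roughly like the sketch below. Everything in it is simulated: the two expression matrices, the two-group design, and the gene sets `setA` and `setB` are made up for illustration, and the helper names `simulate_expr` and `run_romer` are hypothetical. It also assumes both experiments already share gene identifiers; for mouse versus zebrafish, the gene sets would first have to be mapped through orthologs, which is not shown.

```r
# Sketch only: simulated data and made-up gene sets.
library(limma)
set.seed(42)

# Simulate one experiment: 1000 genes, 3 control vs 3 treated samples,
# with the first 20 genes shifted upward in the treated group.
simulate_expr <- function() {
  y <- matrix(rnorm(1000 * 6), nrow = 1000,
              dimnames = list(paste0("g", 1:1000), NULL))
  y[1:20, 4:6] <- y[1:20, 4:6] + 2
  y
}

group  <- factor(rep(c("ctrl", "trt"), each = 3))
design <- model.matrix(~ group)

# Placeholder gene sets; real ones would come from a curated collection.
gene_sets <- list(setA = paste0("g", 1:20),
                  setB = paste0("g", 501:540))

# Run romer on one expression matrix, testing the treatment coefficient.
run_romer <- function(y) {
  index <- ids2indices(gene_sets, rownames(y))
  romer(y, index = index, design = design, contrast = 2, nrot = 999)
}

res1 <- run_romer(simulate_expr())
res2 <- run_romer(simulate_expr())

# Gene sets called "up" in both experiments at an arbitrary 0.05 cutoff.
up1 <- rownames(res1)[res1[, "Up"] < 0.05]
up2 <- rownames(res2)[res2[, "Up"] < 0.05]
intersect(up1, up2)
```

Each `romer` result is a matrix with one row per gene set and columns for the set size and the "Up", "Down", and "Mixed" p-values, so the overlap step is just a comparison of row names.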
