Hi,
I have microarray data on which I would like to run a GO analysis using goseq. However, I only have Entrez gene IDs. When I try to convert the Entrez IDs to Ensembl IDs, I lose several genes from the universe, either because of 1:many mappings or because there is no Ensembl ID. Is there a way to use Entrez IDs as the names of the gene universe?
The data frame also has Ensembl transcript IDs, gene names, and RefSeq IDs, but not Ensembl gene IDs.
I tried all of the options below, but none of them worked:
pwf <- nullp(uniGenes, "mm10", "ensGene", bias.data = lengthData)
pwf <- nullp(uniGenes, "mm10", "ensGene")
pwf <- nullp(uniGenes, "mm10", "knownGene", bias.data = lengthData)
pwf <- nullp(uniGenes, "mm10", "knownGene")
Any comments or suggestions would be really helpful.
best,
ilyas.
Hi James,
Thank you very much! That is a good point that I should always keep in mind! Thanks again.
However, I am trying to compare microarray data with RNA-Seq data that I have. I want to use the same gene ontology package on both to find which GO terms are enriched or over-represented. Moreover, the microarray data is from mouse and the RNA-Seq data is from Danio rerio. I am trying to find common pathways and enriched gene sets. Do you have any suggestions for that, especially a gene ontology package or a strategy?
best,
ilyas.
I can think of two ways you could do what you want. First is to do the GO analysis that you are proposing, and see what terms are over-represented in each experiment. Second would be to do some form of gene set testing on both experiments and look for consistent pathways. In other words, you could use something like romer from the limma package on both experiments (on a reasonable battery of gene sets - I am not sure all of the Broad sets are particularly useful) and then look for overlaps.
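As a rough illustration of the "look for overlaps" step under the first approach, here is a minimal base R sketch. It assumes each experiment has already produced a result table with a category column of GO IDs and an over_represented_pvalue column (the column names mimic goseq output, but the tables, p-values, helper function, and the 0.05 cutoff are all made up for illustration and are not from this thread).

```r
# Hypothetical result tables: one row per GO term with an
# over-representation p-value. Real tables would come from each analysis;
# these p-values are invented purely for illustration.
mouse_res <- data.frame(
  category = c("GO:0006955", "GO:0008283", "GO:0006915", "GO:0007049"),
  over_represented_pvalue = c(0.0004, 0.0300, 0.2000, 0.0010),
  stringsAsFactors = FALSE
)
zebrafish_res <- data.frame(
  category = c("GO:0006955", "GO:0007049", "GO:0006915", "GO:0016049"),
  over_represented_pvalue = c(0.0020, 0.0008, 0.0100, 0.5000),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the GO IDs whose Benjamini-Hochberg
# adjusted p-value falls below the chosen cutoff.
significant_terms <- function(res, cutoff = 0.05) {
  adj <- p.adjust(res$over_represented_pvalue, method = "BH")
  res$category[adj < cutoff]
}

mouse_sig <- significant_terms(mouse_res)
zebrafish_sig <- significant_terms(zebrafish_res)

# GO terms over-represented in both experiments.
shared_terms <- intersect(mouse_sig, zebrafish_sig)
print(shared_terms)
```

Because GO IDs are species-independent, the term lists from mouse and zebrafish can be intersected directly, without mapping genes between the two species first. The same intersect step would apply to gene set names from a romer-style analysis, provided both experiments were tested against comparable sets.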