Hello! I'm using the RnaSeqGeneEdgeRQL workflow to analyse some RNA-seq data. I have nearly reached the end of my analysis; the only step left is the pathway analysis, to put the differentially expressed genes into context.
The workflow itself is designed for mouse genes and, at one point, suggests annotating the genes through their Entrez Gene IDs using the org.Mm.eg.db package (for mouse), as follows:
library(org.Mm.eg.db)
y$genes$Symbol <- mapIds(org.Mm.eg.db, rownames(y), keytype="ENTREZID", column="SYMBOL")
head(y$genes)
However, I'm working on a less common organism for which no annotation package is available, so I skipped this step. I was still able to perform the statistical analysis without adding annotation data: I only needed a .gff file to count the reads for each gene. Now, though, the GO and KEGG analyses require the identified genes to have Entrez IDs rather than the names they had in the .gff file. Is there any way I can add my own IDs? Specifically, can I retrieve Gene IDs for the species I'm working on, rather than using the default mouse package?
This is what I get if I look the genes up on Entrez, but I can't retrieve the IDs, and I have no idea how to turn this list into an importable file...
The Entrez IDs aren't included in my gff.
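To make the goal concrete, here is a minimal base R sketch of the kind of ID swap I have in mind, assuming a lookup table from .gff gene names to Entrez Gene IDs could be obtained somehow. The table `id_map`, the gene names and the ID values below are all made-up placeholders, not real data for my organism, and the `genes` data frame only stands in for `y$genes`.

```r
# Hypothetical lookup table: .gff gene names -> Entrez Gene IDs.
# All names and IDs here are invented placeholders.
id_map <- data.frame(
  gff_id = c("gene0001", "gene0002", "gene0003"),
  entrez = c("900000001", "900000002", "900000003"),
  stringsAsFactors = FALSE
)

# Stand-in for rownames(y): the gene names used when counting reads.
gff_ids <- c("gene0002", "gene0004", "gene0001")

# match() returns NA for genes that are missing from the lookup table.
entrez <- id_map$entrez[match(gff_ids, id_map$gff_id)]

# Stand-in for y$genes: keep the original name next to the new ID.
genes <- data.frame(GffID = gff_ids, EntrezID = entrez,
                    stringsAsFactors = FALSE)

# Genes without an Entrez ID would have to be dropped before GO/KEGG testing.
annotated <- genes[!is.na(genes$EntrezID), ]
print(annotated)
```

The open question is where a table like `id_map` would come from for a species without an org package.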
The goana function, which I'm going to use for the GO analysis, works with organisms for which a package is available (like Mm, which refers to the mouse genome), but it gives no results because the Entrez IDs are missing from the tr object.
go <- goana(tr, species="Mm")
And so does kegga, for KEGG pathway analysis.
keg <- kegga(tr, species="Mm")
topKEGG(keg, n=15, truncate=34)
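As far as I can tell from the limma help page, kegga also accepts user-supplied annotation through its `gene.pathway` and `pathway.names` arguments, which might avoid the need for an organism package. The following is only a sketch of that idea on a plain character vector of gene IDs: the gene names, pathway IDs and pathway descriptions are invented, and whether this is the right route for a non-model organism is exactly what I am unsure about.

```r
library(limma)

# Invented gene-to-pathway table: column 1 = gene ID, column 2 = pathway ID.
gene.pathway <- data.frame(
  GeneID = paste0("gene", 1:10),
  PathwayID = rep(c("pathA", "pathB"), times = c(4, 6)),
  stringsAsFactors = FALSE
)

# Invented pathway descriptions: column 1 = pathway ID, column 2 = name.
pathway.names <- data.frame(
  PathwayID = c("pathA", "pathB"),
  Description = c("Made-up pathway A", "Made-up pathway B"),
  stringsAsFactors = FALSE
)

# Stand-in for the differentially expressed genes taken from tr.
de <- c("gene1", "gene2", "gene3")

# With both tables supplied, no species package should be needed.
keg <- kegga(de, universe = gene.pathway$GeneID,
             gene.pathway = gene.pathway, pathway.names = pathway.names)
print(keg)
```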
This is what I get and, as you can see, my previous tr (at the top of the screenshot) doesn't have Gene IDs but the gene numbers from my .gff file.
This is what a tr object is supposed to look like in the workflow, with the Gene ID as the first number of each row.
Thanks in advance!