Hi All,
Sorry if this is too basic. I have read through a bunch of forum posts already, but I couldn't find an answer to this.
I want to run GOstats to calculate GO term enrichment in differentially expressed genes (DEGs) vs. a background gene universe, which is the reference transcriptome. [I work with a non-model organism.]
I have GO annotations from Blast2GO for my DEGs, exported in GOstat format, which writes the GO term IDs without the "GO:00.." prefix, e.g. "3567" versus "GO:0003567", which is how they usually appear.
Because of this, I formatted the GO terms the same way in the data.frame that contains the GO terms and evidence codes for the gene universe.
However, in the GOstats vignette for unsupported model organisms, the example shows the GO terms in the goframeData data.frame in the usual "GO:00.." format.
Now I'm confused about how my GO terms should be formatted for GOstats to work. I haven't tried to run the HyperG test yet because I want to be sure I have all the input files I need first. Can anyone please tell me which format the GO terms need to be in? If it weren't for the exported format of the Blast2GO lists, I probably wouldn't even have thought about this.
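For reference, the two formats mentioned above differ only by the "GO:" prefix and zero-padding to seven digits, so converting between them is mechanical. Below is a minimal sketch in Python, used here purely for the string handling. The function names `to_full_go_id` and `to_short_go_id` are made up for illustration and are not part of GOstats or Blast2GO. It does not settle which form GOstats expects. It only shows that either form can be derived from the other.

```python
import re

def to_full_go_id(short_id):
    """Turn a bare numeric GO ID such as "3567" into "GO:0003567".

    Hypothetical helper: assumes the input is the numeric part of a GO ID
    with leading zeros stripped (at most 7 digits).
    """
    digits = str(short_id).strip()
    if re.fullmatch(r"[0-9]{1,7}", digits) is None:
        raise ValueError(f"not a bare numeric GO ID: {short_id!r}")
    # Standard GO IDs are "GO:" followed by exactly seven digits.
    return "GO:" + digits.zfill(7)

def to_short_go_id(full_id):
    """Turn "GO:0003567" back into the bare form "3567"."""
    match = re.fullmatch(r"GO:([0-9]{7})", str(full_id).strip())
    if match is None:
        raise ValueError(f"not a standard GO ID: {full_id!r}")
    # int() drops the leading zeros.
    return str(int(match.group(1)))

print(to_full_go_id("3567"))
print(to_short_go_id("GO:0003567"))
```

In R, the same padding could presumably be done on an integer vector with `sprintf("GO:%07d", as.integer(x))`, assuming the column holds only the bare numeric IDs.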
Cheers
Steph
I assume you are talking about this blast2go? If so, questions about their output should be directed to them, not us.