I am working on a non-model species, and I want to use GOseq to perform GO and KEGG enrichment analysis.
I have identified a list of DEGs using DESeq2 (FDR <= 0.05, |log2FC| >= 1). However, some of those DEGs (say 50%) are not associated with any GO or KEGG annotation. Will this affect the results much?
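To make the selection concrete, here is a minimal sketch of the filter I mean and of how I estimate the unannotated fraction. It is plain Python rather than DESeq2 or GOseq code, and the names `select_degs`, `annotated_fraction`, `results`, and `gene2go` are hypothetical, as is the toy data.

```python
def select_degs(results, fdr_cutoff=0.05, lfc_cutoff=1.0):
    """Return gene IDs with FDR <= fdr_cutoff and |log2FC| >= lfc_cutoff.

    `results` maps gene ID -> (adjusted p-value, log2 fold change).
    Genes without an adjusted p-value (None) are skipped.
    """
    return [
        gene
        for gene, (padj, lfc) in results.items()
        if padj is not None and padj <= fdr_cutoff and abs(lfc) >= lfc_cutoff
    ]


def annotated_fraction(genes, gene2go):
    """Fraction of `genes` that have at least one annotation in `gene2go`."""
    if not genes:
        return 0.0
    annotated = sum(1 for g in genes if gene2go.get(g))
    return annotated / len(genes)


# Toy data, invented purely for illustration.
results = {
    "geneA": (0.001, 2.3),   # passes both cutoffs
    "geneB": (0.20, 3.0),    # fails the FDR cutoff
    "geneC": (0.04, -1.5),   # passes (down-regulated)
    "geneD": (0.03, 0.4),    # fails the fold-change cutoff
    "geneE": (None, 5.0),    # no adjusted p-value, skipped
}
gene2go = {"geneA": ["GO:0008150"], "geneB": ["GO:0003674"]}

degs = select_degs(results)
print(degs)
print(annotated_fraction(degs, gene2go))
```

In this toy case half of the selected DEGs have no annotation, which mirrors the situation I am asking about.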
In addition, the GO annotation file contains GO levels from 1 to 15. Do I need to input all of those levels, or can I input only the relatively general ones (say levels 2-6)? Some GO terms have only one or two associated genes; is it meaningful to include them? I have tried GOEAST and GAGE for enrichment analysis, and both apply a cutoff on the number of genes associated with a GO term. I am not sure whether GOseq has a gene set size option. I am also unsure whether including a large number of GO terms affects the enrichment analysis and makes the p-values larger.
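If there is no such option, one workaround I am considering is to drop very small (or very large) GO terms from the annotation before running the enrichment. Below is a minimal sketch of that pre-filter in plain Python. The function names and the default bounds of 5 and 500 are my own assumptions, not defaults taken from GOseq, GOEAST, or GAGE, and the GO IDs are toy placeholders.

```python
from collections import defaultdict


def invert(gene2go):
    """Turn a gene -> [GO terms] mapping into GO term -> {genes}."""
    go2genes = defaultdict(set)
    for gene, terms in gene2go.items():
        for term in terms:
            go2genes[term].add(gene)
    return dict(go2genes)


def filter_by_size(go2genes, min_size=5, max_size=500):
    """Keep only terms whose gene count lies within [min_size, max_size]."""
    return {
        term: genes
        for term, genes in go2genes.items()
        if min_size <= len(genes) <= max_size
    }


# Toy annotation, invented purely for illustration.
gene2go = {
    "g1": ["GO:0000001", "GO:0000002"],
    "g2": ["GO:0000001"],
    "g3": ["GO:0000001", "GO:0000003"],
}

go2genes = invert(gene2go)
kept = filter_by_size(go2genes, min_size=2)
print(sorted(kept))  # only the term annotated to at least 2 genes remains
```

Filtering on the whole annotation (not just the DEGs) would reduce the number of terms tested, and therefore the multiple-testing burden, but I do not know whether this is considered good practice.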