I am having trouble setting up pathway enrichment analysis for my differentially expressed genes because of problems with gene IDs. I tried using DAVID, but the species I am working with is not listed there.
In brief, I used the annotation file (GFF3) from https://bacteria.ensembl.org/Desulfovibrio_alaskensis_g20_gca_000012665/Info/Index/ for RNA-seq data analysis. I have lists of up- and down-regulated genes, and I am trying to run pathway enrichment analysis on them using online platforms such as DAVID, CPDB, and ShinyGO. The problem is that none of these platforms accept the gene IDs from the annotation file I obtained from EBI. They all require Ensembl gene IDs, and I am unable to convert mine.
This isn't really a Bioconductor question because, well, you aren't using any Bioconductor packages. Anyway, this is probably a problem with the sites you are trying to use. For example, I did this:
> library(GenomicFeatures)
> tx <- makeTxDbFromGFF("Desulfovibrio_alaskensis_g20_gca_000012665.ASM1266v1.49.gff3.gz")
Import genomic features from the file as a GRanges object ... OK
Prepare the 'metadata' data frame ... OK
Make the TxDb object ... OK
> gns <- genes(tx)
> gns
GRanges object with 3258 ranges and 1 metadata column:
             seqnames          ranges strand |     gene_id
                <Rle>       <IRanges>  <Rle> | <character>
  Dde_0001 Chromosome        189-1499      + |    Dde_0001
  Dde_0002 Chromosome       1642-2796      + |    Dde_0002
  Dde_0003 Chromosome       2797-5190      + |    Dde_0003
  Dde_0004 Chromosome       5212-7647      + |    Dde_0004
  Dde_0005 Chromosome       7657-8469      + |    Dde_0005
       ...        ...             ...    ... .         ...
  Dde_4053 Chromosome 3628781-3628993      - |    Dde_4053
  Dde_4054 Chromosome 3723584-3723736      - |    Dde_4054
  Dde_4055 Chromosome 2785088-2785435      + |    Dde_4055
  Dde_4056 Chromosome 3148360-3148599      + |    Dde_4056
  Dde_4057 Chromosome 3371759-3372040      + |    Dde_4057
  seqinfo: 1 sequence from an unspecified genome; no seqlengths
> cat(head(names(gns), 20), sep = "\n")
I then pasted those IDs into DAVID, which promptly told me that it doesn't recognize them. But those are Ensembl gene IDs (try pasting any of them into the search at bacteria.ensembl.org)! So the most likely issue is that DAVID doesn't have GO terms for this particular bacterium.
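If a site has no annotation for your organism, one possible workaround is to run the over-representation test yourself, provided you can get a gene-to-term table (GO, KEGG or similar) for this genome from some other source. The following is only a rough base-R sketch of that idea using a hypergeometric test. Everything in it is made up for illustration: the `term2gene` table, the term names, the ten-gene `universe`, the `de_genes` list and the helper name `enrich_ora` are all hypothetical, and you would substitute your own annotation and results.

```r
# Toy, made-up data: replace with your own annotation and DE results.
# 'term2gene' is a hypothetical gene-to-term mapping; the term names
# are placeholders, not real GO or KEGG identifiers.
term2gene <- data.frame(
  term = c("termA", "termA", "termA", "termB", "termB", "termC"),
  gene = c("Dde_0001", "Dde_0002", "Dde_0003",
           "Dde_0003", "Dde_0004", "Dde_0005"),
  stringsAsFactors = FALSE
)

# Background: every gene that was tested for differential expression
# (here a made-up set of ten IDs).
universe <- sprintf("Dde_%04d", 1:10)

# Hypothetical list of up-regulated genes.
de_genes <- c("Dde_0001", "Dde_0002", "Dde_0003", "Dde_0009")

enrich_ora <- function(de_genes, universe, term2gene) {
  # Only genes in the background can contribute to the test.
  de_genes <- intersect(de_genes, universe)
  term2gene <- term2gene[term2gene$gene %in% universe, ]
  sets <- split(term2gene$gene, term2gene$term)

  res <- do.call(rbind, lapply(names(sets), function(tm) {
    set <- unique(sets[[tm]])
    k <- length(intersect(set, de_genes))  # DE genes annotated to the term
    m <- length(set)                       # background genes in the term
    n <- length(universe) - m              # background genes not in the term
    # Upper tail P(X >= k) of the hypergeometric distribution
    p <- phyper(k - 1, m, n, length(de_genes), lower.tail = FALSE)
    data.frame(term = tm, in_term = m, de_in_term = k, p_value = p)
  }))

  # Benjamini-Hochberg correction across the tested terms
  res$p_adj <- p.adjust(res$p_value, method = "BH")
  res[order(res$p_value), ]
}

result <- enrich_ora(de_genes, universe, term2gene)
print(result)
```

The choice of background matters here: using only the genes that were actually tested, rather than every gene in the genome, usually gives a more honest p-value.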