Gene ID conversion for pathway enrichment analysis of differentially expressed genes
@f291ed17

Hello Community Members

I am having trouble setting up pathway enrichment analysis for my differentially expressed genes because of problems with gene IDs. I tried using DAVID, but the species I am working with is not listed there.

In brief, I used the annotation file (GFF3) from https://bacteria.ensembl.org/Desulfovibrio_alaskensis_g20_gca_000012665/Info/Index/ for RNA-seq data analysis. I have the lists of up- and down-regulated genes. I am trying to run pathway enrichment analysis on them using online platforms such as DAVID, CPDB, and ShinyGO. The problem is that none of these platforms accepts the gene IDs from the annotation file I obtained from EBI. All of them require Ensembl gene IDs, and I am unable to convert mine.

Species: Desulfovibrio alaskensis G20

Genome Annotation file link: https://bacteria.ensembl.org/Desulfovibrio_alaskensis_g20_gca_000012665/Info/Index/

Any help will be greatly appreciated and very helpful in my research.

GeneIDConversion AnnotationForge • 141 views

This was originally posted on Biostars: https://www.biostars.org/p/9462159/

@james-w-macdonald-5106

This isn't really a Bioconductor question because, well, you aren't using any Bioconductor packages. Anyway, this is probably a problem with the sites you are trying to use. For example, I did this:

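> library(GenomicFeatures)  # provides makeTxDbFromGFF(); newer Bioconductor releases ship it in txdbmaker
> tx <- makeTxDbFromGFF("Desulfovibrio_alaskensis_g20_gca_000012665.ASM1266v1.49.gff3.gz")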
Import genomic features from the file as a GRanges object ... OK
Prepare the 'metadata' data frame ... OK
Make the TxDb object ... OK
> gns <- genes(tx)
> gns
GRanges object with 3258 ranges and 1 metadata column:
             seqnames          ranges strand |     gene_id
                <Rle>       <IRanges>  <Rle> | <character>
  Dde_0001 Chromosome        189-1499      + |    Dde_0001
  Dde_0002 Chromosome       1642-2796      + |    Dde_0002
  Dde_0003 Chromosome       2797-5190      + |    Dde_0003
  Dde_0004 Chromosome       5212-7647      + |    Dde_0004
  Dde_0005 Chromosome       7657-8469      + |    Dde_0005
       ...        ...             ...    ... .         ...
  Dde_4053 Chromosome 3628781-3628993      - |    Dde_4053
  Dde_4054 Chromosome 3723584-3723736      - |    Dde_4054
  Dde_4055 Chromosome 2785088-2785435      + |    Dde_4055
  Dde_4056 Chromosome 3148360-3148599      + |    Dde_4056
  Dde_4057 Chromosome 3371759-3372040      + |    Dde_4057
  -------
  seqinfo: 1 sequence from an unspecified genome; no seqlengths

> cat(head(names(gns), 20), sep = "\n")
Dde_0001
Dde_0002
Dde_0003
Dde_0004
Dde_0005
Dde_0009
Dde_0011
Dde_0012
Dde_0013
Dde_0014
Dde_0015
Dde_0016
Dde_0017
Dde_0019
Dde_0020
Dde_0021
Dde_0022
Dde_0023
Dde_0024
Dde_0028

I then pasted those IDs into DAVID, which promptly told me that it doesn't recognize them. But those are Ensembl Gene IDs (try pasting any of them into the search at bacteria.ensembl.org)! So the issue most likely is that DAVID doesn't have GO terms for this particular bacterium.
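
If a web tool has no annotation for the organism, the over-representation test itself can still be run locally once a gene-to-term mapping is available from some other source. Below is a minimal sketch in base R, not a substitute for a dedicated enrichment package. The mapping, the term names (termA, termB) and the list of differentially expressed genes are all made up for illustration; only the Dde_ identifiers come from the output above.

```r
# Toy gene-to-term mapping (hypothetical; real terms would have to come
# from an external annotation source).
gene2term <- data.frame(
  gene = c("Dde_0001", "Dde_0002", "Dde_0003", "Dde_0004",
           "Dde_0005", "Dde_0009", "Dde_0011", "Dde_0012"),
  term = c("termA", "termA", "termA", "termB",
           "termB", "termB", "termA", "termB"),
  stringsAsFactors = FALSE
)

# Hypothetical list of differentially expressed genes.
de_genes <- c("Dde_0001", "Dde_0002", "Dde_0003", "Dde_0004")

# Background: every gene that has at least one annotation.
universe <- unique(gene2term$gene)

enrich_one <- function(term_id) {
  in_term <- unique(gene2term$gene[gene2term$term == term_id])
  k <- sum(de_genes %in% in_term)    # DE genes carrying the term
  m <- length(in_term)               # background genes carrying the term
  n <- length(universe) - m          # background genes without the term
  N <- sum(de_genes %in% universe)   # DE genes present in the background
  # Upper tail of the hypergeometric distribution: P(X >= k)
  p <- phyper(k - 1, m, n, N, lower.tail = FALSE)
  data.frame(term = term_id, de_in_term = k, term_size = m, p_value = p)
}

res <- do.call(rbind, lapply(unique(gene2term$term), enrich_one))
res$p_adj <- p.adjust(res$p_value, method = "BH")  # multiple-testing correction
res
```

The choice of background matters: here it is restricted to annotated genes, which is one common convention, but using all genes that were tested for differential expression is another.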


Thank you for the clarification, James. I cannot find GO terms associated with these genes on UniProt or geneontology.org. Do you know if there is a way to generate GO terms from a list of genes? Any help would be greatly appreciated. Thank you.


You could try Blast2GO.


FWIW: in principle you can also obtain this info from UniProt with a manual query, after which you can download the results as a single file. The next step would be extracting the relevant columns from that file (Gene names (ordered locus) and Gene ontology IDs).
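
The extraction step could look roughly like the sketch below, in base R. It assumes a tab-separated export whose header row uses the column names "Gene Names (ordered locus)" and "Gene Ontology IDs", and that multiple GO IDs in one cell are separated by semicolons; check both against the actual file. The accessions and GO IDs in the stand-in table are placeholders, not real annotations.

```r
# Stand-in for a downloaded tab-separated export. All values are placeholders.
# For a real download, pass the file path to read.delim() instead of text=.
tsv <- paste(
  "Entry\tGene Names (ordered locus)\tGene Ontology IDs",
  "ACC1\tDde_0001\tGO:9999991; GO:9999992",
  "ACC2\tDde_0002\tGO:9999992",
  "ACC3\tDde_0003\t",
  sep = "\n"
)

# check.names = FALSE keeps the header text exactly as written in the file.
tab <- read.delim(text = tsv, check.names = FALSE,
                  stringsAsFactors = FALSE, quote = "")

locus <- tab[["Gene Names (ordered locus)"]]

# Split each cell into individual GO IDs; empty cells give zero-length vectors.
go <- strsplit(tab[["Gene Ontology IDs"]], ";\\s*")

# Long format: one row per (gene, GO ID) pair. Genes without GO IDs drop out.
gene2go <- data.frame(
  gene_id = rep(locus, lengths(go)),
  go_id   = unlist(go),
  stringsAsFactors = FALSE
)
gene2go
```

A long-format table like this is what most enrichment functions that accept custom annotation expect, and it can be written out with write.table() for reuse.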
