No enriched GO terms with more than 1000 genes
Ruixuan ▴ 10
@ruixuan-23626
Last seen 4.5 years ago
Kyoto University

Hi everyone, I'm running a GO enrichment analysis after finishing the differential expression tests with edgeR.

Previously, I compared group1 vs group2, group1 vs group3, and group1 vs group4.

The problem appeared in the group1 vs group4 comparison: 1740 genes are significantly upregulated in group4. However, when I ran the code below:

enrich.go.BP = enrichGO(gene = up_gene.4vs1$GeneID,
                        OrgDb = Acan.OrgDb,
                        keyType = "ENTREZID",
                        ont = "BP", pvalueCutoff = 0.01,
                        qvalueCutoff = 0.05, readable = TRUE)

The result contains no enriched terms. The same code worked well for the other comparisons against group1, so I don't think the code itself is the problem. Why do I get this result, and how can I fix it? Could it be that the list contains so many genes, spread across almost every category, that no term reaches statistical significance? Thank you in advance.
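On that last question: enrichGO runs an over-representation (hypergeometric) test for each term, so a long gene list whose GO composition mirrors the background can indeed yield no significant term. A minimal base-R sketch with made-up counts (the helper name `ora_pvalue` and every number below are illustrative assumptions, not values from this data set):

```r
# Over-representation p-value for one GO term: P(X >= k) under the
# hypergeometric distribution.
#   k: genes in the list annotated to the term
#   n: size of the gene list (annotated genes only)
#   M: background genes annotated to the term
#   N: size of the annotated background
ora_pvalue <- function(k, n, M, N) {
  phyper(k - 1, M, N - M, n, lower.tail = FALSE)
}

# Hypothetical background: 10000 annotated genes, 200 (2%) carry the term.
N <- 10000
M <- 200

# Short list in which 10% of the genes carry the term: clearly enriched.
p_small <- ora_pvalue(k = 10, n = 100, M = M, N = N)

# Long list in which about 2% of the genes carry the term, i.e. the same
# proportion as the background: nothing to detect, however long the list is.
p_large <- ora_pvalue(k = 35, n = 1740, M = M, N = N)

print(c(small_list = p_small, large_list = p_large))
```

What matters is the proportion of term-annotated genes in the list relative to the background, not the length of the list on its own.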


Edited 2020-06-11: Thanks to the comment by Kevin Blighe, I am adding more information below.

Acan.OrgDb is the object I loaded through AnnotationHub, because my target species, Acanthamoeba castellanii, is not a model organism.

hub <- AnnotationHub::AnnotationHub()
amoeba <- AnnotationHub::query(hub, "Acanthamoeba castellanii")
# title                                                       
# AH65301 | Acanthamoeba castellanii str. Neff transcript information   
# AH73987 | Transcript information for Acanthamoeba castellanii str Neff
# AH74626 | Transcript information for Acanthamoeba castellanii str Neff
# AH81410 | org.Acanthamoeba_castellanii_Neff_strain.eg.sqlite          
# AH81411 | org.Acanthamoeba_castellanii_str._Neff.eg.sqlite            
# AH81412 | org.Acanthamoeba_castellanii_strain_Neff.eg.sqlite 

I chose AH81410 because its Db type is OrgDb.

Acan.OrgDb <- hub[["AH81410"]]
> Acan.OrgDb
OrgDb object:
| DBSCHEMAVERSION: 2.1
| DBSCHEMA: NOSCHEMA_DB
| ORGANISM: Acanthamoeba castellanii_Neff_strain
| SPECIES: Acanthamoeba castellanii_Neff_strain
| CENTRALID: GID
| Taxonomy ID: 1257118
| Db type: OrgDb
| Supporting package: AnnotationDbi

From columns(Acan.OrgDb), we can see that it supports ENTREZID.

> columns(Acan.OrgDb)
[1] "ACCNUM"      "ALIAS"       "CHR"         "ENTREZID"    "EVIDENCE"    "EVIDENCEALL" "GENENAME"    "GID"         "GO"          "GOALL"      
[11] "ONTOLOGY"    "ONTOLOGYALL" "PMID"        "REFSEQ"      "SYMBOL"     

Then I prepared my list of significant genes with Entrez Gene IDs. The table was generated by combining the ORFID, locus_tag, and annotation from files downloaded from NCBI.

Here, the GeneID column holds the IDs in ENTREZID format.

> up_gene.4vs1
    Locus_tag    ORFID Name      Accession Start  Stop Strand   GeneID Locus Protein_product Length                          Protein_Name
1 ACA1_000790 gene5490   Un NW_004457578.1  5136  5699      + 14921342    NA  XP_004343320.1    187      hypothetical protein ACA1_000790
2 ACA1_001250 gene2057   Un NW_004457658.1  4004 11317      + 14924768    NA  XP_004353303.1   1925      hypothetical protein ACA1_001250
3 ACA1_001280 gene2060   Un NW_004457658.1 17392 18733      - 14924773    NA  XP_004353305.1    258      hypothetical protein ACA1_001280
4 ACA1_001300 gene2062   Un NW_004457658.1 20701 23681      - 14924770    NA  XP_004353306.1    599 fucose1-phosphate guanylyltransferase

You may also notice the hypothetical proteins, which could blur the result. Although 691 entries are hypothetical proteins, 1049 of the 1740 entries remain.

So I'm a little confused that enrichGO reports no enriched GO terms.
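Since the enrichment test only uses genes that carry a GO annotation in the OrgDb, one way to quantify this is to count how many of the IDs have any GO term at all. A minimal sketch, assuming a mapping table with ENTREZID and GO columns; the helper `go_coverage` and the toy GO assignments are made up for illustration, and the guarded part only runs if Acan.OrgDb and up_gene.4vs1 exist in the session:

```r
# Count query IDs that have at least one GO annotation in a mapping table.
# `mapping` needs the columns ENTREZID and GO (NA in GO = no annotation).
go_coverage <- function(ids, mapping) {
  ids <- unique(as.character(ids))
  annotated <- unique(as.character(mapping$ENTREZID[!is.na(mapping$GO)]))
  found <- ids %in% annotated
  list(n_query = length(ids),
       n_annotated = sum(found),
       missing = ids[!found])
}

# Toy table: the IDs come from the listing above, the GO terms are invented.
toy <- data.frame(
  ENTREZID = c("14921342", "14924768", "14924770"),
  GO = c(NA, "GO:0008150", "GO:0009058"),
  stringsAsFactors = FALSE
)
cov_toy <- go_coverage(c(14921342L, 14924768L, 14924773L, 14924770L), toy)
str(cov_toy)

# With the real objects, the table could be built with AnnotationDbi::select().
# Note that select() stops with an error if none of the keys are valid.
if (exists("Acan.OrgDb") && exists("up_gene.4vs1")) {
  real <- AnnotationDbi::select(Acan.OrgDb,
                                keys = as.character(up_gene.4vs1$GeneID),
                                columns = c("GO", "ONTOLOGY"),
                                keytype = "ENTREZID")
  str(go_coverage(up_gene.4vs1$GeneID, real))
}
```

If only a small share of the 1740 IDs turns out to be annotated, the effective gene list is much shorter than it looks.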

type(up_gene.4vs1)
# [1] "character"
type(up_gene.4vs1$GeneID)
# [1] "integer"
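An empty result can mean either that terms were tested and none passed the cutoffs, or that no input ID mapped to any GO term. One way to tell these apart is to rerun the test with both cutoffs set to 1. A hedged sketch: the wrapper `enrich_unfiltered` is a hypothetical helper, and it assumes clusterProfiler is installed and that Acan.OrgDb and up_gene.4vs1 exist in the session:

```r
# Rerun enrichGO() without filtering so that every tested term is returned.
enrich_unfiltered <- function(genes, orgdb, ont = "BP") {
  res <- clusterProfiler::enrichGO(gene = as.character(genes),
                                   OrgDb = orgdb,
                                   keyType = "ENTREZID",
                                   ont = ont,
                                   pvalueCutoff = 1,
                                   qvalueCutoff = 1)
  # enrichGO() returns NULL when none of the input IDs can be mapped
  if (is.null(res)) return(data.frame())
  as.data.frame(res)
}

if (exists("Acan.OrgDb") && exists("up_gene.4vs1")) {
  all_terms <- enrich_unfiltered(up_gene.4vs1$GeneID, Acan.OrgDb)
  print(nrow(all_terms))
  if (nrow(all_terms) > 0) {
    # Smallest adjusted p-values first
    top <- all_terms[order(all_terms$p.adjust), ]
    print(head(top[, c("ID", "Description", "GeneRatio", "BgRatio", "p.adjust")]))
  }
}
```

Zero rows would point to an ID or annotation mapping problem. Rows whose adjusted p-values all sit above the cutoffs would point to a genuine lack of enrichment.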

Could you give me some advice? Thank you in advance.


Hi, you need to help us so that we can properly diagnose the problem. For example, please show the contents of up_gene.4vs1$GeneID. Also, how did you load or install Acan.OrgDb? Thank you.


Thank you. I have edited my question.


How is up_gene.4vs1$GeneID encoded? As a factor? Are the IDs definitely Entrez IDs?


Thanks for your comment. The type is integer, and I checked some of the GeneIDs directly against this page (Protein Table for Acanthamoeba castellanii str. Neff). There seems to be no problem here.

https://www.ncbi.nlm.nih.gov/genome/browse/#!/proteins/278/28653%7CAcanthamoeba%20castellanii%20str.%20Neff/14917167
