Hello everyone, I'm trying to create a custom annotation package for Locusta migratoria and then run an enrichment analysis with clusterProfiler. However, running AnnotationForge::makeOrgPackage() returns the following error:
Populating genes table:
genes table filled
Populating gene_info table:
gene_info table filled
Populating go table:
go table filled
table metadata filled
Error in makeOrgDbFromDataFrames(data, tax_id, genus, species, dbFileName, :
'goTable' GO Ids must be formatted like 'GO:XXXXXXX'
My code is:
AnnotationForge::makeOrgPackage(gene_info=gene_info,
go=gene2go,
maintainer='liXY <XYZ_zz@163.com>',
author='liXY',
outputDir="E:/reKEGG/LocalOrgdb/20220416",
tax_id=7004,
genus='Locusta',
species='migratoria',
goTable="go",
version="1.0")
The two data frames I used:
> head(gene2go)
GID GO EVIDENCE
1 LOCMI17615 GO:0000003 IEA
2 LOCMI17615 GO:0003674 IEA
3 LOCMI17615 GO:0003824 IEA
4 LOCMI17615 GO:0005575 IEA
5 LOCMI17615 GO:0005622 IEA
6 LOCMI17615 GO:0005623 IEA
> head(gene_info)
GID Gene_Name
1 LOCMI17615 Methyltransf_11,Methyltransf_12,Methyltransf_23,Methyltransf_25,Methyltransf_31
2 LOCMI17599 Lig_chan,Lig_chan-Glu_bd
3 LOCMI02434 Methyltransf_11,Methyltransf_12,Methyltransf_23,Methyltransf_25,Methyltransf_31
4 LOCMI05868 N-SET,RRM_1,SET,SET_assoc
5 LOCMI15917 UCR_hinge
6 LOCMI08045 EZH2_WD-Binding,SET
I don't understand why this error occurs, because my GO IDs match the required format 'GO:XXXXXXX'.
Not an answer to your question, but since you indicated that your ultimate goal is to use the OrgDb with clusterProfiler, it is good to know that 'self-made' OrgDbs are (apparently) not (yet?) compatible with clusterProfiler. This is due to the lack of the GOALL column in these OrgDbs. See here: Use of clusterProfiler : Error in testForValidKeytype(x, keytype) [Also note that you may want to check whether a 'prefab' OrgDb is available through the AnnotationHub!]
Having said this, making use of your GO-to-gene mapping is certainly possible with clusterProfiler, but then you will need to use the generic enricher() function and its TERM2GENE argument. See for example here: clusterProfiler-GO enrichment Error (option 2).
Hi Guido, I have previously used clusterProfiler with my custom OrgDb to complete a GO enrichment analysis. Because of the recent update of the EGGNOG database (to V2), I would like to rebuild the OrgDb for this species. The OrgDb I built before has a GOALL column. The code I used this time is the same as before, but I don't know why the above error appears. Regarding the TERM2GENE argument you mentioned, I used the following code to finish the GO and KEGG enrichment analysis, and plotted the results with barplot().
Have you checked the entire column of GO ids?
Hi, I checked the entire column with your code. Here is the result:
That means you have GO terms that don't start with GO:, which is why you get the error.
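A check like the one discussed here can be sketched in base R as follows. This is only an illustration on a toy stand-in for gene2go, not necessarily the code used in the thread. The pattern assumes the usual format of 'GO:' followed by seven digits, and the check inside AnnotationForge may be looser than that.

```r
# Toy stand-in for the gene2go data frame (assumption: same three columns as
# in the question); the "-" row mimics a missing annotation.
gene2go <- data.frame(
  GID = c("LOCMI17615", "LOCMI17615", "LOCMI17599"),
  GO = c("GO:0000003", "GO:0003674", "-"),
  EVIDENCE = "IEA",
  stringsAsFactors = FALSE
)

# TRUE where the ID is "GO:" followed by exactly seven digits.
is_valid_go <- grepl("^GO:[0-9]{7}$", gene2go$GO)

# Count passing and failing rows, then list the offending values.
table(is_valid_go)
bad_rows <- gene2go[!is_valid_go, ]
unique(bad_rows$GO)
```

Looking at the whole column rather than head() matters because the malformed values may sit far below the first few rows.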
After I got the annotation file from EGGNOG, I used the following code to extract the GO annotation columns and then split them into rows with separate_rows() to get the gene2go table mentioned above. Could you please check whether there is something wrong with this way of handling it? I don't see why I am not getting only GO terms that start with GO:.
filter(!is.na(GO)) will not filter out something like row 8 of emapper, which is not NA, because the "-" is treated as a value. Filter out any GO that does not start with GO.
Thank you so much. The "-" is the reason for the error. I didn't double-check the filtered data, which was obviously a stupid mistake. (X﹏X) This problem has been bothering me for days. Thank you again.
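For readers hitting the same error, the fix described above can be sketched in base R. The miniature emapper table and its column names (query, GOs) are assumptions for illustration, and strsplit() stands in for the separate_rows() step used in the thread. The point is that a pattern match, unlike an NA test, also removes the "-" placeholder.

```r
# Hypothetical miniature of an eggNOG-mapper table; column names are assumed.
emapper <- data.frame(
  query = c("LOCMI17615", "LOCMI17599", "LOCMI02434"),
  GOs = c("GO:0000003,GO:0003674", "-", NA),
  stringsAsFactors = FALSE
)

# Split the comma-separated GO strings into one row per gene/GO pair.
go_list <- strsplit(emapper$GOs, ",", fixed = TRUE)
gene2go <- data.frame(
  GID = rep(emapper$query, lengths(go_list)),
  GO = unlist(go_list),
  EVIDENCE = "IEA",
  stringsAsFactors = FALSE
)

# Keep only well-formed IDs; grepl() returns FALSE for NA, so this drops
# both the NA row and the "-" placeholder in one step.
gene2go <- gene2go[grepl("^GO:[0-9]{7}$", gene2go$GO), ]
rownames(gene2go) <- NULL
```

With dplyr, the equivalent of the last step would be filter(grepl("^GO:[0-9]{7}$", GO)) in place of filter(!is.na(GO)).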