Hello everyone, I'm trying to create a custom annotation for Locusta migratoria and then run an enrichment analysis with clusterProfiler. However, running AnnotationForge::makeOrgPackage() fails with the following error:
Populating genes table:
genes table filled
Populating gene_info table:
gene_info table filled
Populating go table:
go table filled
table metadata filled
Error in makeOrgDbFromDataFrames(data, tax_id, genus, species, dbFileName,  : 
  'goTable' GO Ids must be formatted like 'GO:XXXXXXX'
My code is:
AnnotationForge::makeOrgPackage(gene_info=gene_info,
                                go=gene2go,
                                maintainer='liXY <XYZ_zz@163.com>',
                                author='liXY',
                                outputDir="E:/reKEGG/LocalOrgdb/20220416",
                                tax_id=7004,
                                genus='Locusta',
                                species='migratoria',
                                goTable="go",
                                version="1.0")
The 2 data frames I used:
> head(gene2go)
         GID         GO EVIDENCE
1 LOCMI17615 GO:0000003      IEA
2 LOCMI17615 GO:0003674      IEA
3 LOCMI17615 GO:0003824      IEA
4 LOCMI17615 GO:0005575      IEA
5 LOCMI17615 GO:0005622      IEA
6 LOCMI17615 GO:0005623      IEA
> head(gene_info)
         GID                                                                       Gene_Name
1 LOCMI17615 Methyltransf_11,Methyltransf_12,Methyltransf_23,Methyltransf_25,Methyltransf_31
2 LOCMI17599                                                        Lig_chan,Lig_chan-Glu_bd
3 LOCMI02434 Methyltransf_11,Methyltransf_12,Methyltransf_23,Methyltransf_25,Methyltransf_31
4 LOCMI05868                                                       N-SET,RRM_1,SET,SET_assoc
5 LOCMI15917                                                                       UCR_hinge
6 LOCMI08045                                                             EZH2_WD-Binding,SET
I don't understand why this error occurs, because my GO IDs match the required format 'GO:XXXXXXX'.

Have you checked the entire column of GO ids?
Hi, I checked the entire column with your code. Here is the result:
That means you have GO terms that don't start with "GO:", which is why you get the error.
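The code used for this check did not survive in the thread. A minimal sketch of one way to check the whole column in base R is below. The toy gene2go data frame and the names is_valid_go, n_bad and bad_values are made up for illustration, and the regular expression is an assumption based on the 'GO:XXXXXXX' wording of the error message, not necessarily the exact test AnnotationForge runs.

```r
# Toy stand-in for the real gene2go data frame (illustrative values only)
gene2go <- data.frame(
  GID = c("LOCMI17615", "LOCMI17615", "LOCMI17599", "LOCMI02434"),
  GO = c("GO:0000003", "GO:0003674", "-", NA),
  EVIDENCE = "IEA",
  stringsAsFactors = FALSE
)

# TRUE only for values that are exactly "GO:" followed by seven digits
# (assumed pattern; NA values come out as FALSE)
is_valid_go <- !is.na(gene2go$GO) & grepl("^GO:[0-9]{7}$", gene2go$GO)

# How many rows fail, and which distinct values cause the failures
n_bad <- sum(!is_valid_go)
bad_values <- unique(gene2go$GO[!is_valid_go])

print(n_bad)
print(bad_values)
```

Looking at the distinct offending values, rather than only head(), is what exposes placeholders such as "-" that sit further down the table.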
After I got the annotation file from EGGNOG, I used the following code to extract the columns of GO annotations and then split the columns by using separate_rows() to get the previously mentioned gene2go. Please help me to see if there is something wrong with this way of handling it?

Not getting GO terms that start with GO:

filter(!is.na(GO)) will not filter out something like row 8 of emapper, which is not NA, because the "-" is treated as a value. Filter out any GO that does not start with GO.

Thank you so much. “-” is the reason for the error. I didn't double-check the filtered data, which was obviously a stupid mistake. (X﹏X) This problem has been bothering me for days. Thank you again.
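For anyone hitting the same problem, here is a minimal base R sketch of the kind of filtering suggested above. It is not the original poster's code (that was not preserved here). The toy emapper table and its column names query and GOs are assumptions for illustration, as is the strict 'GO:' plus seven digits pattern.

```r
# Toy stand-in for an eggNOG-mapper table; column names are hypothetical
emapper <- data.frame(
  query = c("LOCMI17615", "LOCMI17599", "LOCMI02434"),
  GOs = c("GO:0000003,GO:0003674", "-", "GO:0003824"),
  stringsAsFactors = FALSE
)

# Split the comma-separated GO strings into one character vector per gene
go_list <- strsplit(emapper$GOs, ",", fixed = TRUE)

# Expand to long format: one row per gene/GO pair
gene2go <- data.frame(
  GID = rep(emapper$query, lengths(go_list)),
  GO = trimws(unlist(go_list)),
  EVIDENCE = "IEA",
  stringsAsFactors = FALSE
)

# Keep only well-formed IDs; this drops "-" as well as NA and empty strings
gene2go <- gene2go[grepl("^GO:[0-9]{7}$", gene2go$GO), ]
gene2go <- unique(gene2go)
rownames(gene2go) <- NULL

print(gene2go)
```

Filtering on a positive pattern instead of !is.na() means any placeholder the annotation tool uses for "no annotation" is removed, whatever character it happens to be.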
Not an answer to your question, but since you indicated that your ultimate goal is to use the OrgDb with clusterProfiler, it is good to know that 'self-made' OrgDbs are (apparently) not (yet?) compatible with clusterProfiler. This is due to the lack of the GOALL column in the OrgDbs. See here: Use of clusterProfiler : Error in testForValidKeytype(x, keytype) [Also note that you may want to check whether a 'prefab' OrgDb is available through the AnnotationHub!]

Having said this, making use of your GO-to-gene mapping is certainly possible with clusterProfiler, but then you will need to make use of the generic enricher() function and the argument TERM2GENE. See for example here: clusterProfiler-GO enrichment Error (option 2).

Hi Guido, I have used clusterProfiler and my custom OrgDb to complete GO enrichment analysis, and due to the recent update of the EGGNOG database (to V2), I would like to rebuild a new OrgDb for this species. The former OrgDb I built has a GOALL column. The code used this time is still the same as before, but I don't know why the above error appears.

Regarding your mention of the argument TERM2GENE, I used the following code to finish the GO and KEGG enrichment analysis. Plots with barplot().
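The enrichment code referred to here was not preserved in the thread. As a rough sketch of the TERM2GENE route only, the example below builds the two-column mapping that enricher() takes (term first, gene second) from a toy gene2go table. The gene/GO pairings and the genes_of_interest vector are invented for illustration, and minGSSize = 1 is set only because the toy data is tiny. The enricher() call is skipped when clusterProfiler is not installed.

```r
# Toy gene-to-GO mapping; the pairings are illustrative, not real annotations
gene2go <- data.frame(
  GID = c("LOCMI17615", "LOCMI17599", "LOCMI02434",
          "LOCMI05868", "LOCMI15917", "LOCMI08045"),
  GO = c("GO:0003824", "GO:0003824", "GO:0003824",
         "GO:0005622", "GO:0005622", "GO:0005622"),
  stringsAsFactors = FALSE
)

# enricher() reads TERM2GENE as: column 1 = term, column 2 = gene
term2gene <- gene2go[, c("GO", "GID")]

# Hypothetical list of genes of interest (e.g. differentially expressed genes)
genes_of_interest <- c("LOCMI17615", "LOCMI17599")

if (requireNamespace("clusterProfiler", quietly = TRUE)) {
  res <- clusterProfiler::enricher(
    gene = genes_of_interest,
    TERM2GENE = term2gene,
    pvalueCutoff = 0.05,
    minGSSize = 1  # lowered only because the toy gene sets are very small
  )
  # res may be NULL or empty when nothing passes the cutoffs
  print(res)
}
```

With real data, the same term2gene table can be reused for any gene list, and a TERM2NAME table can be passed as well if readable term descriptions are wanted in the output.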