Hello everyone, I'm trying to create a custom annotation package for Locusta migratoria and then run an enrichment analysis with clusterProfiler. However, running AnnotationForge::makeOrgPackage() returns the following error:
Populating genes table:
genes table filled
Populating gene_info table:
gene_info table filled
Populating go table:
go table filled
table metadata filled
Error in makeOrgDbFromDataFrames(data, tax_id, genus, species, dbFileName, :
'goTable' GO Ids must be formatted like 'GO:XXXXXXX'
My code is:
AnnotationForge::makeOrgPackage(gene_info=gene_info,
                                go=gene2go,
                                maintainer='liXY <XYZ_zz@163.com>',
                                author='liXY',
                                outputDir="E:/reKEGG/LocalOrgdb/20220416",
                                tax_id=7004,
                                genus='Locusta',
                                species='migratoria',
                                goTable="go",
                                version="1.0")
The two data frames I used:
> head(gene2go)
GID GO EVIDENCE
1 LOCMI17615 GO:0000003 IEA
2 LOCMI17615 GO:0003674 IEA
3 LOCMI17615 GO:0003824 IEA
4 LOCMI17615 GO:0005575 IEA
5 LOCMI17615 GO:0005622 IEA
6 LOCMI17615 GO:0005623 IEA
> head(gene_info)
GID Gene_Name
1 LOCMI17615 Methyltransf_11,Methyltransf_12,Methyltransf_23,Methyltransf_25,Methyltransf_31
2 LOCMI17599 Lig_chan,Lig_chan-Glu_bd
3 LOCMI02434 Methyltransf_11,Methyltransf_12,Methyltransf_23,Methyltransf_25,Methyltransf_31
4 LOCMI05868 N-SET,RRM_1,SET,SET_assoc
5 LOCMI15917 UCR_hinge
6 LOCMI08045 EZH2_WD-Binding,SET
I don't understand why this error occurs, because my GO IDs appear to match the required format 'GO:XXXXXXX'.
Have you checked the entire column of GO ids?
Hi, I checked the entire column with your code. Here is the result:
That means you have GO terms that don't start with "GO:", which is why you get the error.
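A check along these lines lists every value that would trip the validation. This is only a minimal sketch in base R: the toy gene2go data frame is made up, and the pattern ("GO:" followed by seven digits) is an assumption drawn from the 'GO:XXXXXXX' wording of the error message, not necessarily the exact test the package runs.

```r
# Toy stand-in for the real gene2go table (hypothetical values).
gene2go <- data.frame(
  GID      = c("LOCMI17615", "LOCMI17615", "LOCMI17599", "LOCMI02434"),
  GO       = c("GO:0000003", "GO:0003674", "-", NA),
  EVIDENCE = "IEA",
  stringsAsFactors = FALSE
)

# TRUE only for strings shaped like "GO:" followed by seven digits.
# grepl() returns FALSE for NA, so missing values are flagged as well.
is_valid_go <- grepl("^GO:[0-9]{7}$", gene2go$GO)

# Rows that do not look like GO IDs.
bad_rows <- gene2go[!is_valid_go, ]
print(bad_rows)

# Tally of the offending values, including NA.
print(table(bad_rows$GO, useNA = "ifany"))
```

If bad_rows is empty, the GO column is at least well formed; anything listed there needs to be removed or repaired before building the package.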
After I got the annotation file from EGGNOG, I used the following code to extract the column of GO annotations and then split it with separate_rows() to get the previously mentioned gene2go. Could you please check whether there is something wrong with this way of handling it?

You are not getting only GO terms that start with "GO:". filter(!is.na(GO)) will not filter out something like row 8 of emapper, which is not NA, because the "-" is treated as a value. Filter out any GO entry that does not start with "GO".

Thank you so much. “-” is the reason for the error. I didn't double-check the filtered data, which was obviously a stupid mistake. (X﹏X) This problem has been bothering me for days. Thank you again.
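For later readers, the fix discussed in this exchange can be sketched roughly as follows. The toy emapper table, its column names (query, GOs), the comma separator, and the constant "IEA" evidence code are assumptions for illustration; the real eggNOG-mapper output and the original poster's code may differ.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for the eggNOG-mapper annotation table (hypothetical values).
emapper <- data.frame(
  query = c("LOCMI17615", "LOCMI17599", "LOCMI02434"),
  GOs   = c("GO:0000003,GO:0003674", "-", NA),
  stringsAsFactors = FALSE
)

gene2go <- emapper %>%
  select(GID = query, GO = GOs) %>%
  # One row per GO term.
  separate_rows(GO, sep = ",") %>%
  # Keep well-formed GO IDs only. This drops both NA and the "-" placeholder,
  # which filter(!is.na(GO)) alone would let through.
  filter(grepl("^GO:[0-9]{7}$", GO)) %>%
  # Assumed evidence code, mirroring the EVIDENCE column shown earlier.
  mutate(EVIDENCE = "IEA") %>%
  distinct()

print(gene2go)
```

Filtering on the pattern rather than on NA means any other placeholder the annotation tool might emit is removed as well.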
Not an answer to your question, but since you indicated that your ultimate goal is to use the OrgDb with clusterProfiler, it is good to know that 'self-made' OrgDbs are (apparently) not (yet?) compatible with clusterProfiler. This is due to the lack of the GOALL column in these OrgDbs. See here: Use of clusterProfiler : Error in testForValidKeytype(x, keytype) [Also note that you may want to check whether a 'prefab' OrgDb is available through the AnnotationHub!]

Having said this, making use of your GO-to-gene mapping is certainly possible with clusterProfiler, but then you will need to use the generic enricher() function and its TERM2GENE argument. See for example here: clusterProfiler-GO enrichment Error (option 2).

Hi Guido, I have used clusterProfiler and my custom OrgDb to complete a GO enrichment analysis, and because of the recent update of the EGGNOG database (to V2), I would like to rebuild a new OrgDb for this species. The former OrgDb I built has a GOALL column. The code used this time is still the same as before, but I don't know why the above error appears.

Regarding your mention of the TERM2GENE argument, I used the following code to finish the GO and KEGG enrichment analysis, with plots made by barplot().
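For anyone taking the TERM2GENE route mentioned in this thread, a minimal sketch might look like the one below. It is not the code either poster used: the toy gene2go table, the genes_of_interest vector, and the relaxed cutoffs are all made up for illustration, and the enricher() call is guarded so the snippet still runs when clusterProfiler is not installed.

```r
# Toy GO-to-gene table in the same shape as gene2go (hypothetical values).
gene2go <- data.frame(
  GID = c("LOCMI17615", "LOCMI17599", "LOCMI02434",
          "LOCMI05868", "LOCMI15917", "LOCMI08045"),
  GO  = c("GO:0003824", "GO:0003824", "GO:0003824",
          "GO:0005622", "GO:0005622", "GO:0005623"),
  stringsAsFactors = FALSE
)

# enricher() expects TERM2GENE with the term in column 1 and the gene in column 2.
term2gene <- unique(gene2go[, c("GO", "GID")])

# Hypothetical list of genes of interest.
genes_of_interest <- c("LOCMI17615", "LOCMI17599", "LOCMI02434")

if (requireNamespace("clusterProfiler", quietly = TRUE)) {
  res <- clusterProfiler::enricher(
    gene         = genes_of_interest,
    TERM2GENE    = term2gene,
    pvalueCutoff = 1,   # relaxed only because the toy data are tiny
    qvalueCutoff = 1,
    minGSSize    = 1
  )
  if (!is.null(res)) print(as.data.frame(res))
}
```

The key step is the column order of term2gene; a table with the gene ID first would silently treat genes as terms.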