Question: errors for goseq makeOrgPackageFromNCBI in AnnotationForge
3.6 years ago by
United States
songeric11070 wrote:

I am trying to build an annotation database for Mycoplasma hyorhinis (genus = "Mycoplasma", species = "hyorhinis"), but I get an error. Could you please help?

makeOrgPackageFromNCBI(version = "0.0.1", author = "me", maintainer = "me <me@mine.org>", outputDir = ".", tax_id = "1129369", genus = "Mycoplasma", species = "hyorhinis", NCBIFilesDir = ".")

cache will try to rebuild once per day.
Getting data for gene2pubmed.gz
extracting only data for our organism from : gene2pubmed
Getting data for gene2accession.gz
extracting only data for our organism from : gene2accession
Getting data for gene2refseq.gz
extracting only data for our organism from : gene2refseq
Getting data for gene_info.gz
extracting only data for our organism from : gene_info
Getting data for gene2go.gz
extracting only data for our organism from : gene2go
Error in unique(rbind(as.matrix(aliasDat), as.matrix(symbolDat))) :
error in evaluating the argument 'x' in selecting a method for function 'unique': Error in .Method(..., deparse.level = deparse.level) :
number of columns of matrices must match (see arg 2)

modified 3.5 years ago by James W. MacDonald (50k) • written 3.6 years ago by songeric11070
Answer: errors for goseq makeOrgPackageFromNCBI in AnnotationForge
3.6 years ago by
United States
James W. MacDonald (50k) wrote:

Looks like there aren't any GO data for that organism:

zcat /data/tmp2/gene2go.gz | awk '{if($1 == 1129369) print$0}' | wc -l
0

Whereas something like Rattus norvegicus (tax ID 10116) is there in spades:

zcat /data/tmp2/gene2go.gz | awk '{if($1 == 10116) print$0}' | wc -l
321655

So you might need a different source for your GO data.
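
The check above can be wrapped in a small reusable function and run before calling makeOrgPackageFromNCBI. This is only a sketch: the function name `count_taxon_rows` is made up for illustration, and it assumes gene2go.gz is tab-delimited with the tax ID in the first column and a header line starting with `#`.

```shell
# count_taxon_rows FILE TAX_ID   (hypothetical helper, not part of any package)
# Prints how many rows of a gzipped, tab-delimited gene2go-style file
# carry TAX_ID in the first column. Lines starting with '#' are skipped.
count_taxon_rows() {
    gzip -dc "$1" |
        awk -F '\t' -v id="$2" '!/^#/ && $1 == id { n++ } END { print n + 0 }'
}
```

For example, `count_taxon_rows /data/tmp2/gene2go.gz 1129369` should print 0 if the organism has no GO rows, in which case a different GO source is needed.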

FYI, AnnotationForge has been modified (in both release and devel) to throw a more informative error message:

...
Getting data for gene2go.gz
extracting only data for our organism from : gene2go
Error in prepareDataFromNCBI(tax_id, NCBIFilesDir, outputDir, rebuildCache) :
no information found for species with tax id 1129369

Valerie

Answer: errors for goseq makeOrgPackageFromNCBI in AnnotationForge
3.6 years ago by
United States
songeric11070 wrote:

Do you know how to link to the DAVID database? They have some similar species there.

There is DAVIDquery, but I have no personal experience with either DAVID or DAVIDquery.

Answer: errors for goseq makeOrgPackageFromNCBI in AnnotationForge
3.5 years ago by
United States
songeric11070 wrote:

Could someone tell me how to make a database for genus = "Mycoplasma", species = "hyorhinis" that can be used in goseq? Thank you.

Answer: errors for goseq makeOrgPackageFromNCBI in AnnotationForge
3.5 years ago by
United States
James W. MacDonald (50k) wrote:

There is very limited information for this species, at least as far as I can tell, though I don't deal with prokaryotes much at all. EBI has some data at their UniProt site that you could download and parse into the correct format for goseq. But do note that how you would do that is well beyond anything you could expect to be answered at this support site. Parsing arbitrary text files is something you will probably need to learn, but that's not what this site is for. My preferred weapon for figuring out how to do things that I don't already know is Google, because it's 2015 and that's how we do things these days.

The goseq vignette tells you what you need for unsupported organisms, and the link to UniProt that I supply will give you the starting format for the GO data. So you can see what you have to start with and what you need, and you just need to figure out how to get from here to there.
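
As a rough starting point for that parsing step, here is a minimal sketch that assumes the downloaded annotation file is in GAF 2.x format (tab-delimited, comment lines starting with `!`, object ID in column 2, qualifier in column 4, GO ID in column 5, taxon in column 13). The function name `gaf_to_gene2cat` is made up for illustration, and dropping `NOT`-qualified annotations is a choice made here, not something the thread prescribes.

```shell
# gaf_to_gene2cat GAF_FILE TAX_ID   (hypothetical helper)
# Prints unique "object ID<TAB>GO ID" pairs for one taxon from a GAF 2.x file.
gaf_to_gene2cat() {
    awk -F '\t' -v tax="taxon:$2" '
        /^!/ { next }               # skip GAF comment/header lines
        { split($13, t, "|") }      # column 13 may hold "taxon:A|taxon:B"
        # keep rows for our taxon, with a GO ID, that are not negated
        t[1] == tax && $5 ~ /^GO:/ && $4 !~ /NOT/ { print $2 "\t" $5 }
    ' "$1" | sort -u
}
```

The resulting two-column table could then be read into R (for example with `read.delim(..., header = FALSE)`) and, as far as I understand the goseq vignette, supplied as the gene-to-category mapping for an unsupported organism. The IDs in column 2 are UniProt accessions in UniProt's own files, so they would still have to be matched to whatever gene identifiers the differential expression results use, and goseq also needs gene lengths from somewhere.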