Question: errors for goseq makeOrgPackageFromNCBI in AnnotationForge
songeric1107 (United States) wrote 3.3 years ago:

I am trying to make a db for Mycoplasma (genus = "Mycoplasma", species = "hyorhinis"), but I get an error. Could you please help?

makeOrgPackageFromNCBI(version = "0.0.1", author = "me", maintainer = "me <me@mine.org>", outputDir = ".", tax_id = "1129369", genus = "Mycoplasma", species = "hyorhinis", NCBIFilesDir = ".")

 

cache will try to rebuild once per day.
Getting data for gene2pubmed.gz
extracting only data for our organism from : gene2pubmed
Getting data for gene2accession.gz
extracting only data for our organism from : gene2accession
Getting data for gene2refseq.gz
extracting only data for our organism from : gene2refseq
Getting data for gene_info.gz
extracting only data for our organism from : gene_info
Getting data for gene2go.gz
extracting only data for our organism from : gene2go
Error in unique(rbind(as.matrix(aliasDat), as.matrix(symbolDat))) : 
  error in evaluating the argument 'x' in selecting a method for function 'unique': Error in .Method(..., deparse.level = deparse.level) : 
  number of columns of matrices must match (see arg 2)

 

Answer: errors for goseq makeOrgPackageFromNCBI in AnnotationForge
James W. MacDonald (United States) wrote 3.3 years ago:

Looks like there aren't any GO data for that organism:

zcat /data/tmp2/gene2go.gz | awk '{if($1 == 1129369) print $0}' | wc -l
0

Whereas something like Rattus norvegicus (tax ID 10116) is there in spades:

zcat /data/tmp2/gene2go.gz | awk '{if($1 == 10116) print $0}' | wc -l
321655

So you might need a different source for your GO data.
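
The same check can be wrapped so it works for any tax ID. This is only a sketch: `count_go_rows` is a hypothetical helper name, and the sample rows are made up so the snippet runs without downloading gene2go.gz. It assumes the file is tab-delimited with the tax ID in the first column, as in the commands above.

```shell
#!/bin/sh
# Hypothetical helper: count gene2go rows for one taxonomy ID.
# $1 = gzipped gene2go-style file, $2 = taxonomy ID
count_go_rows() {
    gzip -dc < "$1" | awk -F '\t' -v id="$2" '$1 == id { n++ } END { print n + 0 }'
}

# Tiny made-up sample standing in for the real gene2go.gz.
sample=$(mktemp)
printf '#tax_id\tGeneID\tGO_ID\n10116\t24152\tGO:0005515\n10116\t24153\tGO:0005737\n10090\t11287\tGO:0005576\n' \
    | gzip -c > "$sample"

rat_rows=$(count_go_rows "$sample" 10116)
myco_rows=$(count_go_rows "$sample" 1129369)
echo "10116: $rat_rows, 1129369: $myco_rows"
rm -f "$sample"
```

A count of 0 for your tax ID means makeOrgPackageFromNCBI has no GO rows to work with for that organism.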


fyi, AnnotationForge has been modified (release and devel) to throw a more informative message:

...
Getting data for gene2go.gz
extracting only data for our organism from : gene2go
Error in prepareDataFromNCBI(tax_id, NCBIFilesDir, outputDir, rebuildCache) : 
  no information found for species with tax id 1129369

Valerie

 

Reply written 3.3 years ago by Valerie Obenchain
Answer: errors for goseq makeOrgPackageFromNCBI in AnnotationForge
songeric1107 (United States) wrote 3.3 years ago:

Do you know how to link to the DAVID database? They have some similar species there.


There is DAVIDquery, but I have no personal experience with either DAVID or DAVIDquery.
 

Reply written 3.3 years ago by James W. MacDonald
Answer: errors for goseq makeOrgPackageFromNCBI in AnnotationForge
songeric1107 (United States) wrote 3.3 years ago:

Could someone tell me how to make a database for genus = "Mycoplasma", species = "hyorhinis" that can be used in goseq? Thank you.

 

 

 

Answer: errors for goseq makeOrgPackageFromNCBI in AnnotationForge
James W. MacDonald (United States) wrote 3.3 years ago:

There is very limited information for this species, at least as far as I can tell. But I don't deal with prokaryotes much at all. EBI has some data at their UniProt site that you could download and parse into the correct format for goseq. But do note that how you would do that is well beyond anything you could expect to be answered at this support site. Parsing random text files is something you will probably need to learn, but that's not what this site is for. My preferred weapon for figuring out how to do things that I don't already know is Google, because it's 2015 and that's how we do these days.

The goseq vignette tells you what you need for unsupported organisms, and the link to UniProt that I supply will give you the starting format for the GO data, so you can see what you have to start with and what you need; you just need to figure out how to get there from here.
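
As a rough illustration of that parsing step, here is a sketch that reduces GO annotations to a two-column gene/GO table. It rests on assumptions not stated in this thread: that the downloaded file is in GAF 2.x format (tab-delimited, comment lines starting with `!`, DB Object ID in column 2, GO ID in column 5). `gaf_to_gene2cat` is a hypothetical helper name, and the sample records are invented and trimmed to nine columns.

```shell
#!/bin/sh
# Hypothetical helper: reduce a GAF file to unique "gene<TAB>GO ID" pairs.
# Assumes GAF 2.x layout: column 2 = DB Object ID, column 5 = GO ID.
gaf_to_gene2cat() {
    awk -F '\t' '!/^!/ && $2 != "" && $5 != "" { print $2 "\t" $5 }' "$1" | sort -u
}

# Made-up sample records (not real annotations), trimmed to nine columns.
gaf=$(mktemp)
{
    printf '!gaf-version: 2.2\n'
    printf 'UniProtKB\tP00001\tgeneA\t\tGO:0005524\tGO_REF:0000002\tIEA\t\tF\n'
    printf 'UniProtKB\tP00001\tgeneA\t\tGO:0005524\tGO_REF:0000043\tIEA\t\tF\n'
    printf 'UniProtKB\tP00001\tgeneA\t\tGO:0016020\tGO_REF:0000043\tIEA\t\tC\n'
    printf 'UniProtKB\tP00002\tgeneB\t\tGO:0005524\tGO_REF:0000002\tIEA\t\tF\n'
} > "$gaf"

pairs=$(gaf_to_gene2cat "$gaf")
n_pairs=$(printf '%s\n' "$pairs" | wc -l | tr -d ' ')
printf '%s\n' "$pairs"
echo "unique gene/GO pairs: $n_pairs"
rm -f "$gaf"
```

Redirected to a file, a table like this could be read into R (for example with read.delim) and passed as the `gene2cat` argument of goseq, provided the gene identifiers match the ones used in your differential expression results. The vignette covers the other inputs needed for unsupported organisms.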
