I am trying to retrieve various reference genomes of viruses held in GenBank, or more specifically NCBI's RefSeq database.
I am exploring the new package biomartr 0.7.0, which does just this, i.e. retrieves reference genomes.
Whilst this seems to work well for most taxa, it is not working for viruses.
Here is the code I used to compare a bacterial pathogen (Mycobacterium tuberculosis) with a virus (rabies virus):
is.genome.available(organism = "Mycobacterium tuberculosis", db = "refseq")
is.genome.available(organism = "Rabies lyssavirus", db = "refseq")
Both of these came back with the response "TRUE".
However, when I try fetching these two reference sequences using the following code:
Mtb.genome.refseq <- getGenome(db = "refseq", organism = "Mycobacterium tuberculosis", path = file.path("_ncbi_downloads","genomes"))
RABV.genome.refseq <- getGenome(db = "refseq", organism = "Rabies lyssavirus", path = file.path("_ncbi_downloads","genomes"))
The M. tuberculosis fetch is successful, but the rabies virus fetch fails with the following message:
----------> No reference genome or representative genome was found for 'Rabies lyssavirus'. Thus, download for this species has been omitted. Have you tried to specify 'reference = FALSE' ?
Adding the argument reference = FALSE does not solve the problem.
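A sketch of that retry, assuming reference is passed straight to getGenome() as the message suggests and everything else is left unchanged (the exact call may have differed):

```r
# Sketch only: assumes biomartr is installed and NCBI's servers are reachable.
library(biomartr)

# Same rabies virus call as above, with reference = FALSE added (as suggested
# by the message) so that genomes not flagged as reference/representative are
# also considered.
RABV.genome.refseq <- getGenome(db = "refseq",
                                organism = "Rabies lyssavirus",
                                reference = FALSE,
                                path = file.path("_ncbi_downloads", "genomes"))
```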
PS. I am running the program in RStudio 1.1.447, R 3.5.0, Windows 10.