Help creating BSgenome object for new canine and feline genomes?
Kate
United States


I would like to create a new BSgenome object for these two genome builds:

  1. Canine genome: canfam4, also called UU_Cfam_GSD_1.0
  2. Feline genome: fca126, also called F.catus_Fca126_mat1.0

Would someone be able to help me with this?

Best, Kate

BSgenome
shepherl
United States

Have you looked at BSgenomeForge for creating new BSgenome packages?

Thank you! I tried following the instructions for forgeBSgenomeDataPkgFromNCBI and ran into two issues. First, the download seemed to time out before the full FASTA file was retrieved.

    forgeBSgenomeDataPkgFromNCBI(assembly_accession="GCF_018350175.1",pkg_maintainer="myname <myemail>",destdir="./")

> Error in download.file(file_url, destfile, method, quiet) :  
>   download from '' failed  
> In addition: Warning messages:  
> 1: In download.file(file_url, destfile, method, quiet) :  
>   downloaded length 0 != reported length 0  
> 2: In download.file(file_url, destfile, method, quiet) :  
>   URL '': Timeout of 60 seconds was reached

To get around this, I instead downloaded the fasta myself using wget:


Which worked:

> --2024-02-22 20:17:15--
> Resolving (
>,, 2607:f220:41e:250::10, ... Connecting to (||:443...
> connected.  
> HTTP request sent, awaiting response... 200 OK  
> Length: 768361105 (733M) [application/x-gzip]  
> Saving to: 'GCF_018350175.1_F.catus_Fca126_mat1.0_genomic.fna.gz'  
> GCF_018350175.1_F.catus_Fca126_mat1.0_genomic.fna.gz  
> 100%[==========================================================================================================================>] 732.77M  4.28MB/s    in 6m 38s  
> 2024-02-22 20:23:54 (1.84 MB/s) - 'GCF_018350175.1_F.catus_Fca126_mat1.0_genomic.fna.gz' saved [768361105/768361105]
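
One way to confirm that a manual download like this is complete is to compare the size of the file on disk with the length the server reported (768361105 bytes in the wget log). This is only a minimal sketch in base R; `download_is_complete` is a hypothetical helper name, not something provided by BSgenomeForge.

```r
# Hypothetical helper: TRUE when the file exists and its size on disk
# equals the byte count reported by the server.
download_is_complete <- function(path, expected_bytes) {
  size <- file.size(path)  # NA when the file does not exist
  !is.na(size) && size == expected_bytes
}

# File name and expected length are taken from the wget log above.
fasta <- "GCF_018350175.1_F.catus_Fca126_mat1.0_genomic.fna.gz"
download_is_complete(fasta, 768361105)
```

A size match does not prove the archive is intact, but a mismatch is a quick sign that the transfer was cut short.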

Then I retried the initial command:

    forgeBSgenomeDataPkgFromNCBI(assembly_accession="GCF_018350175.1",pkg_maintainer="myname <myemail>",destdir="./")

Now it creates the package directory, but is not able to copy the "single_sequences.2bit" file, as it can't find it:

> Creating package in ./BSgenome.Fcatus.NCBI.F.catusFca126mat1.0  
> existing ./BSgenome.Fcatus.NCBI.F.catusFca126mat1.0 was removed.  
> Warning message:  
> In file.rename(filepath, to) :  
>   cannot rename file '/local/scratch/42331553.1.interactive/Rtmpy75ESJ/single_sequences.2bit' to './BSgenome.Fcatus.NCBI.F.catusFca126mat1.0/inst/extdata/single_sequences.2bit', reason 'No such file or directory'

Any suggestions on how to fix this?

Best, Kate

You can raise the default download timeout with options(), for example options(timeout=10000).
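
A possible way to apply this only around the download, so the session's usual timeout is restored afterwards, is sketched below. `with_long_timeout` is a hypothetical wrapper written for illustration, not part of BSgenomeForge. The forge call is the one from the question and is left commented out because it needs the package and network access.

```r
# Hypothetical helper: evaluate `expr` with a longer download timeout,
# then restore the previous setting even if `expr` fails.
with_long_timeout <- function(expr, seconds = 10000) {
  old <- options(timeout = max(seconds, getOption("timeout")))
  on.exit(options(old), add = TRUE)
  expr  # lazily evaluated here, after the timeout has been raised
}

# Usage (needs BSgenomeForge and network access, so left commented out):
# library(BSgenomeForge)
# with_long_timeout(
#   forgeBSgenomeDataPkgFromNCBI(assembly_accession = "GCF_018350175.1",
#                                pkg_maintainer = "myname <myemail>",
#                                destdir = "./")
# )
```

Calling options(timeout=10000) directly at the top of the session works just as well; the wrapper only limits how long the change stays in effect.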

Thank you, that was helpful! I was able to create and install the feline genome package. However, the tool that I am using to process my data runs on R version 3.5, and I cannot seem to load the BSgenome package there, since it was created with a newer version of R.

    Error in library("BSgenome.Fcatus.NCBI.F.catusFca126mat1.0") :
      there is no package called 'BSgenome.Fcatus.NCBI.F.catusFca126mat1.0'
    > library(devtools)
    > load_all("BSgenome.Fcatus.NCBI.F.catusFca126mat1.0")
    Loading BSgenome.Fcatus.NCBI.F.catusFca126mat1.0
    Error in BSgenome(organism = "Felis catus", common_name = NA, genome = "F.catus_Fca126_mat1.0",  :
      unused argument (genome = "F.catus_Fca126_mat1.0")
    In addition: Warning messages:
    1: In (function (dep_name, dep_ver = NA, dep_compare = NA)  :
      Need GenomeInfoDb >= 1.34.9 but loaded version is 1.16.0
    2: In (function (dep_name, dep_ver = NA, dep_compare = NA)  :
      Need BSgenome >= 1.66.1 but loaded version is 1.48.0

I tried re-installing the package in R 3.5, but got:

    withr::with_libpaths("./", install_local("BSgenome.Fcatus.NCBI.F.catusFca126mat1.0", force=TRUE))
    ERROR: this R is version 3.5.0, package  'BSgenome.Fcatus.NCBI.F.catusFca126mat1.0' requires R >= 4.2.0

I thought that maybe I could remake the BSgenome package in R 3.5, but when I try to install BSgenomeForge, I get:

    package 'BSgenomeForge' is not available (for R version 3.5.0)

Is there a workaround for this?

Thank you again!

Best, Kate

The R version you are using is six(!) years old. You should upgrade to the current versions of R and Bioconductor first.
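
To see how far an installation is from what the forged package needs, the version numbers can be compared directly. The sketch below uses the minimum versions quoted in the error messages earlier in the thread. `meets_requirement` is a hypothetical helper name, and the list of requirements is limited to the three that appear in those messages.

```r
# Minimum versions copied from the error messages earlier in the thread.
required <- c(R = "4.2.0", GenomeInfoDb = "1.34.9", BSgenome = "1.66.1")

# Hypothetical helper: TRUE when `installed` is at least `required`.
meets_requirement <- function(installed, required) {
  package_version(installed) >= package_version(required)
}

# Versions reported in the R 3.5 session above.
meets_requirement("3.5.0", required[["R"]])             # FALSE
meets_requirement("1.16.0", required[["GenomeInfoDb"]]) # FALSE
meets_requirement("1.48.0", required[["BSgenome"]])     # FALSE

# In a live session, the same check against the running R version:
meets_requirement(as.character(getRversion()), required[["R"]])
```

For installed packages, as.character(packageVersion("BSgenome")) can be passed as the first argument in the same way.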

Toulouse, France

Here is an (unrendered, sorry) .qmd file documenting how I did it some time ago.

I wrote it mostly for myself in case I need to do it again, but it might be helpful?


