error in forgeBSgenomeDataPkg
@jiazhou0116-24087

I am trying to forge a BSgenome data package following the instructions at https://www.bioconductor.org/packages/devel/bioc/vignettes/BSgenome/inst/doc/BSgenomeForge.pdf, but I get an error from the following command:

> forgeBSgenomeDataPkg('/Users/jiazhou/Box/methylation_analysis/msgbsR/BSgenome.Rhinella.marina/caneToad_seed')
Error in .readSeedFile(x, verbose = verbose) : 
  seed file '/Users/jiazhou/Box/methylation_analysis/msgbsR/BSgenome.Rhinella.marina/caneToad_seed' must have exactly 1 record

I wrote the seed file information in a plain-text file and used the write.dcf function to convert it to DCF format:

> CaneToad_seed <- read.delim("caneToad_seed")
> write.dcf(CaneToad_seed, file = "CaneToad_seed", append = FALSE, useBytes = FALSE,
          indent = 0.1 * getOption("width"),
          width = 0.9 * getOption("width"),
          keep.white = NULL)
> CaneToad_seed <- read.dcf("CaneToad_seed", all = TRUE)
> CaneToad_seed
   Description.Full.genome.sequences.for.Rhinella.marina..cane.toad..as.provided.by.UNSW..RM170330..
1                                                                           organism:Rhinella marina
2  Description.Full.genome.sequences.for.Rhinella.marina..cane.toad..as.provided.by.UNSW..RM170330.:
3                                                                              common_name:Cane toad
4  Description.Full.genome.sequences.for.Rhinella.marina..cane.toad..as.provided.by.UNSW..RM170330.:
5                           provider:UNSW\u2028provider_version:RM170330\u2028release_date:Mar. 2018
6  Description.Full.genome.sequences.for.Rhinella.marina..cane.toad..as.provided.by.UNSW..RM170330.:
7                                                         release_name:Rhinella marina (marine toad)
8  Description.Full.genome.sequences.for.Rhinella.marina..cane.toad..as.provided.by.UNSW..RM170330.:
9                                  source_url:https://www.ncbi.nlm.nih.gov/assembly/GCA_900303285.1/
10 Description.Full.genome.sequences.for.Rhinella.marina..cane.toad..as.provided.by.UNSW..RM170330.:
11                                                                 organism_biocview:Rhinella marina
12 Description.Full.genome.sequences.for.Rhinella.marina..cane.toad..as.provided.by.UNSW..RM170330.:
13                                                                  BSgenomeObjname: Rhinella marina
14 Description.Full.genome.sequences.for.Rhinella.marina..cane.toad..as.provided.by.UNSW..RM170330.:
15                     SrcDataFiles:.fna from https://www.ncbi.nlm.nih.gov/assembly/GCA_900303285.1/
16 Description.Full.genome.sequences.for.Rhinella.marina..cane.toad..as.provided.by.UNSW..RM170330.:
17          seqs_srcdir:/Users/jiazhou/Box/methylation_analysis/CaneToadRef/ncbi-genomes-2020-03-16/
18 Description.Full.genome.sequences.for.Rhinella.marina..cane.toad..as.provided.by.UNSW..RM170330.:
19                                                 seqfile_name:GCA_900303285.1_RM170330_genomic.fna

Just wondering whether you have any suggestions on this issue.

Thanks,

Jia

@james-w-macdonald-5106

Your seed file looks borked. It may be easiest to take one of the example seed files that ship with BSgenome and edit it to fit. When you read the seed file in with read.dcf, it should look something like this:

> seed_files <- system.file("extdata", "GentlemanLab", package="BSgenome")
> musFur1_seed <- list.files(seed_files, pattern="\\.musFur1-seed$", full.names=TRUE)
> read.dcf(musFur1_seed)
     Package                      
[1,] "BSgenome.Mfuro.UCSC.musFur1"
     Title                                                                   
[1,] "Full genome sequences for Mustela putorius furo (UCSC version musFur1)"
     Description                                                                                                                          
[1,] "Full genome sequences for Mustela putorius furo (Ferret) as provided by UCSC (musFur1, Apr. 2011) and stored in Biostrings objects."
     Version organism                common_name provider provider_version
[1,] "1.4.2" "Mustela putorius furo" "Ferret"    "UCSC"   "musFur1"       
     release_date release_name                                      
[1,] "Apr. 2011"  "Ferret Genome Sequencing Consortium MusPutFur1.0"
     source_url                                                  
[1,] "http://hgdownload.soe.ucsc.edu/goldenPath/musFur1/bigZips/"
     organism_biocview BSgenomeObjname
[1,] "Mustela_furo"    "Mfuro"        
     SrcDataFiles                                                                  
[1,] "musFur1.2bit from http://hgdownload.soe.ucsc.edu/goldenPath/musFur1/bigZips/"
     PkgExamples                                        
[1,] "genome$GL896898  # same as genome[[\"GL896898\"]]"
     seqs_srcdir                                                                    
[1,] "/fh/fast/morgan_m/BioC/BSgenomeForge/srcdata/BSgenome.Mfuro.UCSC.musFur1/seqs"
     seqfile_name  
[1,] "musFur1.2bit"
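
Judging from the output in the question, read.delim seems to have treated the first line of the text file as a column header and every following line as a separate row, so write.dcf wrote one record per row instead of a single record. If you would rather build the seed file from R than edit an example by hand, a minimal sketch is below. The Package, Title, Version, organism_biocview and BSgenomeObjname values are placeholders chosen for illustration, not names taken from the original post; the remaining values are copied from the question. The file is written to a temporary directory here only so the example is safe to run.

```r
# Seed fields as a named character vector: one name per DCF field.
# Package, Title, Version, organism_biocview and BSgenomeObjname are
# illustrative placeholders (assumptions); adjust them for your package.
seed <- c(
  Package           = "BSgenome.Rmarina.UNSW.RM170330",
  Title             = "Full genome sequences for Rhinella marina (UNSW version RM170330)",
  Description       = "Full genome sequences for Rhinella marina (cane toad) as provided by UNSW (RM170330).",
  Version           = "1.0.0",
  organism          = "Rhinella marina",
  common_name       = "Cane toad",
  provider          = "UNSW",
  provider_version  = "RM170330",
  release_date      = "Mar. 2018",
  release_name      = "Rhinella marina (marine toad)",
  source_url        = "https://www.ncbi.nlm.nih.gov/assembly/GCA_900303285.1/",
  organism_biocview = "Rhinella_marina",
  BSgenomeObjname   = "Rmarina",
  seqs_srcdir       = "/Users/jiazhou/Box/methylation_analysis/CaneToadRef/ncbi-genomes-2020-03-16/",
  seqfile_name      = "GCA_900303285.1_RM170330_genomic.fna"
)

# write.dcf() writes one record per row, so put all fields in a 1-row matrix
# whose column names are the field names.
seed_matrix <- matrix(seed, nrow = 1, dimnames = list(NULL, names(seed)))

# Hypothetical output location; use your own working directory in practice.
seed_path <- file.path(tempdir(), "BSgenome.Rmarina.UNSW.RM170330-seed")
write.dcf(seed_matrix, file = seed_path)

# Read the file back and confirm it holds a single record with named fields
# before handing it to forgeBSgenomeDataPkg().
check <- read.dcf(seed_path)
stopifnot(nrow(check) == 1)
colnames(check)
```

If nrow(check) is anything other than 1, the file contains blank lines or stray line separators that split it into several records, which is the condition the error message complains about.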

Hi James,

Thanks for the suggestion! Really helpful.

Jia
