Error in XVector while creating a custom BSgenome package
ROka ▴ 10
@roka-11670

Hi, I am trying to create a custom BSgenome package for the maize genome. I keep getting an error from XVector that I do not know how to solve.

Here is what I do and what I get:

> library(BSgenome)
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:parallel’:

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from ‘package:stats’:

    IQR, mad, xtabs

The following objects are masked from ‘package:base’:

    anyDuplicated, append, as.data.frame, cbind, colnames, do.call,
    duplicated, eval, evalq, Filter, Find, get, grep, grepl, intersect,
    is.unsorted, lapply, lengths, Map, mapply, match, mget, order,
    paste, pmax, pmax.int, pmin, pmin.int, Position, rank, rbind,
    Reduce, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit

Loading required package: S4Vectors
Loading required package: stats4

Attaching package: ‘S4Vectors’

The following objects are masked from ‘package:base’:

    colMeans, colSums, expand.grid, rowMeans, rowSums

Loading required package: IRanges
Loading required package: GenomeInfoDb
Loading required package: GenomicRanges
Loading required package: Biostrings
Loading required package: XVector
Loading required package: rtracklayer

> forgeBSgenomeDataPkg("seed_Zea_mays.AGPv4.txt")
Creating package in ./BSgenome.Zmays.EnsemblPlants.AGPv4r32
Loading '1' sequence from FASTA file '~/Documents/ref/AGPv4/Zea_mays.AGPv4.dna.chromosome.1.fa' ... DONE
Loading '2' sequence from FASTA file '~/Documents/ref/AGPv4/Zea_mays.AGPv4.dna.chromosome.2.fa' ... DONE
Loading '3' sequence from FASTA file '~/Documents/ref/AGPv4/Zea_mays.AGPv4.dna.chromosome.3.fa' ... DONE
Loading '4' sequence from FASTA file '~/Documents/ref/AGPv4/Zea_mays.AGPv4.dna.chromosome.4.fa' ... DONE
Loading '5' sequence from FASTA file '~/Documents/ref/AGPv4/Zea_mays.AGPv4.dna.chromosome.5.fa' ... DONE
Loading '6' sequence from FASTA file '~/Documents/ref/AGPv4/Zea_mays.AGPv4.dna.chromosome.6.fa' ... DONE
Loading '7' sequence from FASTA file '~/Documents/ref/AGPv4/Zea_mays.AGPv4.dna.chromosome.7.fa' ... DONE
Loading '8' sequence from FASTA file '~/Documents/ref/AGPv4/Zea_mays.AGPv4.dna.chromosome.8.fa' ... DONE
Loading '9' sequence from FASTA file '~/Documents/ref/AGPv4/Zea_mays.AGPv4.dna.chromosome.9.fa' ... DONE
Loading '10' sequence from FASTA file '~/Documents/ref/AGPv4/Zea_mays.AGPv4.dna.chromosome.10.fa' ... DONE
Loading '1' sequence from FASTA file '~/Documents/ref/AGPv4/Zea_mays.AGPv4.dna.chromosome.1.fa' ... DONE
Loading '2' sequence from FASTA file '~/Documents/ref/AGPv4/Zea_mays.AGPv4.dna.chromosome.2.fa' ... DONE
Loading '3' sequence from FASTA file '~/Documents/ref/AGPv4/Zea_mays.AGPv4.dna.chromosome.3.fa' ... DONE
Loading '4' sequence from FASTA file '~/Documents/ref/AGPv4/Zea_mays.AGPv4.dna.chromosome.4.fa' ... DONE
Loading '5' sequence from FASTA file '~/Documents/ref/AGPv4/Zea_mays.AGPv4.dna.chromosome.5.fa' ... DONE
Loading '6' sequence from FASTA file '~/Documents/ref/AGPv4/Zea_mays.AGPv4.dna.chromosome.6.fa' ... DONE
Loading '7' sequence from FASTA file '~/Documents/ref/AGPv4/Zea_mays.AGPv4.dna.chromosome.7.fa' ... DONE
Loading '8' sequence from FASTA file '~/Documents/ref/AGPv4/Zea_mays.AGPv4.dna.chromosome.8.fa' ... DONE
Loading '9' sequence from FASTA file '~/Documents/ref/AGPv4/Zea_mays.AGPv4.dna.chromosome.9.fa' ... DONE
Loading '10' sequence from FASTA file '~/Documents/ref/AGPv4/Zea_mays.AGPv4.dna.chromosome.10.fa' ... DONE
Error in XVector:::new_XVectorList_from_list_of_XVector(tmp_class, x) :
  all elements in 'x' must be DNAString objects

This is the seed file content:

Package: BSgenome.Zmays.EnsemblPlants.AGPv4r32
Title: Zea mays (EnsemblPlants AGPv4 release 32)
Description: Zea mays full genome as provided by EnsemblPlants (AGPv4, release 32)
Version: 4.32
organism: Zea mays
common_name: maize
provider: EnsemblPlants
provider_version: 4.32
release_date: Aug. 2016
release_name: AGPv4
source_url: ftp://ftp.ensemblgenomes.org/pub/release-32/plants/fasta/zea_mays/dna/
organism_biocview: Zea_mays
BSgenomeObjname: Zmays
seqs_srcdir: ~/Documents/ref/AGPv4
seqfiles_prefix: Zea_mays.AGPv4.dna.chromosome.
seqfiles_suffix: .fa
seqnames: paste(c(1:9, 10, paste(c(1:9, 10), sep="")), sep="")

The versions I use are:

> sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.3 LTS

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base     

other attached packages:
[1] BSgenome_1.40.1      rtracklayer_1.32.2   Biostrings_2.40.2   
[4] XVector_0.12.1       GenomicRanges_1.24.3 GenomeInfoDb_1.8.7  
[7] IRanges_2.6.1        S4Vectors_0.10.3     BiocGenerics_0.18.0

loaded via a namespace (and not attached):
 [1] XML_3.98-1.4               Rsamtools_1.24.0          
 [3] bitops_1.0-6               GenomicAlignments_1.8.4   
 [5] zlibbioc_1.18.0            BiocParallel_1.6.6        
 [7] tools_3.3.1                Biobase_2.32.0            
 [9] RCurl_1.95-4.8             SummarizedExperiment_1.2.3


Does anyone know how to fix the problem? 

Thank you in advance! 

Mike Smith ★ 6.5k
@mike-smith
EMBL Heidelberg

I haven't looked into why this causes an error, but I think it is related to the fact that you're reading each of the sequence files twice: the seqnames expression in your seed file produces every chromosome name twice. Try changing the final line of your seed file to the line below and give it another try.

seqnames: paste0(1:10)
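
As a quick check (a sketch only, not a diagnosis of the XVector error itself), the two seqnames expressions can be evaluated in a plain R session to see how many names each one yields. The variable names `original` and `fixed` are just illustrative and are not part of the seed file.

```r
# The seqnames expression from the seed file in the question.
# c() coerces the numbers to character and appends the second
# "1".."10" vector, so every chromosome name appears twice.
original <- paste(c(1:9, 10, paste(c(1:9, 10), sep="")), sep="")
length(original)              # 20
anyDuplicated(original) > 0   # TRUE

# The suggested replacement yields each name exactly once.
fixed <- paste0(1:10)
length(fixed)                 # 10
anyDuplicated(fixed) > 0      # FALSE
```

The 20 names from the original expression match the log above, where each chromosome file is loaded twice before the error appears.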

I never could make it work with reading the sequence files once, but your solution worked!  Thank you! 
