Importing genbank entries for synthetic constructs
@thomas-sandmann-6817

Dear Gabe,

Thanks a lot for making the genbankr package available. Today I tried to parse the GenBank entry for a synthetic DNA molecule, KR709867.1.

Importing this file by accession failed:

id = GBAccession("KR709867.1")

readGenBank(id)
Error in .normargIsCircular(isCircular, seqnames) : 
  length of supplied 'isCircular' must equal the number of sequences

I traced the error to the make_gbrecord function, which raises the error in the following line:

sqinfo = Seqinfo(seqlevels(srcs), width(srcs), circ, genom)

because the srcs GRanges object contains 2 ranges:

Ranges object with 2 ranges and 9 metadata columns:
                   seqnames     ranges strand |        type            organism
                      <Rle>  <IRanges>  <Rle> | <character>         <character>
  [1] synthetic construct:1 [ 1, 1311]      + |      source synthetic construct
  [2]        Homo sapiens:2 [66, 1244]      + |      source        Homo sapiens
         mol_type         db_xref           clone     focus
      <character> <CharacterList>     <character> <logical>
  [1]   other DNA     taxon:32630 CCSBHm_00007040      TRUE
  [2]   other DNA      taxon:9606            <NA>     FALSE
                                                                          note     loctype
                                                                   <character> <character>
  [1] vector:pDONR223; derived from parent clone GenBankaccession: KJ897694      normal
  [2]                                                                     <NA>      normal
      temp_grouping_id
             <integer>
  [1]                1
  [2]                2
  -------
  seqinfo: 2 sequences from an unspecified genome; no seqlengths
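
For anyone who wants to see the failure without downloading the record, here is a minimal sketch that calls Seqinfo() directly with two sequence names but only one circularity flag. The names and widths are taken from the two source ranges above; that make_gbrecord passes a single value for circ is my assumption from the error message, not something I have confirmed in the genbankr source.

```r
library(GenomeInfoDb)

# Two seqlevels, mirroring the two source features in KR709867.1
seqs <- c("synthetic construct:1", "Homo sapiens:2")
lens <- c(1311L, 1179L)  # widths of [1, 1311] and [66, 1244]

# ASSUMPTION: a single circularity flag is supplied for both sequences.
# tryCatch captures the error message instead of stopping the script.
res <- tryCatch(
  Seqinfo(seqs, lens, isCircular = FALSE),
  error = function(e) conditionMessage(e)
)
print(res)

# With one flag per sequence the constructor succeeds (illustration only,
# not a proposed fix for genbankr).
ok <- Seqinfo(seqs, lens, isCircular = c(FALSE, FALSE))
print(ok)
```

So the problem seems to be a mismatch between the number of source features and the length of the circularity vector, rather than anything specific to synthetic sequences.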

Do you intend genbankr to support synthetic constructs (plasmids, clones, etc.)? If so, this record might be a useful test case.

Thanks,

Thomas

> sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: macOS Sierra 10.12.1

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] genbankr_1.2.0       BiocInstaller_1.24.0

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.8                AnnotationDbi_1.36.0       XVector_0.14.0            
 [4] GenomicAlignments_1.10.0   GenomicRanges_1.26.1       BiocGenerics_0.20.0       
 [7] zlibbioc_1.20.0            IRanges_2.8.1              BiocParallel_1.8.1        
[10] BSgenome_1.42.0            lattice_0.20-34            R6_2.2.0                  
[13] httr_1.2.1                 rentrez_1.0.4              GenomeInfoDb_1.10.1       
[16] tools_3.3.2                grid_3.3.2                 SummarizedExperiment_1.4.0
[19] parallel_3.3.2             Biobase_2.34.0             DBI_0.5-1                 
[22] digest_0.6.10              Matrix_1.2-7.1             rtracklayer_1.34.1        
[25] S4Vectors_0.12.1           bitops_1.0-6               curl_2.3                  
[28] RCurl_1.95-4.8             biomaRt_2.30.0             memoise_1.0.0             
[31] RSQLite_1.1                GenomicFeatures_1.26.0     Biostrings_2.42.1         
[34] Rsamtools_1.26.1           stats4_3.3.2               XML_3.98-1.5              
[37] jsonlite_1.1               VariantAnnotation_1.20.2  

 

@beckergabriel-9990

Thomas,

Thanks for the report. Ideally, genbankr is intended to support anything (reasonable) provided in the GenBank (or GenPept) format. I will look into this and should be able to get it fixed. 

I will comment again when there is news.

Best,

~G
