BStringSet does not work with lists of many DNAString elements
heyi xiao ▴ 360
@heyi-xiao-3308
Last seen 8.2 years ago
United States
Dear all,

I tried to use the Biostrings/BSgenome utilities to extract DNA sequences for Entrez genes. It worked fine until I was ready to write the extracted sequences to a FASTA file. Because writeXStringSet, the only function for writing FASTA files, works only with an XStringSet object, I need to convert my list of DNAString objects into an XStringSet object. Unfortunately, the converter/constructor BStringSet only works with lists of a few DNAString elements. It produces an error on larger lists, as shown below. I am not sure how to deal with this issue. Thanks in advance for any suggestions/input!

Heyi

> exonSeq.set=BStringSet(exonSeq.list[1:30])
Error in .Call2("SharedVector_mcopy", dest, dest.offset, src, src.start, :
  subscript out of bounds
> exonSeq.set=BStringSet(exonSeq.list[1:25])
> exonSeq.set=BStringSet(exonSeq.list[1:26])
Error in .Call2("SharedVector_mcopy", dest, dest.offset, src, src.start, :
  subscript out of bounds
> exonSeq.set=BStringSet(exonSeq.list[26:30])
> exonSeq.set=BStringSet(exonSeq.list[26:40])
Error in .Call2("SharedVector_mcopy", dest, dest.offset, src, src.start, :
  subscript out of bounds

> head(exonSeq.list,3)
$`442993`
  133057-letter "DNAString" instance
seq: TGAGACGGCTTTTATTCCTGAGCTTCTGCTGCTCAC...AAAGCTGTCATCAATGAAAAAAGGTAAGAGAAAAAC

$`442994`
  23917-letter "DNAString" instance
seq: CAGTTCTGACCCACTTCAAGGTTACATCTCCAAGGT...CTTACGATTTTTGCAGATAAAAAATTTATCTGCAAA

$`442995`
  21718-letter "DNAString" instance
seq: GTCTTCTCTCCTTGCTGCTCTCAGGTAGGGGCTGGG...GGAAGAAGCAGAATAAAGCAATTTTCCTTGAAGTGA

> sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
[1] C

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] BSgenome.Oaries.NCBI.Oar3.1_1.0 Biobase_2.21.6
[3] BSgenome_1.29.0                 Biostrings_2.29.14
[5] GenomicRanges_1.13.35           XVector_0.1.0
[7] IRanges_1.19.19                 BiocGenerics_0.7.3

loaded via a namespace (and not attached):
[1] stats4_3.0.1 tools_3.0.1
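For readers following along, the conversion described in this question (a named list of DNAString objects turned into an XStringSet and written out as FASTA) can be sketched as below. This is only an illustration under stated assumptions, not a diagnosis of the error: the three short sequences are hypothetical stand-ins for exonSeq.list, the output path is a temporary file, and going through a plain character vector is one possible route that copies every sequence, so it may be costly for very large lists.

```r
library(Biostrings)

# Hypothetical stand-in for exonSeq.list: a named list of DNAString objects.
exonSeq.list <- list(
    `442993` = DNAString("TGAGACGGCTTTTATTCC"),
    `442994` = DNAString("CAGTTCTGACCCACTTCA"),
    `442995` = DNAString("GTCTTCTCTCCTTGCTGC")
)

# Convert each DNAString to a plain character string.
# vapply() keeps the list names, so the gene IDs survive the conversion.
exonSeq.chr <- vapply(exonSeq.list, as.character, character(1))

# Build a DNAStringSet from the named character vector.
exonSeq.set <- DNAStringSet(exonSeq.chr)

# Write the set to a FASTA file (a temporary file here, as an assumption).
fasta.file <- tempfile(fileext = ".fa")
writeXStringSet(exonSeq.set, fasta.file)
```

The names of the set become the FASTA record headers, so reading the file back with readDNAStringSet(fasta.file) is a quick way to check that the gene IDs were preserved.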
@herve-pages-1542
Last seen 15 hours ago
Seattle, WA, United States
Hi,

I cannot reproduce this. I've tried to call BStringSet() on a list of 100 DNAString objects of 25 million letters each and it worked.

Can you please provide a self-contained reproducible example?

Thanks,
H.

On 10/18/2013 10:46 AM, heyi xiao wrote:
> Dear all,
> I try used the Biostrings/BSgenome utilities to extract DNA sequences for Entrez genes. It worked fine till I am ready to output the extracted sequence to a fasta file. Because writeXStringSet is the only function for writing fasta files, which only works with an XStringSet object. I need to convert my list of DNAString objects into an XStringSet object. Unfortunately, the converter/constructor BStringSet only works with lists of a few DNAString elements. It produces error on larger lists as below. Not sure how to deal with the issue. Thanks for any suggestions/inputs in advance!
> Heyi
>
>> exonSeq.set=BStringSet(exonSeq.list[1:30])
> Error in .Call2("SharedVector_mcopy", dest, dest.offset, src, src.start, :
>   subscript out of bounds
>> exonSeq.set=BStringSet(exonSeq.list[1:25])
>> exonSeq.set=BStringSet(exonSeq.list[1:26])
> Error in .Call2("SharedVector_mcopy", dest, dest.offset, src, src.start, :
>   subscript out of bounds
>> exonSeq.set=BStringSet(exonSeq.list[26:30])
>> exonSeq.set=BStringSet(exonSeq.list[26:40])
> Error in .Call2("SharedVector_mcopy", dest, dest.offset, src, src.start, :
>   subscript out of bounds
>
>> head(exonSeq.list,3)
> $`442993`
>   133057-letter "DNAString" instance
> seq: TGAGACGGCTTTTATTCCTGAGCTTCTGCTGCTCAC...AAAGCTGTCATCAATGAAAAAAGGTAAGAGAAAAAC
>
> $`442994`
>   23917-letter "DNAString" instance
> seq: CAGTTCTGACCCACTTCAAGGTTACATCTCCAAGGT...CTTACGATTTTTGCAGATAAAAAATTTATCTGCAAA
>
> $`442995`
>   21718-letter "DNAString" instance
> seq: GTCTTCTCTCCTTGCTGCTCTCAGGTAGGGGCTGGG...GGAAGAAGCAGAATAAAGCAATTTTCCTTGAAGTGA
>
>> sessionInfo()
> R version 3.0.1 (2013-05-16)
> Platform: x86_64-unknown-linux-gnu (64-bit)
>
> locale:
> [1] C
>
> attached base packages:
> [1] parallel  stats     graphics  grDevices utils     datasets  methods
> [8] base
>
> other attached packages:
> [1] BSgenome.Oaries.NCBI.Oar3.1_1.0 Biobase_2.21.6
> [3] BSgenome_1.29.0                 Biostrings_2.29.14
> [5] GenomicRanges_1.13.35           XVector_0.1.0
> [7] IRanges_1.19.19                 BiocGenerics_0.7.3
>
> loaded via a namespace (and not attached):
> [1] stats4_3.0.1 tools_3.0.1
>

--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319
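A self-contained example of the kind requested in this reply might look like the following sketch. It is an assumption about what such a test could be, not the code that was actually run: random_dna() is a hypothetical helper, the sequences are random, and the size is scaled down to 100 sequences of 1000 letters each (instead of 25 million) so that it runs quickly.

```r
library(Biostrings)

set.seed(1)  # make the random sequences reproducible

# Hypothetical helper: build one random DNAString of n letters.
random_dna <- function(n) {
    letters4 <- sample(c("A", "C", "G", "T"), n, replace = TRUE)
    DNAString(paste(letters4, collapse = ""))
}

# A named list of 100 DNAString objects, 1000 letters each.
seq.list <- lapply(seq_len(100), function(i) random_dna(1000))
names(seq.list) <- paste0("seq", seq_along(seq.list))

# Same constructor call as in the original report.
seq.set <- BStringSet(seq.list)

length(seq.set)
sessionInfo()
```

If the error only appears with the real data, posting the smallest subset of the list that still fails (for example, saved with save() or dput()) together with the sessionInfo() output would make the problem easier for others to reproduce.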