Using DNAStrings with Rcpp without converting to Character
James • 0
@1cecd82d
Last seen 2.0 years ago
United Kingdom

Hi there!

I'm writing some code to extract the Accumulated Natural Vectors from all the sequences in a DNAStringSet object. To speed things up I've written the code in C++ using Rcpp, and it works as long as I convert each of the DNAString objects to character vectors first.

For larger sequences this conversion is a bottleneck, and I was wondering if I can avoid it and pass the DNAString object directly. However, I can't find any documentation for passing a DNAString (or more generally, a BString object) to C++ with Rcpp - is there a best practice way of doing this?

All the best!

Biostrings DNAString Rcpp • 725 views
@herve-pages-1542
Last seen 6 hours ago
Seattle, WA, United States

Hi James,

Sorry for missing this. Do you still need to do this?

The standard way to access the string data of an XStringSet object at the C level is to use the "XStringSet_holder interface". Unfortunately, this interface is not documented. Note that for DNAStringSet and RNAStringSet objects the string data is stored encoded rather than as plain ASCII letters, which can make things a little complicated. A few Bioconductor packages have figured out how to do this; see, for example, the XStringSet2ByteStringVec function in the kebabs package.

Other Bioconductor packages using the "XStringSet_holder interface": ShortRead, VariantFiltering, and DECIPHER.
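
To illustrate why the encoding matters, here is a minimal, self-contained C++ sketch that computes per-base counts and position sums directly on encoded bytes, without first decoding to a character string. It assumes the DNA byte codes are A=1, C=2, G=4, T=8 (with other values treated as ambiguity codes or gaps); please check this against the Biostrings version you build against. `EncodedView`, `BaseStats`, `base_index`, `accumulate_bases` and `encode_dna` are hypothetical names invented for this example and are not part of Biostrings or Rcpp. In a real package the (pointer, length) view would come from the XStringSet_holder interface rather than from `encode_dna`.

```cpp
#include <array>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a read-only (pointer, length) view of one
// encoded sequence. Not a Biostrings type.
struct EncodedView {
    const char* ptr;
    int length;
};

// Per-base statistics; index 0..3 corresponds to A, C, G, T.
struct BaseStats {
    std::array<long, 4> count{};      // number of occurrences of each base
    std::array<double, 4> pos_sum{};  // sum of 1-based positions of each base
    long other = 0;                   // bytes that are not plain A/C/G/T
};

// Map an encoded byte to 0..3, or -1 for anything else.
// ASSUMPTION: byte codes A=1, C=2, G=4, T=8.
inline int base_index(unsigned char code) {
    switch (code) {
        case 1: return 0;  // A
        case 2: return 1;  // C
        case 4: return 2;  // G
        case 8: return 3;  // T
        default: return -1;
    }
}

// Single pass over the encoded bytes; no character conversion is done.
BaseStats accumulate_bases(const EncodedView& seq) {
    BaseStats s;
    for (int i = 0; i < seq.length; ++i) {
        int k = base_index(static_cast<unsigned char>(seq.ptr[i]));
        if (k < 0) {
            ++s.other;
        } else {
            ++s.count[k];
            s.pos_sum[k] += i + 1;  // 1-based position
        }
    }
    return s;
}

// Test helper only: builds an encoded buffer from plain letters using the
// assumed codes, standing in for data that would normally come from R.
// Any letter other than A/C/G/T is mapped to 15 (assumed code for N).
std::vector<char> encode_dna(const std::string& letters) {
    std::vector<char> out;
    out.reserve(letters.size());
    for (char c : letters) {
        switch (c) {
            case 'A': out.push_back(1); break;
            case 'C': out.push_back(2); break;
            case 'G': out.push_back(4); break;
            case 'T': out.push_back(8); break;
            default:  out.push_back(15); break;
        }
    }
    return out;
}
```

Dividing `pos_sum[k]` by `count[k]` gives the mean position of each base, so quantities of this kind can be accumulated in one pass over the encoded data.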

Let me know if you need further help with this.

Best,

H.
