Using DNAStrings with Rcpp without converting to Character
James • 0
Last seen 7 months ago
United Kingdom

Hi there!

I'm writing some code to extract the Accumulated Natural Vectors from all the sequences in a DNAStringSet object. To speed things up I've written the code in C++ using Rcpp, and it works as long as I convert each of the DNAString objects to character vectors first.

For larger sequences this conversion is a bottleneck, and I was wondering if I can avoid it and pass the DNAString object directly. However, I can't find any documentation for passing a DNAString (or more generally, a BString object) to C++ with Rcpp. Is there a recommended way of doing this?

All the best!

Biostrings DNAString Rcpp • 244 views
Last seen 12 hours ago
Seattle, WA, United States

Hi James,

Sorry for missing this. Do you still need to do this?

The standard way to access the string data of an XStringSet object at the C level is to use the "XStringSet_holder interface". Unfortunately, this interface is not documented. Note that for DNAStringSet and RNAStringSet objects the string data is encoded, which can make things a little complicated: the bytes you see at the C level are internal codes rather than ASCII letters. A few Bioconductor packages have figured out how to do this. See, for example, the XStringSet2ByteStringVec function in the kebabs package.

Other Bioconductor packages using the "XStringSet_holder interface": ShortRead, VariantFiltering, and DECIPHER.
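
To illustrate why the encoding matters, here is a minimal, self-contained C++ sketch of counting nucleotides directly from encoded bytes, without converting to character first. It does not call Biostrings at all: `EncodedView`, `base_index`, `count_bases`, and `encode_dna` are hypothetical names made up for this example. The sketch assumes the Biostrings convention that DNA letters are stored one byte per letter as bit flags (A=1, C=2, G=4, T=8, with N as the union 15). In a real package the pointer and length would come from the holder interface declared in `Biostrings_interface.h` (I believe via `hold_XStringSet()` and `get_elt_from_XStringSet_holder()`, but please check the header shipped with your installed version).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the (pointer, length) view that a holder
// exposes for one sequence. Not the actual Biostrings struct.
struct EncodedView {
    const unsigned char* ptr;
    std::size_t length;
};

// Assumed encoding: A=1, C=2, G=4, T=8. Anything else (ambiguity codes,
// gaps) is reported as -1 and ignored by the counter below.
inline int base_index(unsigned char code) {
    switch (code) {
        case 1: return 0;  // A
        case 2: return 1;  // C
        case 4: return 2;  // G
        case 8: return 3;  // T
        default: return -1;
    }
}

// Count A, C, G, T by comparing against the codes directly,
// so no per-letter decoding and no character copy is needed.
std::array<std::size_t, 4> count_bases(const EncodedView& seq) {
    assert(seq.ptr != nullptr || seq.length == 0);
    std::array<std::size_t, 4> counts{};  // zero-initialised
    for (std::size_t i = 0; i < seq.length; ++i) {
        const int k = base_index(seq.ptr[i]);
        if (k >= 0) ++counts[static_cast<std::size_t>(k)];
    }
    return counts;
}

// Helper for trying the sketch outside R: builds encoded bytes from a
// plain string. Every letter other than A, C, G, T is stored as 15 (N),
// which is a simplification made for this example only.
std::vector<unsigned char> encode_dna(const std::string& s) {
    std::vector<unsigned char> out;
    out.reserve(s.size());
    for (char c : s) {
        switch (c) {
            case 'A': out.push_back(1); break;
            case 'C': out.push_back(2); break;
            case 'G': out.push_back(4); break;
            case 'T': out.push_back(8); break;
            default:  out.push_back(15); break;
        }
    }
    return out;
}
```

The same loop shape should carry over to position-based statistics such as the ones needed for natural vectors, since the index `i` is available alongside each code. To build against Biostrings itself, a package would typically also need `LinkingTo: Biostrings` (plus its dependencies) in DESCRIPTION; the packages listed above show working setups.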

Let me know if you need further help with this.



