The implementation of the set operations for XStringSet objects is a relic from prehistoric times. A better (and more generic) implementation is:
    setMethod("union", c("Vector", "Vector"),
        function(x, y) unique(c(x, y))
    )

    setMethod("intersect", c("Vector", "Vector"),
        function(x, y) unique(x[x %in% y])
    )

    setMethod("setdiff", c("Vector", "Vector"),
        function(x, y) unique(x[!(x %in% y)])
    )
They don't coerce to a character vector internally (so they are more efficient), and they propagate the names and metadata columns of the first argument.
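To make the three idioms concrete, here is a minimal sketch that applies the same expressions to plain character vectors in base R. The variable names and values are made up for illustration. Note that base `unique()` drops names on an ordinary vector, so this only shows the set semantics, not the propagation of names and metadata columns described above for Vector objects.

```r
# Toy inputs (hypothetical values, not taken from any real dataset).
x <- c("ACGT", "GGG", "ACGT", "TT")
y <- c("GGG", "CC")

# union: concatenate, then drop duplicates (order of first appearance is kept).
u <- unique(c(x, y))

# intersect: keep the elements of x that also occur in y, then drop duplicates.
i <- unique(x[x %in% y])

# setdiff: keep the elements of x that do not occur in y, then drop duplicates.
d <- unique(x[!(x %in% y)])

print(u)
print(i)
print(d)
```

In all three cases the result is built by subsetting or concatenating the inputs, so it keeps the type of `x` rather than going through a character representation.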
Note that right now, if you define the above methods (by copying and pasting the above code into your session), the more specific methods for XStringSet objects will get in the way. That is, dispatch will still pick the methods for XStringSet objects. So for now, to work around this, you would need to replace the occurrences of Vector with XStringSet. I'm in the process of adding the above methods to the S4Vectors package (where they belong) and removing the old methods for XStringSet objects from the Biostrings package. I'll let you know when I'm done.
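The dispatch issue can be reproduced without any Bioconductor package. The sketch below uses toy S4 classes that I made up for illustration: `ToyVector` stands in for Vector, `ToySet` for XStringSet, and `describe` is a hypothetical generic. It is only meant to show why a method on the parent class is shadowed, and why redefining the method on the more specific signature works around it.

```r
library(methods)

# Hypothetical toy classes: ToySet extends ToyVector, as XStringSet extends Vector.
setClass("ToyVector", representation(values = "character"))
setClass("ToySet", contains = "ToyVector")

# Hypothetical generic standing in for union/intersect/setdiff.
setGeneric("describe", function(x, y) standardGeneric("describe"))

# Plays the role of the old method defined for the specific class.
setMethod("describe", c("ToySet", "ToySet"),
    function(x, y) "old specific method"
)

# Plays the role of the new method defined for the parent class.
setMethod("describe", c("ToyVector", "ToyVector"),
    function(x, y) "new generic method"
)

a <- new("ToySet", values = "ACGT")
b <- new("ToySet", values = "GGG")

# Dispatch prefers the closest match, so the old specific method still wins.
before <- describe(a, b)

# Workaround: define the new code on the specific signature, replacing the old method.
setMethod("describe", c("ToySet", "ToySet"),
    function(x, y) "new generic method"
)
after <- describe(a, b)

print(before)
print(after)
```

Once the old specific method is removed or replaced, the call resolves to the new code, which is what substituting XStringSet for Vector achieves in the meantime.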