You can do something like this. First, we'll create an example QualityScaledDNAStringSet, using the example from the help page:
library(Biostrings)
x1 <- DNAStringSet(c("TTGA", "CTCN"))
q1 <- PhredQuality(c("*+,-", "6789"))
qx1 <- QualityScaledDNAStringSet(x1, q1)
Then we'll create a list containing the integer versions of the quality scores. This list has one entry for each sequence in our StringSet:
quals_list <- as(quality(qx1), "IntegerList")
> quals_list
IntegerList of length 2
[[1]] 9 10 11 12
[[2]] 21 22 23 24
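As a quick check on where these integers come from: PhredQuality stores each score as an ASCII character with an offset of 33, so the same values can be recovered by hand in base R. This is only a sketch to illustrate the encoding (the variable names are made up for the example); the coercion to IntegerList above is what you would actually use.

```r
# Convert each quality string to its ASCII codes, then subtract the Phred offset (33)
manual_q1 <- utf8ToInt("*+,-") - 33L
manual_q2 <- utf8ToInt("6789") - 33L

manual_q1  # 9 10 11 12
manual_q2  # 21 22 23 24
```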
Now we apply a function to each of these vectors of quality scores. In this case we check whether all of the entries in each vector are at least 20, and return TRUE or FALSE. You could put whatever function you want in here, based on your criteria of 'good quality'.
good_quality <- sapply(quals_list, FUN = function(x) {
return(all(x >= 20))
})
Finally we subset by this set of TRUE/FALSE values to keep only the good ones.
qx1_good <- qx1[good_quality]
> qx1_good
A QualityScaledDNAStringSet instance containing:
A DNAStringSet instance of length 1
width seq
[1] 4 CTCN
A PhredQuality instance of length 1
width seq
[1] 4 6789
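The "all scores at least 20" rule is just one possible criterion. As a sketch of swapping in a different one, the example below keeps sequences whose mean quality is at least 20. The cutoff and the names min_mean_quality, good_mean_quality and qx1_good_mean are hypothetical, chosen only for illustration.

```r
library(Biostrings)

# Rebuild the example data so this block runs on its own
x1 <- DNAStringSet(c("TTGA", "CTCN"))
q1 <- PhredQuality(c("*+,-", "6789"))
qx1 <- QualityScaledDNAStringSet(x1, q1)
quals_list <- as(quality(qx1), "IntegerList")

# Hypothetical cutoff: keep sequences whose mean quality is at least 20
min_mean_quality <- 20
good_mean_quality <- sapply(quals_list, FUN = function(x) {
  mean(x) >= min_mean_quality
})

# Subset exactly as before, using the new TRUE/FALSE vector
qx1_good_mean <- qx1[good_mean_quality]
```

With this example data the first sequence has a mean quality of 10.5 and the second 22.5, so only the second one is kept.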
It's much more efficient to work on the *List directly with single function calls, rather than looping over its elements with sapply.
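A minimal sketch of what that could look like for the filter above, assuming the comparison and all() are applied to the IntegerList as a whole (the IRanges *List classes evaluate these per list element). The names good_quality_vec and qx1_good_vec are made up for the example.

```r
library(Biostrings)

# Rebuild the example data so this block runs on its own
x1 <- DNAStringSet(c("TTGA", "CTCN"))
q1 <- PhredQuality(c("*+,-", "6789"))
qx1 <- QualityScaledDNAStringSet(x1, q1)
quals_list <- as(quality(qx1), "IntegerList")

# quals_list >= 20 gives a LogicalList; all() then returns one
# TRUE/FALSE per sequence, with no explicit loop over the elements
good_quality_vec <- all(quals_list >= 20)

qx1_good_vec <- qx1[good_quality_vec]
```

On a two-sequence example the difference is not noticeable, but on a large read set a single call like this avoids invoking an R function once per sequence.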