I'm totally new to Bioconductor and hope you can help me with the following problem.
My data is a "Large DataFrame" with a different DNA sequence (seq) in every row. The sequences are of class "DNAStringSet" from the "Biostrings" package. One variable (pos) holds the position of the first nucleobase of each row. The goal is to extract the single nucleobase at one specific position. This position is not covered by every row, and the rows do not all start at the same position, so the distance between the starting position of a row and the position I'm looking for varies. The position information is also stored in @ranges, which is of class "GroupedIRanges" from the "XVector" package. So I tried using the subseq function:
data$subseq_test <- subseq(data$seq@ranges, start = 6, end = 6) # not working
# Error in (function (classes, fdef, mtable) : unable to find an inherited method for function ‘subseq’ for signature ‘"GroupedIRanges"’
data$subseq_test <- subseq(data$seq, start = 6, end = 6) # works, but not the way I want: it gives me the 6th nucleobase of every row, counting from 1 at the start of that row
As I read in another post, @ranges should not be used. My question is: how can I get the nucleobase at this one position?
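To make the goal concrete, here is a minimal sketch of the offset arithmetic I have in mind, written with plain character strings in base R rather than a DNAStringSet. The toy sequences, the target coordinate 6, and the column name base_at_target are made up for illustration, and I am assuming that pos is the 1-based coordinate of the first base in each row.

```r
# Hypothetical toy data: 'pos' is the coordinate of the first base of each row.
data <- data.frame(
  seq = c("ACGTACGT", "TTGCA", "GGGCCCAT"),
  pos = c(3, 10, 1),
  stringsAsFactors = FALSE
)

# Hypothetical coordinate of interest.
target <- 6

# 1-based offset of the target coordinate within each row.
offset <- target - data$pos + 1

# A row covers the target only if the offset falls inside its sequence.
covered <- offset >= 1 & offset <= nchar(data$seq)

# Extract the single base where the row covers the target; NA otherwise.
data$base_at_target <- NA_character_
data$base_at_target[covered] <- substring(
  data$seq[covered],
  first = offset[covered],
  last  = offset[covered]
)
```

With a DNAStringSet, I would expect the same per-row offsets to be usable as the start argument of subseq on the covered rows only (for example with width = 1), but that is an assumption I have not been able to confirm on my data.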
Here you can see some information about the data:
And my sessionInfo:
> sessionInfo()
R version 3.3.3 (2017-03-06)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X Yosemite 10.10.5

locale:
de_DE.UTF-8/de_DE.UTF-8/de_DE.UTF-8/C/de_DE.UTF-8/de_DE.UTF-8

attached base packages:
parallel stats4 stats graphics grDevices utils datasets methods base

other attached packages:
Biostrings_2.42.1 XVector_0.14.1 IRanges_2.8.2 S4Vectors_0.12.2 BiocGenerics_0.20.0 BiocInstaller_1.24.0