Question: Ranges on AAStrings
18 months ago by
tobias.kockmann20 wrote:

Hi BioC,

can anyone help me with the following:

I mapped a set of peptides (short AA sequences) to a set of proteins using exact string matching. Now I would like to compare the locations of the mapped peptides with each other, in order to find peptides that lie next to one another. To me this sounds like a problem one could tackle with ranges on AAString objects. So I created an AAString to represent the protein of interest:

> poi
384-letter "AAString" instance
seq: MSSMQMDPELAKQLFFEGATVVILNMPKGTEFGIDYNSWEVGPKFR...AVEATLRKKAEKFQAHLTKKFRWDFTSEPEDCAPVVVELPEGIETA


and a set of Views to represent the mapped peptides:

> v1
Views on a 384-letter AAString subject
subject: MSSMQMDPELAKQLFFEGATVVILNMPKGTEFGIDYNSWEVGPK...EATLRKKAEKFQAHLTKKFRWDFTSEPEDCAPVVVELPEGIETA
views:
    start end width
[1]     1  10    10 [MSSMQMDPEL]
[2]    10  17     8 [LAKQLFFE]
[3]    25  29     5 [NMPKG]

Are there any BioC functions that can be used to analyse these Views, for example to find views that follow each other at zero distance, or the nearest neighbour of a view? I saw that such functions exist for integer ranges (IRanges):

pcompare(x,y)

findOverlaps(x,y)

etc.

But somehow these functions do not like Views on AAStrings:

> pcompare(v1[1], v1[-1])
Error in (function (classes, fdef, mtable)  :
unable to find an inherited method for function 'pcompare' for signature '"AAString", "AAString"'
> class(v1)
[1] "XStringViews"
attr(,"package")
[1] "Biostrings"

Is there a way to make this work?

Greetings,

Tobi


Is there a reason you can't work with these as ranges directly? Presumably, if you've done local alignment between peptides and proteins, you already have a set of start and end points for the alignments, which you're using to construct the XStringViews object. Might it be easier to simply construct an IRanges or IRangesList at this stage? You can give each range a name so you can relate it back to the specific peptide if you need to.
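A minimal sketch of what that could look like, assuming the three start/end pairs shown in the question and using the peptide sequences themselves as range names (the naming is just an illustrative choice). It also assumes the ranges are sorted by start, so that comparing each range with the next one is meaningful:

```r
library(IRanges)

# Coordinates copied from the views in the question; names are the
# peptide sequences (an illustrative naming choice, not a requirement).
pep <- IRanges(start = c(1, 10, 25),
               end   = c(10, 17, 29),
               names = c("MSSMQMDPEL", "LAKQLFFE", "NMPKG"))

# Gap between each peptide and the next one along the protein.
# distance() returns 0 for ranges that overlap or are directly adjacent.
n <- length(pep)
gaps_to_next <- distance(pep[-n], pep[-1])
names(gaps_to_next) <- paste(names(pep)[-n], names(pep)[-1], sep = " -> ")
gaps_to_next
```

Pairs with a gap of 0 are the ones that overlap or touch; anything larger is the number of residues separating the two peptides.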


I agree with Mike. In other words, these range operations (pcompare, findOverlaps, etc.) don't work on Views objects, but they do work on the ranges of the Views objects. You can extract the ranges of a Views object with e.g. ranges(v1), so instead of pcompare(v1[1], v1[-1]), do pcompare(ranges(v1)[1], ranges(v1)[-1]).
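As a rough, self-contained sketch of that suggestion: the subject below is only the first 46 residues visible in the question, used as a stand-in because the full 384-residue sequence is elided in the post. The view coordinates are the ones shown above.

```r
library(Biostrings)

# Stand-in subject: just the visible prefix of the protein from the question,
# NOT the full 384-residue sequence.
poi <- AAString("MSSMQMDPELAKQLFFEGATVVILNMPKGTEFGIDYNSWEVGPKFR")
v1  <- Views(poi, start = c(1, 10, 25), end = c(10, 17, 29))

r <- ranges(v1)                     # plain IRanges, sequence dropped

# Compare the first range with each of the others; r[1] is recycled.
codes <- pcompare(r[1], r[-1])
codes
rangeComparisonCodeToLetter(codes)  # one-letter summary of each code

# Index of the nearest other range for each range (self excluded).
nearest(r)
```

Negative pcompare codes mean the first range starts before the range it is compared with; see the IRanges comparison documentation for the full code table.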

H.

written 18 months ago by Hervé Pagès

I found two exceptions. findOverlaps() and countOverlaps() work directly on the Views object if it is queried against itself (within proteins for an AAString subject):

hits <- findOverlaps(query = v1, maxgap = 1, drop.self = TRUE, drop.redundant = TRUE)
# named 'counts' rather than 'c' to avoid masking base::c()
counts <- countOverlaps(query = v1, maxgap = 1, drop.self = TRUE, drop.redundant = TRUE)

> hits
SelfHits object with 1 hit and 0 metadata columns:
      queryHits subjectHits
      <integer>   <integer>
  [1]         1           2
  -------
  queryLength: 3 / subjectLength: 3
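To relate the hit indices back to the peptides, something along these lines might help. This is only a sketch: it rebuilds v1 on the visible 46-residue prefix of the protein (the full sequence is elided above), and it runs findOverlaps() on ranges(v1) rather than on the Views object itself.

```r
library(Biostrings)

# Stand-in subject: visible prefix only, not the full 384-residue protein.
poi <- AAString("MSSMQMDPELAKQLFFEGATVVILNMPKGTEFGIDYNSWEVGPKFR")
v1  <- Views(poi, start = c(1, 10, 25), end = c(10, 17, 29))

# Self-overlaps among the view ranges, same arguments as in the post.
hits <- findOverlaps(ranges(v1), maxgap = 1,
                     drop.self = TRUE, drop.redundant = TRUE)

# Translate hit indices into the peptide sequences they refer to.
seqs  <- as.character(v1)
pairs <- data.frame(first  = seqs[queryHits(hits)],
                    second = seqs[subjectHits(hits)],
                    stringsAsFactors = FALSE)
pairs
```

Each row of pairs is one pair of peptides whose ranges overlap or lie within maxgap of each other.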
modified 18 months ago • written 18 months ago by tobias.kockmann20

True though!