Question: convert a sequence to Ranges object
2.2 years ago by
Assa Yeroslaviz1.4k
Munich, Germany
Assa Yeroslaviz1.4k wrote:

Hi,

Is there a way to convert a sequence (in my case a FASTA character vector) into an IRanges object based on a numeric vector? The vector contains the positions of a specific pattern in the FASTA sequence.

> myseq
[1] "MKLSVNEAQLGFPESLKTGQMMDESDEDFKELCASFFQRVKKHGIKEVSGE"
> Positions <- words.pos("K", myseq)
> Positions
[1]  2 17 30 41 42 46

I would like to convert the sequence into an IRanges object where the positions of the pattern give the end position of each range. Each start position should be one greater than the previous end position.

It should look something like this:

IRanges object with 90 ranges and 0 metadata columns:
           start       end     width
       <integer> <integer> <integer>
   [1]         1         2         2
   [2]         3        17        15
   [3]        18        30        13 ...

What I have so far is this:

> Start <- c(1, Positions+1)
> End <- c(Positions, nchar(myseq))
> myRanges <- IRanges(start = Start, end = End)

Is there a more efficient method to do it? 

I also have the constraint here that the pattern positions are taken as the end positions. But what if I want the pattern at the beginning of each range instead of the end?

Thanks for any advice.

Assa

iranges fasta split • 500 views
modified 2.2 years ago by Michael Lawrence10k • written 2.2 years ago by Assa Yeroslaviz1.4k
Answer: convert a sequence to Ranges object
2.2 years ago by
United States
Michael Lawrence10k wrote:
PartitioningByEnd(c(Positions, nchar(myseq)))
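
For context, here is a minimal sketch of what this call works with, using the `myseq` value from the question. It is only an illustration under a few assumptions: the positions are found with base R's `gregexpr()` rather than `words.pos()`, the start positions are also computed by hand so the result can be checked without Bioconductor, and the `requireNamespace()` guard is just there so the snippet still runs when IRanges is not installed.

```r
myseq <- "MKLSVNEAQLGFPESLKTGQMMDESDEDFKELCASFFQRVKKHGIKEVSGE"

# Positions of "K", found with base R instead of words.pos() (an assumption
# made here so the sketch does not depend on seqinr).
Positions <- as.integer(gregexpr("K", myseq, fixed = TRUE)[[1]])

# Each range ends at a "K"; the last range ends at the end of the sequence.
End <- c(Positions, nchar(myseq))

# By hand: every start is one past the previous end, the first start is 1.
Start <- c(1L, head(End, -1L) + 1L)

if (requireNamespace("IRanges", quietly = TRUE)) {
  # PartitioningByEnd() only needs the (sorted) end positions.
  parts <- IRanges::PartitioningByEnd(End)
  print(parts)
}
```

If a plain IRanges object is needed afterwards, something like `IRanges(start(parts), end(parts))` should give one with the same ranges.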
written 2.2 years ago by Michael Lawrence10k

This covers my problem when the pattern I am looking for is at the end of the sub-sequences, as in the case above. But what if I would like the pattern at the beginning of my sub-sequences? (Here I can probably use Positions - 1.) Or what if I am looking for two different amino acids (like "K" and "R") and would like to cut the sequence before "K" but after "R", etc.?

I know it sounds very complicated, but is there a more flexible way of looking for specific patterns and deciding how to cut based on which pattern was found?

modified 2.2 years ago • written 2.2 years ago by Assa Yeroslaviz1.4k

It depends on the specific case. In special cases, just do the math directly and pass the endpoints to the IRanges constructor.
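
As a rough sketch of "doing the math directly" for the mixed case asked about above (cut before "K" but after "R"): the helper `find_pos()` is hypothetical and written only for this example, it is not part of IRanges, and base R's `gregexpr()` stands in for `words.pos()`. The idea is to turn every cut rule into an end position, merge the end positions, and derive the starts from them.

```r
myseq <- "MKLSVNEAQLGFPESLKTGQMMDESDEDFKELCASFFQRVKKHGIKEVSGE"

# Hypothetical helper (not part of IRanges): all positions of one residue.
find_pos <- function(residue, seq) {
  hits <- gregexpr(residue, seq, fixed = TRUE)[[1]]
  as.integer(hits[hits > 0])  # gregexpr() returns -1 when nothing matches
}

n <- nchar(myseq)
cut_before <- find_pos("K", myseq) - 1L  # a range ends just before each "K"
cut_after  <- find_pos("R", myseq)       # a range ends on each "R"

# Merge all end points, drop 0 (a "K" at position 1), add the sequence end.
End <- sort(unique(c(cut_before[cut_before > 0L], cut_after, n)))
Start <- c(1L, head(End, -1L) + 1L)

# The pieces themselves, useful for checking the cuts by eye.
pieces <- substring(myseq, Start, End)

if (requireNamespace("IRanges", quietly = TRUE)) {
  print(IRanges::IRanges(start = Start, end = End))
}
```

Setting `cut_after` to an empty integer vector reduces this to the "pattern at the beginning of each range" case, and other rules can be added by contributing more end positions before the `sort(unique(...))` step.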

written 2.2 years ago by Michael Lawrence10k

Thanks, that is what I was doing, but it is sometimes not so straightforward.

written 2.2 years ago by Assa Yeroslaviz1.4k