Question: Identify tandem repeats with Bioconductor
2.2 years ago by Vinicius Henrique da Silva

I would like to identify the regions with repeated patterns in a given genome. Let's say that I need to identify [TA]n regions, where 'n' is a variable number of repeats.

I thought of using a loop to solve the problem; however, it would take a long time and would report redundant regions. Thus, I would like to know if there is an efficient way to do this.

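library(Biostrings)

G <- readDNAStringSet("any.fa")

seqAll <- seq(from = 1, to = 1000, by = 1)
ali <- vector("list", length(seqAll))

for (k in seq_along(seqAll)) {
  nx <- seqAll[k]
  # Build the pattern made of nx copies of "AT"
  patx <- paste(rep("AT", nx), collapse = "")
  # Exact matches of the pattern in every sequence of G
  ali[[k]] <- vmatchPattern(DNAString(patx), G, max.mismatch = 0)
}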
2.1 years ago by Hervé Pagès (United States)
Hervé Pagès wrote:

Hi Vinicius,

You might want to check this post for a more efficient approach:

    A: Is there any package helps finding Tandem Repeats ?
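For readers without access to the linked post, here is a minimal sketch of one way to avoid looping over every repeat count. It is not necessarily the approach described in that post. It assumes the sequences are available as a named character vector (for a DNAStringSet, as.character(G) should give one) and that the motif contains only plain DNA letters. The helper name find_tandem_repeats and its arguments are hypothetical. A single greedy regular expression reports each maximal, non-overlapping run once, which avoids the redundant hits produced by matching every length separately.

```r
# Hypothetical helper: locate maximal runs of a tandemly repeated motif.
# Assumes `seqs` is a character vector and `motif` holds only plain
# letters (no regular-expression metacharacters).
find_tandem_repeats <- function(seqs, motif = "TA", min_repeats = 2L) {
  if (is.null(names(seqs))) names(seqs) <- seq_along(seqs)
  # Greedy quantifier: each hit is the longest run starting at that position
  pattern <- sprintf("(?:%s){%d,}", motif, as.integer(min_repeats))
  hits <- gregexpr(pattern, seqs, perl = TRUE)
  per_seq <- lapply(seq_along(hits), function(i) {
    starts <- as.integer(hits[[i]])
    if (starts[1] == -1L) return(NULL)  # no run in this sequence
    widths <- attr(hits[[i]], "match.length")
    data.frame(seqname = names(seqs)[i],
               start = starts,
               end = starts + widths - 1L,
               n_repeats = widths %/% nchar(motif),
               stringsAsFactors = FALSE)
  })
  per_seq <- Filter(Negate(is.null), per_seq)
  if (length(per_seq) == 0L) {
    return(data.frame(seqname = character(0), start = integer(0),
                      end = integer(0), n_repeats = integer(0),
                      stringsAsFactors = FALSE))
  }
  do.call(rbind, per_seq)
}

# Toy input standing in for sequences read from a FASTA file
toy <- c(chr1 = "GGTATATACCTATAGG", chr2 = "ACGT")
res <- find_tandem_repeats(toy, motif = "TA", min_repeats = 2L)
print(res)
```

Each row gives a sequence name, the 1-based start and end of a run, and the number of motif copies it contains, so 'n' is recovered from the match width instead of being enumerated up front.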


