Using and combining "subseq"
@kristian-ullrich-4698
Last seen 10.2 years ago
Hello Biostrings curators,

once again a question for you:

Is there an easier way to solve the following?

R code:
####################
####################
library(Biostrings)

# example sequences
seq.list = list()
seq.list[1] = "AAAAAAAAAATTTTTTTTTTGGGGGGGGGGCCCCCCCCCC"
seq.list[2] = "TTTTTTTTTTGGGGGGGGGGCCCCCCCCCCAAAAAAAAAA"
fas.seq = DNAStringSet(unlist(seq.list))

# defining start and end points of the subseqs
start1 = 1
end1 = 10

start2 = 21
end2 = 25

# creating first and second subseq
first.subseq = subseq(fas.seq, start1, end1)
second.subseq = subseq(fas.seq, start2, end2)

new.seq = DNAStringSet(apply(sapply(list(first.subseq, second.subseq), as.character), 1, function(x) paste(x, collapse="")))
names(new.seq) = names(fas.seq)
####################
####################

I basically want to combine subseqs from one DNAStringSet; something like:

subseq(DNAStringSet, start = c(start1,start2), end = c(end1,end2))

would be nice.

Thank you in anticipation

Kristian Ullrich

--
Kristian Ullrich

Leibniz Institute of Plant Biochemistry
Weinberg 3
D-06120 Halle (Saale), Germany
phone +49 345 5582 1221
fax +49 345 5582 1209
mail kullrich at ipb-halle.de
Biostrings Biostrings • 1.8k views
@harris-a-jaffee-3972
Last seen 10.1 years ago
United States
x1 = "AAAAAAAAAATTTTTTTTTTGGGGGGGGGGCCCCCCCCCC"
x2 = "TTTTTTTTTTGGGGGGGGGGCCCCCCCCCCAAAAAAAAAA"
X = DNAStringSet(c(x1, x2))

> X
  A DNAStringSet instance of length 2
    width seq
[1]    40 AAAAAAAAAATTTTTTTTTTGGGGGGGGGGCCCCCCCCCC
[2]    40 TTTTTTTTTTGGGGGGGGGGCCCCCCCCCCAAAAAAAAAA

> start1 = 1
> end1 = 10
>
> start2 = 21
> end2 = 25

s1 = subseq(X, start1, end1)
s2 = subseq(X, start2, end2)
answer = DNAStringSet(paste(s1, s2, sep=""))

> answer
  A DNAStringSet instance of length 2
    width seq
[1]    15 AAAAAAAAAAGGGGG
[2]    15 TTTTTTTTTTCCCCC

On Jun 15, 2011, at 7:36 AM, Kristian Ullrich wrote:
> I basically want to combine subseqs from one DNAStringSet,
> something like:
>
> subseq(DNAStringSet, start = c(start1,start2), end = c(end1,end2))
>
> would be nice.
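For reference, the paste-based approach in this answer can also be written with explicit coercion to character, carrying the sequence names over as the original question did. This is only a minimal sketch, assuming the Biostrings package is installed; the names seq1 and seq2 are made up for illustration and are not from the thread.

```r
library(Biostrings)

# Example sequences from the thread; the names seq1/seq2 are illustrative only
X <- DNAStringSet(c(
  seq1 = "AAAAAAAAAATTTTTTTTTTGGGGGGGGGGCCCCCCCCCC",
  seq2 = "TTTTTTTTTTGGGGGGGGGGCCCCCCCCCCAAAAAAAAAA"
))

# Extract the two windows from every sequence in the set
s1 <- subseq(X, start = 1, end = 10)
s2 <- subseq(X, start = 21, end = 25)

# paste0() works element-wise on character vectors and drops names,
# so coerce explicitly and restore the names afterwards
answer <- DNAStringSet(paste0(as.character(s1), as.character(s2)))
names(answer) <- names(X)
answer
```

Going through character vectors is fine for small inputs; the copy to and from R strings is the part that gets costly on large sets.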
On 11-06-15 11:09 AM, Harris A. Jaffee wrote:
> s1 = subseq(X, start1, end1)
> s2 = subseq(X, start2, end2)
> answer = DNAStringSet(paste(s1, s2, sep=""))

Or 'answer = xscat(s1, s2)' would be more efficient here, especially
if 's1' and 's2' contain hundreds of thousands of sequences.

Cheers,
H.

--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319
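The xscat() suggestion can be extended to the vectorized call the question asked for. The sketch below wraps subseq() and xscat() in a small helper; combine_subseqs is a hypothetical name for illustration and not part of Biostrings, and the sequence names are made up. It assumes every start/end pair lies within every sequence in the set.

```r
library(Biostrings)

# Hypothetical helper (not a Biostrings function): extract several windows
# from every sequence in 'x' and concatenate them in the given order.
combine_subseqs <- function(x, starts, ends) {
  stopifnot(length(starts) == length(ends), length(starts) >= 1)
  # One XStringSet per window, each the same length as 'x'
  pieces <- Map(function(s, e) subseq(x, start = s, end = e), starts, ends)
  # xscat() concatenates its arguments element-wise
  out <- do.call(xscat, unname(pieces))
  # Set the names explicitly, as in the original question
  names(out) <- names(x)
  out
}

# Example sequences from the thread; the names seq1/seq2 are illustrative only
X <- DNAStringSet(c(
  seq1 = "AAAAAAAAAATTTTTTTTTTGGGGGGGGGGCCCCCCCCCC",
  seq2 = "TTTTTTTTTTGGGGGGGGGGCCCCCCCCCCAAAAAAAAAA"
))

new.seq <- combine_subseqs(X, starts = c(1, 21), ends = c(10, 25))
new.seq
```

With the example above, as.character(new.seq) should give "AAAAAAAAAAGGGGG" and "TTTTTTTTTTCCCCC", matching the output shown earlier in the thread. Adding a third window only means adding one more element to starts and ends.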