[Bioc-sig-seq] first remove low quality or adaptor
1
0
Entering edit mode
@herve-pages-1542
Last seen 20 hours ago
Seattle, WA, United States
Thanks for sending a reproducible example! This bug should be fixed in Biostrings release (2.20.4) and devel (2.21.10). The new versions of the package will normally propagate to the public repositories and become available via biocLite() in the next 24-36 hours. Cheers, H. On 11-09-23 11:35 AM, wang peter wrote: > > dear herve: > i send you the coding for the problem > rm(list=ls()) > fastqfile="query.fastq" #downloaded from > http://biocluster.ucr.edu/~tbackman/query.fastq > #################################### > ##Trimming Low Quality nucleotides## > #################################### > library(ShortRead) > reads <- readFastq(fastqfile); > seqs <- sread(reads); > #the low quality positions below the qualityCutoff will be replaced by "N" > qualityCutoff <- 20 > qual <- PhredQuality(quality(quality(reads))) # get quality score list > as PhredQuality > myqual_mat <- matrix(charToRaw(as.character(unlist(qual))), > nrow=length(qual), byrow=TRUE) # convert quality score to matrix > at <- myqual_mat < > charToRaw(as.character(PhredQuality(as.integer(qualityCutoff)))) > letter_subject <- DNAStringpasterep.int <http: rep.int="">("N", > width(seqs)[1]), collapse="")) > letter <- as(Views(letter_subject, start=1, end=rowSums(at)), > "DNAStringSet") > injectedseqs <- replaceLetterAt(seqs, at, letter) > #remove "N" on 5' and 3' end, see what is left in the middle of reads > seqsWithoutNend <-trimLRPatterns(Rpattern = letter_subject, Lpattern = > letter_subject,subject = injectedseqs) > nCount<-alphabetFrequency(seqsWithoutNend)[,"N"] > nDist<- table(nCount) > #filter all the reads with more than 2 "N" in the middle > nCutoff=2 > middleN <- nCount < nCutoff > reads<-reads[middleN] > trimmedCoords<-trimLRPatterns(Rpattern = letter_subject, Lpattern = > letter_subject,subject = injectedseqs,ranges=T) > trimmedCoords<-trimmedCoords[middleN] > trimmedReads <- narrow(reads, start=start(trimmedCoords), > end=end(trimmedCoords)) > PCR2rc<-DNAString("AGATCGGAAGAGCTCGTATGCCGTCTTCTGCTTGAAACAAA")#this > should be clipped > clippedCoords <- trimLRPatterns(Rpattern = PCR2rc, subject = > sread(trimmedReads), max.Rmismatch=0.1, with.Rindels=T,ranges=T) > clippedReads <- narrow(trimmedReads, start=start(clippedCoords), > end=end(clippedCoords)) > that is the result: > in solveUserSEW(width(x), start = start, end = end, width = width) : > solving row 686: 'allow.nonnarrowing' is FALSE and the supplied start > (0) is < 1 -- Hervé Pagès Program in Computational Biology Division of Public Health Sciences Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N, M1-B514 P.O. Box 19024 Seattle, WA 98109-1024 E-mail: hpages at fhcrc.org Phone: (206) 667-5791 Fax: (206) 667-1319
Cancer Biostrings convert Cancer Biostrings convert • 1.1k views
ADD COMMENT
0
Entering edit mode
wang peter ★ 2.0k
@wang-peter-4647
Last seen 9.6 years ago
thank you for your guys hard working and this group helped me a lot shan gao [[alternative HTML version deleted]]
ADD COMMENT
0
Entering edit mode
dear Dr. Hervé Pagès and do you know how to read the c source code of trimLRPatterns here is the hierachy of that function trimLRPatterns showMethods("trimLRPatterns") Biostrings:::.XStringSet.trimLRPatterns Biostrings:::.computeTrimStart or Biostrings:::.computeTrimEnd which.isMatchingStartingAt but i donot know how to continue to show its called functions? thx shan gao [[alternative HTML version deleted]]
ADD REPLY
0
Entering edit mode
> showMethods(which.isMatchingStartingAt, includeDefs=TRUE) and then > Biostrings:::.matchPatternAt From there, you will need the Biostrings source, in particular Biostrings/src/lowlevel_matching.c On Sep 26, 2011, at 11:04 PM, wang peter wrote: > dear Dr. Hervé Pagès > and do you know how to read the c source code of trimLRPatterns > here is the hierachy of that function > trimLRPatterns > showMethods("trimLRPatterns") > Biostrings:::.XStringSet.trimLRPatterns > Biostrings:::.computeTrimStart or Biostrings:::.computeTrimEnd > which.isMatchingStartingAt > but i donot know how to continue to show its called functions? > thx > shan gao > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/ > gmane.science.biology.informatics.conductor
ADD REPLY

Login before adding your answer.

Traffic: 884 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6