how does ShortRead function identify score system
1
0
Entering edit mode
wang peter ★ 2.0k
@wang-peter-4647
Last seen 9.6 years ago
dear ALL: i have many RNA-seq data. some use Phred+33 system to score the quality some use Phred+63 i found no parameter to tell the functions which one should be. but it still works can you tell me which function can identify them. such is the coding: reads <- readFastq(fastqfile); qual <- FastqQuality(quality(quality(reads))); #qual <- PhredQuality(quality(quality(reads)));# readM <- as(qual, "matrix");#??R's maximum vector length is 2^31 - 1 pdf(file="boxplot.pdf"); # Save box plot as boxplot.pdf in current folder boxplot(as.data.frame((readM)), outline = FALSE, main="Per Cycle Read Quality", xlab="Cycle", ylab="Phred Quality"); # build per cycle boxplot -- shan gao Room 231(Dr.Fei lab) Boyce Thompson Institute Cornell University Tower Road, Ithaca, NY 14853-1801 Office phone: 1-607-254-1267(day) Official email:sg839 at cornell.edu Facebook:http://www.facebook.com/profile.php?id=100001986532253
cycle cycle • 1.1k views
ADD COMMENT
0
Entering edit mode
@martin-morgan-1513
Last seen 5 days ago
United States
On 09/19/2012 07:57 PM, wang peter wrote: > dear ALL: > i have many RNA-seq data. > some use Phred+33 system to score the quality > some use Phred+63 > > i found no parameter to tell the functions which one should be. > but it still works > can you tell me which function can identify them. On the help page ?readFastq it says ...: Additional arguments. In particular, 'qualityType' and 'filter': qualityType: Representation to be used for quality scores, must be one of 'Auto' (choose Phred-like if any character is ASCII-encoded as less than 59) 'FastqQuality' (Phred-like encoding), 'SFastqQuality' (Illumina encoding). so readFastq(fastqfile, qualityType="FastqQuality") over-rides the 'Auto' selection. In more detail this is implemented in the 'ShortReadQ' method for "DNAStringSet", "BStringSet", "BStringSet", below, where the heuristic is just a scan of the first 1000 reads for quality scores below 59 alf <- alphabetFrequency(head(quality, 1000), collapse = TRUE) if (any(alf) && min(which(alf != 0)) < 59) { FastqQuality } else SFastqQuality The choice is only between FastqQuality and SFastqQuality, to create PhredQuality one has to do what you do below. Martin > selectMethod("ShortReadQ", + c(sread="DNAStringSet", quality="BStringSet", id="BStringSet")) Method Definition: function (sread, quality, id, ...) { .local <- function (sread, quality, id, ..., qualityType = c("Auto", "FastqQuality", "SFastqQuality"), filter = srFilter(), withIds = TRUE) { if (!missing(filter)) .check_type_and_length(filter, "SRFilter", NA) tryCatch({ qualityType <- match.arg(qualityType) }, error = function(err) { .throw(SRError("UserArgumentMismatch", conditionMessage(err))) }) tryCatch({ qualityFunc <- switch(qualityType, Auto = { alf <- alphabetFrequency(head(quality, 1000), collapse = TRUE) if (any(alf) && min(which(alf != 0)) < 59) { FastqQuality } else SFastqQuality }, SFastqQuality = SFastqQuality, FastqQuality = FastqQuality) quality <- qualityFunc(quality) srq <- if (withIds) ShortReadQ(sread, quality, id) else ShortReadQ(sread, quality) if (!missing(filter)) srq <- srq[filter(srq)] srq }, error = function(err) { .throw(SRError("IncompatibleTypes", "message: %s", conditionMessage(err))) }) } .local(sread, quality, id, ...) } <environment: namespace:shortread=""> Signatures: sread quality id target "DNAStringSet" "BStringSet" "BStringSet" defined "DNAStringSet" "BStringSet" "BStringSet" > > such is the coding: > reads <- readFastq(fastqfile); > > qual <- FastqQuality(quality(quality(reads))); > #qual <- PhredQuality(quality(quality(reads)));# > > readM <- as(qual, "matrix");#??R's maximum vector length is 2^31 - 1 > pdf(file="boxplot.pdf"); # Save box plot as boxplot.pdf in current folder > boxplot(as.data.frame((readM)), outline = FALSE, main="Per Cycle Read > Quality", xlab="Cycle", ylab="Phred Quality"); > # build per cycle boxplot > -- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793
ADD COMMENT

Login before adding your answer.

Traffic: 661 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6