robmaz77 ▴ 20

Is there a recommended way of converting FASTQ qualities? E.g., I read a phred64-encoded file with FastqStreamer, which gives me a ShortReadQ object with a quality of type SFastqQuality. After some manipulation I want to write it as phred33. For some reason I had thought I could write like this

> writeFastq(fq,file="foo.fq.gz",qualityType="FastqQuality")


No idea where I got that from, but it does not seem to work anyway.

I can come up with something like this

> fq2 <- ShortReadQ(sread=sread(fq),
                    id=id(fq),
                    quality=relist(BString(intToUtf8(t(as(quality(fq),"matrix"))+33L)),
                                   quality(fq)@quality))

# and in fact
> identical(as(quality(fq),"matrix"),as(quality(fq2),"matrix"))
[1] TRUE
> writeFastq(fq2,file="bar.fq.gz")


but this is hardly the pinnacle of elegance?

@martin-morgan-1513

I think you can use as(quality(fq), "PhredQuality").


Hm, that seems to force the encoding on the existing BStringSet, but not re-encode the score:

> range(as(quality(fq),"matrix"))
[1]  2 41
> range(as(as(quality(fq),"PhredQuality"),"matrix"))
[1] 33 72
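
The two ranges above differ by exactly 64 - 33 = 31, which is what one would expect if the underlying bytes are unchanged and only the decoding offset differs. A minimal base-R sketch of that arithmetic, assuming the usual offsets of 64 and 33 (the variable names `raw_codes` and `recoded` are made up for illustration):

```r
# ASCII codes of the quality characters whose offset-64 scores are 2 and 41
raw_codes <- c(2L, 41L) + 64L            # 66 and 105
intToUtf8(raw_codes, multiple = TRUE)    # "B" "i"

# The same bytes decoded with each offset
raw_codes - 64L                          # 2 41   (phred64 reading)
raw_codes - 33L                          # 33 72  (phred33 reading of unchanged bytes)

# Actually re-encoding means shifting the bytes themselves
recoded <- raw_codes - 64L + 33L         # 35 and 74
intToUtf8(recoded, multiple = TRUE)      # "#" "J"
recoded - 33L                            # 2 41 again
```

So a true conversion has to rewrite the quality characters, not just change the class that interprets them.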


It also appears to be a different class:

> quality(fq)
class: SFastqQuality
quality:
BStringSet object of length 1000:
...
> as(quality(fq),"PhredQuality")
PhredQuality object of length 1000:
...


Apparently it can be used instead of a BStringSet in the constructor,

> fq2 <- ShortReadQ(id=id(fq),sread=sread(fq),quality=as(quality(fq),"PhredQuality"))
> fq2
length: 1000 reads; width: 100 cycles


but it is detected again as solexa64:

> quality(fq2)
class: SFastqQuality
quality:
PhredQuality object of length 1000:
...


regaining the original scores:

> range(as(quality(fq2),"matrix"))
[1]  2 41


Probably not worth wasting too much time on this, since the solution I had basically works.

For the record, here is a modified version that is slightly faster by avoiding the transpose and also does not rely on autodetection of the original format and rectangularity:

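phred64To33 <- function(fq) {
    # Shift every quality character from offset 64 to offset 33,
    # then rebuild the read set with an explicit FastqQuality.
    ShortReadQ(sread=sread(fq),
               id=id(fq),
               quality=FastqQuality(
                   relist(
                       as(intToUtf8(utf8ToInt(as.character(unlist(quality(quality(fq)))))-64L+33L),
                          "BString"),
                       quality(quality(fq)))))
}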
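
For anyone who wants to check the character shift in isolation, here is a small base-R sketch of the same idea applied to plain strings, without ShortRead. The helper name `shift_quality` and its `from`/`to` arguments are hypothetical, introduced only for this illustration; it assumes the input really is offset-64 encoded and does no validation.

```r
# Hypothetical helper: shift each quality character from one ASCII offset
# to another (default phred64 -> phred33). Works on a plain character vector.
shift_quality <- function(q, from = 64L, to = 33L) {
  vapply(
    q,
    function(s) intToUtf8(utf8ToInt(s) - from + to),  # shift codes, rebuild string
    character(1),
    USE.NAMES = FALSE
  )
}

# "h" is ASCII 104 (score 40 at offset 64); "B" is ASCII 66 (score 2)
shift_quality(c("hhhh", "BBBh"))
# [1] "IIII" "###I"
```

Unlike the function above, this loops over reads one string at a time, so it is likely slower on large files; it is meant as a way to sanity-check the offsets rather than as a replacement.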