ShortRead: how to convert qualities?
robmaz77 ▴ 20

Is there a recommended way of converting FASTQ qualities? E.g., I read a phred64-encoded file with FastqStreamer, which gives me a ShortReadQ object with a quality of type SFastqQuality. After some manipulation I want to write it out as phred33. For some reason I had thought I could write it like this:

> writeFastq(fq,file="foo.fq.gz",qualityType="FastqQuality")

No idea where I got that from, but it does not seem to work anyway.

I can come up with something like this

> fq2 <- ShortReadQ(sread=sread(fq),
                    id=id(fq),
                    quality=relist(BString(intToUtf8(t(as(quality(fq),"matrix"))+33L)),
                                   quality(fq)@quality))

# and in fact
> identical(as(quality(fq),"matrix"),as(quality(fq2),"matrix"))
[1] TRUE
> writeFastq(fq2,file="bar.fq.gz")

but this is hardly the pinnacle of elegance?

ShortRead FastqQuality • 1.3k views
@martin-morgan-1513

I think you can use as(quality(fq), "PhredQuality").


Hm, that seems to force the new encoding onto the existing BStringSet without re-encoding the scores:

> range(as(quality(fq),"matrix"))
[1]  2 41
> range(as(as(quality(fq),"PhredQuality"),"matrix"))
[1] 33 72

It also appears to be a different class:

> quality(fq)
class: SFastqQuality
quality:
BStringSet object of length 1000:
...
> as(quality(fq),"PhredQuality")
PhredQuality object of length 1000:
...

Apparently it can be used instead of a BStringSet in the constructor,

> fq2 <- ShortReadQ(id=id(fq),sread=sread(fq),quality=as(quality(fq),"PhredQuality"))
> fq2
class: ShortReadQ
length: 1000 reads; width: 100 cycles

but is detected again as solexa64

> quality(fq2)
class: SFastqQuality
quality:
PhredQuality object of length 1000:
...

regaining the original scores:

> range(as(quality(fq2),"matrix"))
[1]  2 41
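
The two ranges reported above (2 to 41 and 33 to 72) are what one would expect if the coercion keeps the bytes as they are and only changes the offset used to decode them. A minimal base-R sketch of that arithmetic, using a made-up quality string; `q64` and the helper `decode` are illustrative names and not part of ShortRead:

```r
# Hypothetical phred64-encoded quality string (illustrative data only)
q64 <- "Bghi"

# Decode bytes to scores by subtracting an offset (illustrative helper)
decode <- function(q, offset) utf8ToInt(q) - offset

scores64 <- decode(q64, 64L)  # bytes read with offset 64
scores33 <- decode(q64, 33L)  # the same bytes read with offset 33

# Every position differs by the same constant, 64 - 33 = 31
shift <- unique(scores33 - scores64)

print(range(scores64))  # 2 41
print(range(scores33))  # 33 72
print(shift)            # 31
```

So a real conversion has to rewrite the bytes themselves (subtract 31 from each), not just relabel the class.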

Probably not worth wasting too much time on this, since the solution I had basically works.

For the record, here is a modified version that is slightly faster by avoiding the transpose and also does not rely on autodetection of the original format and rectangularity:

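# Re-encode phred64 qualities as phred33 and return a new ShortReadQ.
phred64To33 <- function(fq) {
  qual <- quality(quality(fq))  # underlying BStringSet of encoded qualities
  # Concatenate all quality strings and shift every byte from offset 64 to offset 33
  shifted <- intToUtf8(utf8ToInt(as.character(unlist(qual))) - 64L + 33L)
  ShortReadQ(
    id=id(fq),
    sread=sread(fq),
    # Split the shifted string back into per-read pieces, using qual as the skeleton
    quality=FastqQuality(relist(as(shifted, "BString"), qual)))
}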
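
The core of the function above is the byte shift from offset 64 to offset 33. As a rough sketch, the same step can be tried on plain character vectors in base R, without any Bioconductor objects; `phred64To33Chars` and the sample strings are hypothetical and only illustrate the arithmetic, including reads of unequal length:

```r
# Convert a character vector of phred64-encoded strings to phred33
# (illustrative helper, not part of ShortRead)
phred64To33Chars <- function(q) {
  vapply(q, function(s) {
    scores <- utf8ToInt(s) - 64L   # decode with offset 64
    intToUtf8(scores + 33L)        # re-encode with offset 33
  }, character(1), USE.NAMES = FALSE)
}

q64 <- c("hhgB", "ihg")            # made-up reads of unequal length
q33 <- phred64To33Chars(q64)
print(q33)                         # "IIH#" "JIH"
```

This loops per read, so it is likely slower than the single concatenated `utf8ToInt` call in `phred64To33`; it is only meant for checking the conversion on a few strings.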