Hi,
Is it possible to run ShortRead::qa2() on a ShortReadQ object rather than on a fastq file? I'm doing various trimming and manipulation with ShortRead, and since the data is already in memory I don't want to write it out just to read it back in again. This works with qa(), but for qa2() I can't quite get my head around the various QAData classes and how they fit together. What would the in-memory alternative to QAFastqSource be? Below is what I'm trying to do:
> dirPath <- system.file(package="ShortRead", "extdata", "E-MTAB-1147")
> fls <- dir(dirPath, "fastq.gz", full=TRUE)
> x1 <- qa(fls[1])
> head(x1[['perCycle']]$baseCall)
   Cycle Base Count                        lane
1      1    A  2248 ERR127302_1_subset.fastq.gz
2      1    C 10283 ERR127302_1_subset.fastq.gz
3      1    G  3105 ERR127302_1_subset.fastq.gz
4      1    T  4344 ERR127302_1_subset.fastq.gz
15     1    N    20 ERR127302_1_subset.fastq.gz
19     2    A  4965 ERR127302_1_subset.fastq.gz
> head(x1[['perCycle']]$quality)
   Cycle Quality Score Count                        lane
4      1       #     2     5 ERR127302_1_subset.fastq.gz
6      1       %     4     2 ERR127302_1_subset.fastq.gz
7      1       &     5     9 ERR127302_1_subset.fastq.gz
8      1       '     6     6 ERR127302_1_subset.fastq.gz
13     1       ,    11     3 ERR127302_1_subset.fastq.gz
14     1       -    12     6 ERR127302_1_subset.fastq.gz
> rfq <- yield(FastqSampler(fls[1], 1000000))
> x2 <- qa(rfq, lane='test')
> head(x2[['perCycle']]$baseCall)
   Cycle Base Count lane
1      1    A  2248 test
2      1    C 10283 test
3      1    G  3105 test
4      1    T  4344 test
15     1    N    20 test
19     2    A  4965 test
> head(x2[['perCycle']]$quality)
   Cycle Quality Score Count lane
4      1       #     2     5 test
6      1       %     4     2 test
7      1       &     5     9 test
8      1       '     6     6 test
13     1       ,    11     3 test
14     1       -    12     6 test
> ShortRead:::.plotCycleBaseCall(x2[['perCycle']]$baseCall)
> ShortRead:::.plotCycleQuality(x2[['perCycle']]$quality)
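As a cross-check on the per-cycle base counts, the same kind of tally can be computed straight from an in-memory ShortReadQ object. The following is only a minimal sketch, assuming that alphabetByCycle() and sread() from ShortRead are applied to the example file used above; it is not the qa2() route asked about here, and the object name abc is just illustrative.

```r
library(ShortRead)

# Same example file as in the session above
dirPath <- system.file(package="ShortRead", "extdata", "E-MTAB-1147")
fls <- dir(dirPath, "fastq.gz", full=TRUE)

# Read the whole file into memory as a ShortReadQ object
rfq <- readFastq(fls[1])

# Count each letter at each cycle: rows are alphabet letters,
# columns are cycles ('abc' is an illustrative name)
abc <- alphabetByCycle(sread(rfq))

# Show the counts for the first two cycles
abc[c("A", "C", "G", "T", "N"), 1:2]
```

The result is a plain matrix rather than the long-format table that qa() returns, so it would need reshaping before being passed to any of the plotting helpers.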
Using qa2:
> coll3 <- QACollate(QAFastqSource(fls[1]),
+ QANucleotideByCycle(),
+ QAQualityByCycle())
> x3 <- qa2(coll3, BPPARAM=SerialParam(), verbose=TRUE)
qa2,QACollate-method
qa2,QACollate1-method
qa2,QAFastqSource-method
qa2,FastqSampler-method
qa2,QANucleotideByCycle-method
qa2,QAQualityByCycle-method
flag,QASource-method
flag,ANY-method
flag,ANY-method
> x3@listData$QANucleotideByCycle@values
DataFrame with 335 rows and 4 columns
          Id     Cycle     Base     Count
    <factor> <integer> <factor> <integer>
1          1         1        A      2248
2          1         1        C     10283
3          1         1        G      3105
4          1         1        T      4344
5          1         1        N        20
...      ...       ...      ...       ...
331        1        72        A      4428
332        1        72        C      5376
333        1        72        G      5721
334        1        72        T      4467
335        1        72        N         8
> x3@listData$QAQualityByCycle@values
DataFrame with 2628 rows and 5 columns
           Id     Cycle  Quality     Score     Count
     <factor> <integer> <factor> <integer> <integer>
1           1         1        #         2         5
2           1         1        %         4         2
3           1         1        &         5         9
4           1         1        '         6         6
5           1         1        ,        11         3
...       ...       ...      ...       ...       ...
2624        1        72        E        36      1181
2625        1        72        F        37       763
2626        1        72        G        38      1672
2627        1        72        H        39      1673
2628        1        72        I        40      1398
> coll4 <- QACollate(QAData(rfq),
+ QANucleotideByCycle(),
+ QAQualityByCycle())
Error in (function (classes, fdef, mtable) :
unable to find an inherited method for function ‘QACollate’ for signature ‘"QAData"’
Thanks,
Phil

Great, thanks for the clarification and hints, Martin.