2
0
Entering edit mode
danova_fr ▴ 20
@danova_fr-7502
Last seen 7.5 years ago
Andorra

I´m processing a fastq file of reads that have already paired. Trying to get the GC content i get this error ???

> fq <- readFastq(file)
> fq
length: 50545 reads; width: 79..503 cycles
> gc <- gcContent(fq)
Warning messages:
1: In if (nbins > .Machine\$integer.max) stop("attempt to make a table with >= 2^31 elements") :
the condition has length > 1 and only the first element will be used
2: In if (is.na(nbins)) stop("invalid value of 'nbins'") :
the condition has length > 1 and only the first element will be used

Looking at the fastqfile

> qual <- as(quality(fq), "matrix")
> qual
[,501] [,502] [,503]
[1,]     NA     NA     NA
[2,]     NA     NA     NA


I´ve noticed that the qual object contains NA values at the end of the sequence. Any idea ??

I have merged pairs using bbmerge first but don´t see any problem in the fastq file. I have tried other programs and fastq file look ok.

0
Entering edit mode

gcContent is not a function in the ShortRead package. Where did you get it from? Type gcContent at the command line (no parentheses()) to see its source. Add this information to your question.

The NA's are present because you have asked to see the quality scores as a matrix. A matrix has to be rectangular (each row with same number of reads) so the quality scores of short reads have been 'padded' with NA values.

0
Entering edit mode
danova_fr ▴ 20
@danova_fr-7502
Last seen 7.5 years ago
Andorra

Sorry,

Got it from the manual.

gcContent <- function(x)
{
tabulate(1 + gcPerRead, 1 + wd)
}

tabulate(1 + gcPerRead, 1 + wd) seems to returning the previous error

0
Entering edit mode

'The manual' is not really descriptive enough! I think you can change

wd <- unique(width(sread(x)))

to

wd <- max(width(sread(x)), 0L)

Since your reads are different widths, it doesn't necessarily make sense to count the number of G and C nucleotides; maybe it is more helpful to calculate GC frequency of each read

gc <- letterFrequency(sread(x), "GC", as.prob=TRUE)
hist(gc)

0
Entering edit mode
danova_fr ▴ 20
@danova_fr-7502
Last seen 7.5 years ago
Andorra

great,

thanks so much,

david