Entering edit mode
Guest User
★
13k
@guest-user-4897
Last seen 9.6 years ago
I have an issue where I try to do something rather simple but run into
a strange issue.
The application is rather simple. I have a "Barcode" sequence in my
amplicon sequences using Illumina MiSeq, that follows a certain
pattern.
What I want to achieve is to identify the correct reads containing a
barcode and then crop the reads to cover only the barcode and then
save it back to a FASTQ file retaining quality and ID information of
each barcode.
Everything works well in this code except for the narrow command.
Here is my code:
rawReads <- readFastq("ChAT12-13_GGACTCCT-
TATCCTCT_L001_R1_001_50k.fastq")
rawReads <- rawReads[nFilter()(rawReads)] #Removes reads containing
N's
rawReads <- rawReads[width(rawReads) > 55] #drops reads shorter than
55 nt
refBC <- DNAString("TCAGCVHDBVHDBVHDBVHDBVHDBAATTA")
hits <- vcountPattern(refBC, sread(rawReads),
max.mismatch=0,fixed=FALSE, with.indels=FALSE)
selected <- rawReads[as.logical(hits)]
trimList <- vmatchPattern(refBC, sread(selected),
max.mismatch=0,fixed=FALSE)
selected <- narrow(selected, start = unlist(startIndex(trimList)), end
= unlist(endIndex(trimList)))
>From the last row I get the message:
> selected <- narrow(selected, start = unlist(startIndex(trimList)),
end = unlist(endIndex(trimList)))
Error in solveUserSEW(width(x), start = start, end = end, width =
width) :
'start', 'end' or 'width' is longer than 'refwidths'
I cannot figure out how this is possible. The ???endIndex" should
never be possible to be larger then the length of the sequence that is
analyzed in the ???vmatchPattern", correct? Am I missing something
fundamental?
-- output of sessionInfo():
R version 3.1.0 (2014-04-10)
Platform: x86_64-apple-darwin10.8.0 (64-bit)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] parallel stats graphics grDevices utils datasets
methods base
other attached packages:
[1] ShortRead_1.22.0 GenomicAlignments_1.0.0 BSgenome_1.32.0
Rsamtools_1.16.0 GenomicRanges_1.16.2
[6] GenomeInfoDb_1.0.2 Biostrings_2.32.0 XVector_0.4.0
IRanges_1.22.3 BiocParallel_0.6.0
[11] BiocGenerics_0.10.0
loaded via a namespace (and not attached):
[1] BatchJobs_1.2 BBmisc_1.6 Biobase_2.24.0
bitops_1.0-6 brew_1.0-6 codetools_0.2-8
[7] DBI_0.2-7 digest_0.6.4 fail_1.2
foreach_1.4.2 grid_3.1.0 hwriter_1.3
[13] iterators_1.0.7 lattice_0.20-29 latticeExtra_0.6-26
plyr_1.8.1 RColorBrewer_1.0-5 Rcpp_0.11.1
[19] RSQLite_0.11.4 sendmailR_1.1-2 stats4_3.1.0
stringr_0.6.2 tools_3.1.0 zlibbioc_1.10.0
--
Sent via the guest posting facility at bioconductor.org.