Entering edit mode
Nora Rieber
▴
40
@nora-rieber-3722
Last seen 10.3 years ago
Dear all,
I'm using the ShortRead package to read in my MAQ alignment output
files
with readAligned. Then I extract a subset of reads mapping to one
chromosome:
aln_Pat_chr<-aln_Pat[chromosome(aln_Pat)==chromosomeString]
and extend the reads with extendReads:
aln_Pat_chr_extended<-extendReads(aln_Pat_chr, seqLen=350)
However, extendReads seems to misplace the reads. Here's the output of
aln_Pat_chr:
>aln_Pat_chr
class: AlignedRead
length: 1025425 reads; width: 36 cycles
chromosome: gi|51511732|ref|NC_000016.8|NC_000016
gi|51511732|ref|NC_000016.8|NC_000016 ...
gi|51511732|ref|NC_000016.8|NC_000016
gi|51511732|ref|NC_000016.8|NC_000016
position: 553 553 ... 88698835 88762373
strand: - - ... + -
alignQuality: IntegerQuality
alignData varLabels: nMismatchBestHit mismatchQuality nExactMatch24
nOneMismatch24
and the output of aln_Pat_chr_extended:
> aln_Pat_chr_extended
$`gi|51511732|ref|NC_000016.8|NC_000016`
IRanges of length 1025426
start end width
[1] 553 902 350
[2] 239 588 350
[3] 239 588 350
[4] 239 588 350
[5] 239 588 350
[6] 239 588 350
[7] 239 588 350
[8] 239 588 350
[9] 239 588 350
... ... ... ...
[1025418] 239 588 350
[1025419] 239 588 350
[1025420] 239 588 350
[1025421] 239 588 350
[1025422] 239 588 350
[1025423] 239 588 350
[1025424] 239 588 350
[1025425] 239 588 350
[1025426] 239 588 350
While the highest mapped position is 88762373 before extension, it is
902 after extension. This happens with any dataset I use the function
with. Does anybody observe the same thing?
I tried to recreate this using the maq alignment data coming with the
ShortRead package, but using extendReads on it throws an error:
>testaln<-readAligned("~/R/2.10/ShortRead/extdata/maq","out.aln.2.txt"
,type="MAQMapview")
>readsextended<-extendReads(testaln,seqLen=40)
Error in s1[[ineg]] : attempt to select less than one element.
Am I doing something wrong?
This is my session info:
sessionInfo()
R version 2.10.0 (2009-10-26)
x86_64-unknown-linux-gnu
locale:
[1] LC_CTYPE=de_DE.UTF-8 LC_NUMERIC=C
[3] LC_TIME=de_DE.UTF-8 LC_COLLATE=de_DE.UTF-8
[5] LC_MONETARY=C LC_MESSAGES=de_DE.UTF-8
[7] LC_PAPER=de_DE.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] chipseq_0.2.0 ShortRead_1.4.0 lattice_0.17-26
BSgenome_1.14.1
[5] Biostrings_2.14.5 IRanges_1.4.6
loaded via a namespace (and not attached):
[1] Biobase_2.6.0 grid_2.10.0 hwriter_1.1 tools_2.10.0
Best,
Nora