ShortRead - readAligned() with bowtie & qual
Marc Noguera ▴ 100
@marc-noguera-3883
Dear list,

I am trying to do some quality assessment on Solexa runs using Bioconductor and ShortRead. I am using bowtie as a mapper, which yields bowtie-formatted output with FASTQ quality scores for each aligned read, such as:

    HWUSI-EAS621_91022_1_100_1938_1667 + chr15 53573544 CAGTCTCCCAAAGTACTGGGATAATAGGTGTGAGACTCC DPYWYWYYWWWWPWWYWTVWWYWWWYYWYWXWBBBBBBB 0 34:C>A,36:A>T
    HWUSI-EAS621_91022_1_100_1938_1823 - chr18 34747447 ACCCGGGAGTTGGGCTGCTTAGTGGCTGGACTCTCTTCC BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB 0 34:T>G
    HWUSI-EAS621_91022_1_100_1938_608 + chr19 35665132 CAGCTGCTCAGGAGGCTGAGGCAGGAGAATCGCTTGAGC DMTTTSRSTUTTTTTUTTTTTTTTTTQSSBBBBBBBBBB 2
    HWUSI-EAS621_91022_1_100_1938_1207 + chr22 30069585 TCTGGGCCGTGGGGAGGCTCCTCCTTGGCTGATGGCGCC DMTUTTRUTPTSTTUUUTSSTTUTBBBBBBBBBBBBBBB 0 35:T>C,37:A>C
    HWUSI-EAS621_91022_1_100_1938_222 - chr20 61020239 GCCTGGGCCTCCCGAAGTGCTGTGGTTACAGGCATGAGC BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB 2 25:A>G,34:C>G
    HWUSI-EAS621_91022_1_100_1938_1562 + chr15 84916971 TGGGTTTCACCATGGTGGCCAGGCTGGTCTCAAACTCCT DNUVUWWWWWWWWWWWWWUWWWWWWWVBBBBBBBBBBBB 0
    HWUSI-EAS621_91022_1_100_1938_1290 - chr9 120742911 AGCCCAAGAGAGCCTTCTCCTCGACCATTACCACCAATG BBBBBBBBBBBBBBBBSWRLPWUWRSWUKTWXXWXWWND 0 33:C>A,35:T>C

When I read this file with the readAligned() function:

    > aln <- readAligned("/path/", pattern="test.fastq.bwt", type="Bowtie", qualityType='FastqQuality')

I obtain an AlignedRead object, which includes quality data:

    > quality(aln)
    class: SFastqQuality
    quality:
      A BStringSet instance of length 3331015
              width seq
            [1]    35 BBB=B?:AA:@?@>?B@@AA@@A;>@4>>7922=>
            ...   ... ...
      [3331015]    33 %%/<<<1;:<<:<<<<995<<<:<::<<<:<<<

However, when I try to use these qualities to plot them, I obtain "NA" values:

    > alignQuality(aln)
    class: NumericQuality
    quality: NA NA ... NA NA (3331015 total)

So I guess there is some kind of problem when transforming the ASCII characters to numerical quality values. I have also tried reading the input with the SFastqQuality type, with no success.

What am I doing wrong?

Thanks in advance,
Marc

--
-----------------------------------------------------
Marc Noguera i Julian, PhD
Genomics unit / Bioinformatics
Institut de Medicina Preventiva i Personalitzada del Càncer (IMPPC)
B-10 Office
Carretera de Can Ruti
Camí de les Escoles s/n
08916 Badalona, Barcelona
@kasper-daniel-hansen-2979
alignQuality is not the same as quality. quality() gives the qualities of the reads, which is what you are interested in. alignQuality() gives the quality of the _alignment_, which Bowtie does not report (although one could say that a perfect-match alignment is better than a 1-mismatch alignment, and so on).

You should also have noticed that alignQuality is a numeric vector with only one element per read, whereas the qualities have one element per read per base.

So you need to operate on quality(aln).

Kasper

On Jan 20, 2010, at 6:36 AM, Marc Noguera wrote:

> However, when I try to use this qualities to plot them I obtain "NA" values
>> alignQuality(aln)
>> class: NumericQuality
>> quality: NA NA ... NA NA (3331015 total)
> So, I guess there is some kind of problem when transforming to ASCII to
> quality numerical values.
>
> What am I doing wrong?
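To make the per-base versus per-read distinction concrete, here is a minimal base-R sketch that decodes one quality string from the question into integer scores. The helper decode_quality() and the offset of 64 are assumptions made for illustration (the question does not state the encoding; 33 would be the usual alternative), and the sketch does not call ShortRead at all. On the real object, ShortRead documents a coercion of the form as(quality(aln), "matrix") for the same purpose; it is not exercised here.

```r
# Hypothetical helper: turn one FASTQ-style quality string into integer scores.
# `offset` is an assumption: 64 for older Solexa/Illumina encodings,
# 33 for Sanger-style FASTQ.
decode_quality <- function(qual, offset = 64L) {
  utf8ToInt(qual) - offset
}

# Quality string of the first read shown in the question.
q1 <- "DPYWYWYYWWWWPWWYWTVWWYWWWYYWYWXWBBBBBBB"
scores <- decode_quality(q1)

# One score per base: this is the kind of data quality(aln) holds.
length(scores) == nchar(q1)

# A per-read summary is a single number per read, which is the shape
# (though not the meaning) of what alignQuality(aln) returns.
mean(scores)
```

The decoded vector has as many entries as the read has bases, so these are the values to plot per cycle. The NA values returned by alignQuality(aln) mean only that no alignment score was available for each read.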