DEXSeq Paired-end analysis (new issues)
Bio152 ▴ 150
Hi Simon,

The sort code worked just fine. I also ran a diagnostic on the *.bam files and no errors came up. But when I tried to use the sorted *.sam.bam file with the dexseq_count.py program, I got the following messages:

    [mlinan@s59-14 MISC]$ python dexseq_count.py -p yes -s no hs71.gff ahBC010.sam.bam BC010.counts
    Traceback (most recent call last):
      File "dexseq_count.py", line 132, in <module>
        for af, ar in HTSeq.pair_SAM_alignments( HTSeq.SAM_Reader( sam_file ) ):
      File "/packages/python/python-2.7.3/lib/python2.7/site-packages/HTSeq/__init__.py", line 610, in pair_SAM_alignments
        for almnt in alignments:
      File "/packages/python/python-2.7.3/lib/python2.7/site-packages/HTSeq/__init__.py", line 549, in __iter__
        algnt = SAM_Alignment.from_SAM_line( line )
      File "_HTSeq.pyx", line 1274, in HTSeq._HTSeq.SAM_Alignment.from_SAM_line (src/_HTSeq.c:22184)
    ValueError: ('SAM line does not contain at least 11 tab-delimited fields.', 'line 1 of file ahBC010.sam.bam')

When I opened the *.sam.bam file, the first few lines, and in fact the entire file, turned out to be a collection of random symbols, accented characters and numbers.

Thanks,
Margaret
Devon Ryan ▴ 200
Hi Margaret,

It's expecting a SAM file, not a BAM file:

    samtools view -h ahBC010.sam.bam | dexseq_count.py -p yes -s no hs71.gff - ahBC010.counts

Note that "-" specifies reading from standard input rather than from a file.

Cheers,
Devon

____________________________________________
Devon Ryan, Ph.D.
Email: dpryan at dpryan.com
Tel: +49 (0)178 298-6067
Molecular and Cellular Cognition Lab
German Centre for Neurodegenerative Diseases (DZNE)
Ludwig-Erhard-Allee 2
53175 Bonn, Germany
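The unreadable characters are expected: BAM is a compressed binary format, so a plain-text SAM parser cannot split its first "line" into the 11 tab-delimited fields it needs. A minimal sketch of a check you could run before counting is below. The function name `looks_like_sam` is hypothetical and not part of HTSeq or DEXSeq; it only relies on the fact that BAM files are BGZF-compressed and therefore start with the gzip magic bytes.

```python
GZIP_MAGIC = b"\x1f\x8b"  # BGZF (and hence BAM) files begin with the gzip magic bytes


def looks_like_sam(path):
    """Return True if `path` looks like plain-text SAM.

    Hypothetical helper: a file passes if it is not gzip/BGZF-compressed and
    its first alignment line has at least 11 tab-separated fields.
    """
    with open(path, "rb") as handle:
        if handle.read(2) == GZIP_MAGIC:
            return False  # compressed, most likely BAM: convert with samtools view -h
        handle.seek(0)
        for raw in handle:
            line = raw.decode("ascii", errors="replace").rstrip("\r\n")
            if not line or line.startswith("@"):
                continue  # skip blank lines and SAM header lines
            return len(line.split("\t")) >= 11
    return False  # empty file or header only
```

If this returns False for a file that samtools reads without complaint, the file is probably BAM and should be piped through `samtools view -h` as shown above.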
I will take this chance to mention that, in the most recent versions, the Python scripts from DEXSeq also accept BAM files, as well as files sorted by position rather than only by read name. You can specify this with the parameters '-f' and '-r'. The BAM file reader also requires a recent version of HTSeq (0.5.4p4). Reading files sorted by position requires a bit more RAM; however, I was able to run it on a BAM file containing reads from a full lane on my laptop with 8 GB of RAM.

Hope it is useful!
Alejandro
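As a rough illustration of the '-f' and '-r' parameters, here is a small sketch that assembles such a command line. The helper `build_dexseq_count_command` is hypothetical, and the option values `bam`/`sam` and `pos`/`name` are my assumption about what the script accepts; please confirm them with `python dexseq_count.py -h` for your installed version.

```python
def build_dexseq_count_command(gff, alignments, out, fmt="bam", order="pos",
                               paired=True, stranded=False):
    """Build an argument list for dexseq_count.py (hypothetical helper).

    Assumes '-f' takes 'sam' or 'bam' and '-r' takes 'pos' or 'name';
    check the script's help output before relying on this.
    """
    if fmt not in ("sam", "bam"):
        raise ValueError("fmt must be 'sam' or 'bam'")
    if order not in ("pos", "name"):
        raise ValueError("order must be 'pos' or 'name'")
    return [
        "python", "dexseq_count.py",
        "-p", "yes" if paired else "no",    # paired-end data
        "-s", "yes" if stranded else "no",  # strand-specific protocol
        "-f", fmt,                          # alignment file format
        "-r", order,                        # sort order of the alignment file
        gff, alignments, out,
    ]


# Example: a position-sorted BAM file read directly, without samtools view.
command = build_dexseq_count_command("hs71.gff", "ahBC010.sam.bam", "ahBC010.counts")
print(" ".join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` once the options have been verified.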