DEXSeq update
Bio152 ▴ 150
@bio152-5954
Last seen 8.1 years ago
Hi Simon,

It appears that one of my regular unsorted SAM files is truncated.

Though the sort works fine and samtools view does not detect truncation in my file.sam.bam, I still get error messages when I try to generate the counts files:

samtools view -h file.sam.bam | python dexseq_count.py file.gff - file.counts

Do you think that the truncation may be behind the problems?

I can't check the structure of my SAM.BAM file because it's all symbols, and when I use samtools view the contents run across the screen, so I am unable to pinpoint any irregularity.

Thanks,
Margaret
Devon Ryan ▴ 200
@devon-ryan-6054
Last seen 8.3 years ago
Germany
It would be helpful if you reported the actual error messages.

____________________________________________
Devon Ryan, Ph.D.
Email: dpryan at dpryan.com
Tel: +49 (0)178 298-6067
Molecular and Cellular Cognition Lab
German Centre for Neurodegenerative Diseases (DZNE)
Ludwig-Erhard-Allee 2
53175 Bonn, Germany
After properly sorting using Simon's method and then running the DEXSeq count script, I get error messages.

Here is a sample of the error messages:

UserWarning: Read HWI-ST522:121:D0V1CACXX:8:1106:20332:74088 claims to have an aligned mate which could not be found. (Is the SAM file properly sorted?)
  "which could not be found. (Is the SAM file properly sorted?)" )
/packages/python/python-2.7.3/lib/python2.7/site-packages/HTSeq/__init__.py:598: UserWarning: Read HWI-ST522:121:D0V1CACXX:8:1106:21152:96414 claims to have an aligned mate which could not be found. (Is the SAM file properly sorted?)
  "which could not be found. (Is the SAM file properly sorted?)" )
/packages/python/python-2.7.3/lib/python2.7/site-packages/HTSeq/__init__.py:598: UserWarning: Read HWI-ST522:121:D0V1CACXX:8:1106:2563:90289 claims to have an aligned mate which could not be found. (Is the SAM file properly sorted?)
  "which could not be found. (Is the SAM file properly sorted?)" )
/packages/python/python-2.7.3/lib/python2.7/site-packages/HTSeq/__init__.py:598: UserWarning: Read HWI-ST522:121:D0V1CACXX:8:1106:2842:88054 claims to have an aligned mate which could not be found. (Is the SAM file properly sorted?)
  "which could not be found. (Is the SAM file properly sorted?)" )
/packages/python/python-2.7.3/lib/python2.7/site-packages/HTSeq/__init__.py:598: UserWarning: Read HWI-ST522:121:D0V1CACXX:8:1106:6895:17883 claims to have an aligned mate which could not be found. (Is the SAM file properly sorted?)
  "which could not be found. (Is the SAM file properly sorted?)" )
/packages/python/python-2.7.3/lib/python2.7/site-packages/HTSeq/__init__.py:598: UserWarning: Read HWI-ST522:121:D0V1CACXX:8:1106:7069:24372 claims to have an aligned mate which could not be found. (Is the SAM file properly sorted?)
  "which could not be found. (Is the SAM file properly sorted?)" )

Thanks,
Margaret
That's a warning, not an error, and you can likely ignore it. As Simon alluded to in his earlier reply, the script expects the two mates of a pair to follow one another in the file. When they don't, the warning you observed is issued. With TopHat, this can occur when (1) you don't specify --no-mixed or (2) you specify --fusion-search. TopHat is known to not always produce proper flags.

Regards,
Devon
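To make the "mates must follow one another" expectation concrete, here is a minimal sketch in Python. It is not HTSeq's actual pairing logic, only a simplified illustration: the function name `find_orphans` is hypothetical, it looks only at the read name and the FLAG bits 0x1 (read is paired) and 0x8 (mate is unmapped), and it ignores secondary alignments and multi-mapped reads.

```python
def find_orphans(sam_lines):
    """Return names of reads that claim a mapped mate which is not the next record.

    Simplified sketch: assumes a name-sorted SAM stream with one record per mate.
    """
    orphans = []
    pending = None  # name of a read still waiting for its mate
    for line in sam_lines:
        # Skip header lines and blank lines.
        if line.startswith("@") or not line.strip():
            continue
        fields = line.split("\t")
        name, flag = fields[0], int(fields[1])
        if pending is not None:
            if name == pending:
                # The mate directly follows its partner: the pair is complete.
                pending = None
                continue
            # A different read showed up first, so the mate was not adjacent.
            orphans.append(pending)
            pending = None
        # 0x1: read is paired; 0x8: mate is unmapped.
        if flag & 0x1 and not flag & 0x8:
            pending = name
    if pending is not None:
        orphans.append(pending)
    return orphans
```

Fed with the lines of a SAM stream (for example the text that samtools view -h writes), the length of the returned list relative to the total number of reads gives a rough idea of how many reads would trigger the warning.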
Simon Anders ★ 3.7k
@simon-anders-3855
Last seen 3.7 years ago
Zentrum für Molekularbiologie, Universi…
Hi Margaret

On 06/10/13 22:51, Margaret Linan wrote:
> It appears that one of my regular unsorted SAM files is truncated.

If the SAM file is unsorted, this might be annoying but okay. If it is sorted by position, you have a problem: All genes on chromosomes with high numbers will be missing and hence wrongly appear downregulated in this sample.

> Though the sort works fine and samtools view does not detect truncation
> in my file.sam.bam, I still get error messages after attempting to generate
> counts files.
>
> samtools view -h file.sam.bam | python dexseq_count.py file.gff - file.counts
>
> Do you think that the truncation may be behind the problems?

Sure, if the truncation causes some mates to be missing, htseq-count will complain about it. Whether you need to worry about it depends on whether it affects only a small fraction of the reads or more.

> I can't check the structure of my SAM.BAM file because its all symbols,
> though when I use samtools view, the contents runs across the screen, and I
> am unable to pinpoint any irregularity.

You can write the BAM file back into a SAM file ("samtools view abc_sorted.bam > abc_sorted.sam") to inspect it.

Simon
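One way to look for the position-sorted truncation problem described above, without scrolling through the SAM text by eye, is to tally mapped reads per reference sequence and list any reference declared in the header that received no reads. The following is only a sketch under assumptions: the function name `reference_coverage` is hypothetical, the input must include the header (as produced by samtools view -h), and a reference with zero reads is a hint worth checking, not proof of truncation.

```python
from collections import Counter


def reference_coverage(sam_lines):
    """Count mapped reads per reference and list declared references with none."""
    declared = []       # reference names from @SQ header lines, in header order
    counts = Counter()  # mapped reads per reference name
    for line in sam_lines:
        if line.startswith("@SQ"):
            # Header fields are tab-separated TAG:VALUE pairs; SN holds the name.
            for tag in line.rstrip("\n").split("\t")[1:]:
                if tag.startswith("SN:"):
                    declared.append(tag[3:])
        elif line.startswith("@") or not line.strip():
            continue  # other header lines and blank lines
        else:
            fields = line.split("\t")
            if int(fields[1]) & 0x4:
                continue  # FLAG bit 0x4: the read itself is unmapped
            counts[fields[2]] += 1
    missing = [name for name in declared if counts[name] == 0]
    return counts, missing
```

If the references at the end of the header order all turn up in the second return value for one sample but not for the others, that sample's file deserves a closer look before its counts are used.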