badribio
Hi, I am trying to process a sorted BAM alignment (from hisat2) and receive the following error:
.local/lib/python2.7/site-packages/HTSeq-0.6.1-py2.7-linuxx86_64.egg/HTSeq/__init__.py:622: UserWarning: 201228244 reads with missing mate encountered.
warnings.warn( "%d reads with missing mate encountered." % mate_missing_count[0] )
I have asked for help on Biostars as well (https://www.biostars.org/p/142897/); the only difference is that I used hisat2.
I am not sure what the issue is; any help is appreciated. Thanks.
Hi, did you check that the read mates had the same read name?
Alejandro
SRR1910473.1.1 99 15 40100294 1 100M = 40100297 103 CCCATATCTTCGAGGCTTTTCCCTACTTTCTCCTCTGTAAGTTTCAGTGTCTCTGGTTTTATGTGGAGTTCCTTAATCCACTTAGATTTGACCTTAGTAC CCCFFFFFHHHHGJIIJJJJJJJJJJJJJJJJJJJJJGIJIGIIJJJIHIIIJJJJBDHGIJJDHGHGDHIIJJJHHHHHGFFDF>CEEEECEEDDCCDC AS:i:0 ZS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:100 YS:i:0 YT:Z:CP XS:A:- NH:i:7
SRR1910473.1.2 147 15 40100297 1 100M = 40100294 -103 ATATCTTCGAGGCTTTTCCCTACTTTCTCCTCTGTAAGTTTCAGTGTCTCTGGTTTTATGTGGAGTTCCTTAATCCACTTAGATTTGACCTTAGTACAAG EDCDDDDDDCDBCCDEEFFFFDHHHEHIIJIJJIJJJIIIIJJIIIJIIIIIJIJIJJJJJJJJJJJJIJJJJGIJFJJJJIJJJIHFHHGHFFFDFCC@ AS:i:0 ZS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:100 YS:i:0 YT:Z:CP XS:A:- NH:i:7
SRR1910473.2.1 83 2 110304369 1 100M = 110304314 -155 TTTTACTAAGTCTGAATATACAGTCTCTGATGTACTATTGCCATAAAGTTAAAAGGCTAGAAGCTAGTCTAAACTGGAAAAATGACAAGTAAGGATGGAT EEEEEFFFFFGFHHHGHGIIIJIJIIGIJJIGHHIIGDIJIJJJJJJJJJJJJJJJJIJJJJJJJJIJJJJJJJJJJJJJJJJJJJJHHHHHFFFFFCC@ AS:i:0 ZS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:100 YS:i:0 YT:Z:CP XS:A:+ NH:i:8
SRR1910473.2.2 163 2 110304314 1 100M = 110304369 155 GGCTGGTCACACTGAAGAAAGGAGAAAAATCATTATGACTCTATTAAATGACTAATTTTACTAAGTCTGAATATACAGTCTCTGATGTACTATTGCCATA @CCFFFEFHHHHHJJJJJJIJJJJJJJJJJIIJJJJJJJJJJJJJJJJJJHIJJJJJJJJJJJJJIIJJJJJJJJJJJGHHHHHHHFFFFFFFFEEEEEE AS:i:0 ZS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:100 YS:i:0 YT:Z:CP XS:A:+ NH:i:8
SRR1910473.3.1 99 1 10809083 1 100M = 10809093 110 GTTAGAATGGCTGGTCACACGGAAGAAAGGAGAAAAATCATTATGACTCTATTAAATGACTAATTTTACTAAGTCTGAATATACAGTCTCTGATGTACTA BCCDDFFFHHHHHJHIJIJJ)3@FHIJJJIGGIGIJIIIIIJJJJJJJJJJJJJJJJIIIJIGIJJFIIJHHHHHCEFFFFFFFEECEEDDCCDCCDDDD AS:i:-2 ZS:i:-2 XN:i:0 XM:i:1 XO:i:0 XG:i:0 NM:i:1 MD:Z:20T79 YS:i:0 YT:Z:CP XS:A:- NH:i:10
SRR1910473.3.2 147 1 10809093 1 100M = 10809083 -110 CTGGTCACACTGAAGAAAGGAGAAAAATCATTATGACTCTATTAAATGACTAATTTTACTAAGTCTGAATATACAGTCTCTGATGTACTATTGCCATAAA EFFFFFFHHHHHHHJJJJJJJJJJJJJIJJJIJHGDJJJJJJIJIHGGGGJJIJIIIIIIJJJJIJIIJIIGIJJHIJJJJIJJJIHHHHHHFFFFFCCB AS:i:0 ZS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:100 YS:i:-2 YT:Z:CP XS:A:- NH:i:10
These are the first few lines of the file. The read names appear to carry the extensions .1 and .2 (typically they would be /1 and /2). Is this the issue?
Thank you.
That's likely the problem: that extension should not be there. I would remove it and then run the Python script again.
I see, thanks. So the read name should look like SRR1910473.1 instead of SRR1910473.1.1, correct?
Could you please suggest a script that makes this change in a way that is compatible with DEXSeq and does not corrupt the BAM file?
Thank you for the prompt response.
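One possible approach, offered as a minimal sketch and not as an official DEXSeq or HTSeq tool: convert the BAM to SAM text, trim the mate suffix from the read name (the first column), and convert back to BAM. The function names below (strip_mate_suffix, convert_lines) are hypothetical. The sketch assumes that every paired read carries ".1" or ".2" matching its mate bit in the SAM FLAG (0x40 for first mate, 0x80 for second), as in the records shown above, and it leaves all other lines unchanged.

```python
import sys

FIRST_MATE = 0x40   # SAM FLAG bit: first read of the pair
SECOND_MATE = 0x80  # SAM FLAG bit: second read of the pair


def strip_mate_suffix(sam_line):
    """Return the SAM line with a trailing '.1'/'.2' removed from the read name.

    The suffix is removed only when it agrees with the mate bit in the FLAG
    column, so names that merely happen to end in '.1' are less likely to be
    altered by accident.
    """
    if sam_line.startswith("@"):
        return sam_line  # header lines pass through unchanged
    fields = sam_line.split("\t")
    if len(fields) < 2:
        return sam_line  # not a recognizable alignment record
    flag = int(fields[1])
    if flag & FIRST_MATE:
        suffix = ".1"
    elif flag & SECOND_MATE:
        suffix = ".2"
    else:
        return sam_line  # neither mate bit is set
    if fields[0].endswith(suffix):
        fields[0] = fields[0][:-len(suffix)]
    return "\t".join(fields)


def convert_lines(lines):
    """Yield each SAM line with its mate suffix removed."""
    for line in lines:
        yield strip_mate_suffix(line)


# To use as a filter, add this line at the end of the script:
#     sys.stdout.writelines(convert_lines(sys.stdin))
```

Saved under an assumed name such as strip_suffix.py with the final line enabled, it could be run as: samtools view -h in.bam | python strip_suffix.py | samtools view -b -o out.bam - (in.bam and out.bam are placeholders). Because only the read name changes, the order of the alignments is the same as in the input. It is worth checking a few records of the output with samtools view before rerunning the counting script.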