Using merged alignments of PE readlists for DEXSeq
arom2 • 0
@arom2-8204
Last seen 7.6 years ago
United States

To Whom It May Concern,

I have a question about the script 'dexseq_count.py' from the DEXSeq package. My sequences come from a paired-end sequencing project, and each of my SAM files is a merge of three sorted alignment files per sample. Adapter/quality trimming produces three read lists for each sample: one large list with the interleaved R1 and R2 mates, one list of reads for which only R1 survived, and one list of reads for which only R2 survived. I aligned all three read lists separately with bowtie2 and then merged the alignments, so that the merged file uses as much information from each sample as possible: both the intact read pairs and the orphaned-mate alignments.

Will this cause any issues when using the count script with the argument “-p yes”? Will the program count all the reads equally, or will it account for the orphaned mates appropriately? If not, would you recommend discarding the orphaned reads from the analysis? (They are usually less than 3% of the total reads for each sample.)

Thank you in advance, arom

dexseq paired rnaseq samtools merge • 1.5k views
Alejandro Reyes ★ 1.9k
@alejandro-reyes-5124
Last seen 3 months ago
Novartis Institutes for BioMedical Rese…

Hi @arom2,

If you input only one mate into the aligner, the aligner will think these are single-end reads. The corresponding file will have the flags from a single-end mapping and this will very likely cause problems if you specify "-p yes" with the script dexseq_count.py.
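One way to see this in a merged file is to look at the FLAG column directly. The sketch below tallies how many alignment records have the paired bit (0x1 in the SAM specification) set. It is only an illustration: the function name `tally_paired_flags` and the example records are hypothetical, and it parses plain SAM text rather than going through any DEXSeq or HTSeq code.

```python
from collections import Counter
from typing import Iterable

PAIRED_FLAG = 0x1  # SAM FLAG bit: the read is part of a pair


def tally_paired_flags(sam_lines: Iterable[str]) -> Counter:
    """Count alignment records by whether the paired bit (0x1) is set."""
    tally = Counter()
    for line in sam_lines:
        # Skip blank lines and header lines (headers start with "@").
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # FLAG is the second mandatory SAM column
        tally["paired" if flag & PAIRED_FLAG else "unpaired"] += 1
    return tally


# Hypothetical records: both mates of one pair (flags 99 and 147)
# and one orphan that was aligned as a single-end read (flag 0).
example = [
    "@HD\tVN:1.0\tSO:coordinate",
    "r1\t99\tchr1\t100\t42\t50M\t=\t200\t150\t*\t*",
    "r1\t147\tchr1\t200\t42\t50M\t=\t100\t-150\t*\t*",
    "r2\t0\tchr1\t300\t42\t50M\t*\t0\t0\t*\t*",
]
print(tally_paired_flags(example))
```

If a merged file shows both "paired" and "unpaired" records, it mixes paired-end and single-end flags, which is the situation described above.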

Something worth considering is that this script counts fragments, not reads. Thus, a pair of mated reads that come from the same sequenced fragment will be counted only once. If you count R1 and R2 separately when both come from the same fragment, you might double count some fragments. If you are sure this does not happen, I would suggest running the script on the mated read alignments with "-p yes" and on the unmated read alignments with "-p no", and then summing the counts.
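The final summing step could be sketched roughly as follows. This assumes that each count file has two tab-separated columns, a feature ID and an integer count, and that any summary lines at the end of the file follow the same layout and can be added in the same way. The function names and the example IDs are hypothetical. It is only valid if no fragment appears in both inputs.

```python
from typing import Dict, Iterable


def read_counts(lines: Iterable[str]) -> Dict[str, int]:
    """Parse 'feature<TAB>count' lines into a dict (assumed file layout)."""
    counts: Dict[str, int] = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        feature, value = line.split("\t")
        counts[feature] = int(value)
    return counts


def sum_counts(paired: Dict[str, int], unpaired: Dict[str, int]) -> Dict[str, int]:
    """Add the counts per feature; a feature missing from one input counts as 0."""
    total = dict(paired)
    for feature, value in unpaired.items():
        total[feature] = total.get(feature, 0) + value
    return total


# Hypothetical contents of the "-p yes" and "-p no" count files.
paired_lines = ["geneA:001\t10", "geneA:002\t4", "_empty\t3"]
unpaired_lines = ["geneA:001\t1", "geneA:002\t0", "_empty\t2"]

combined = sum_counts(read_counts(paired_lines), read_counts(unpaired_lines))
for feature, value in combined.items():
    print(f"{feature}\t{value}")
```

In practice the two lists of lines would come from `open(path)` on the two count files, and the combined table would be written out in the same two-column layout.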

Alejandro


Thank you, that is exactly what I was worried about. 
