isSecondaryAlignment=FALSE with BWA MEM mapped BAM files not working?
1
0
Entering edit mode
sbcn ▴ 70
@sbcn-4752
Last seen 1 day ago
Spain

Hello everyone,

I am trying to retrieve uniquely mapped reads from a BAM file (paired-end) mapped with BWA MEM, with the scanBam function from Rsamtools:

# setting isSecondaryAlignment=FALSE
param <- ScanBamParam(flag=scanBamFlag(isUnmappedQuery=FALSE, isSecondaryAlignment=FALSE), what=c("qname", "pos", "qwidth", "rname"))

# scanning BAM


The problem is that there remain some secondary alignments:

# testing whether some read IDs are present more than twice (because it is paired-end data, we expect them to be present exactly 2 times if only uniquely mapped reads remain):
any(table(aln\$qname) > 2)
# TRUE


I read that I could also use the "mapqFilter" in ScanBamParam, but I also read that it might not be the most reliable way to proceed. Also, I am sure which MAPQ filter to use with BWA MEM...

The following is working with samtools (outside of R), which I found in this post:

samtools view -F 2048 -bo filtered.bam original.bam


Is there a way to do the same think with Rsamtools, i.e. using the appropriate SAM flag (2048 corresponds to any secondary alignment).

Either that, or someone can advice a robust alternative solution to this problem (or point me to something I might be doing wrong)?

Thank you!

bwa Rsamtools • 76 views
2
Entering edit mode
Robert Castelo ★ 2.7k
@rcastelo
Last seen 2 days ago
Barcelona/Universitat Pompeu Fabra

Hi,

To use the supplementary alignment flag, simply add isSupplementaryAlignment=FALSE to your call to scanBamFlag(). Other flags that you may want to consider adding are:

• isUnmappedQuery=FALSE
• isProperPair=TRUE
• isDuplicate=FALSE
• isNotPassingQualityControls=FALSE

You may consult the help page of scanBamFlag for full details on how to use these and other flags.

cheers,

robert.

0
Entering edit mode

Thank you, Robert! So if I want to retrieve uniquely mapped reads only, it is correct to set: isUnmappedQuery=FALSE + isSupplementaryAlignment=FALSE + isSecondaryAlignment=FALSE (I also want to keep those reads that don't map as a proper pair)?
I am still a bit confused with the secondary versus supplementary alignments: do supplementary alignments include secondary alignments?
In case of keeping only uniquely mapped reads, I suppose it is correct to remove both.

0
Entering edit mode

Well, how to retrieve uniquely mapped reads only, using BWA, is more tricky than it sounds, and I guess the most authoritative answer to date is this one by Heng Li, the author of BWA, where, in essence, he argues that the concept itself of uniquely mapable read is difficult to define and if you can't define it precisely, then it becomes even harder to measure it and recommends using mapping qualities. I'm not a BWA expert user, but Googling a bit you can find the following relevant Biostars hits for you to read about it here, here and here.

The difference between secondary and supplementary alignments you can find it described in the SAM specifications here and essentially, while a secondary alignment is an alignment of a read that, according to the read mapper, maps to a secondary location in the genome, a supplementary alignment is a part of a read that maps to a location of the genome distant to the other part, as a result of, for instance, some structural variation. If you want to work with primary alignments, then yes, I'd suggest to set those flags you mentioned to FALSE.

0
Entering edit mode

Thanks again! I found the same posts, it's all a little confusing. Nevertheless, I think those flags will do just find for the time being, for the analysis I want to perform in R. Cheers