reads alignment statistics
2
1
Entering edit mode
C T ▴ 140
@c-t-5858
Last seen 10 months ago
United States

Hello...

Is there an R package that can calculate alignment statistics from bam/sam files? By alignment statistics I mean number of reads, number of reads uniquely mapped, number of reads mapped to multiple locations.

I found rsubread package can count number of  reads that can be mapped. However, it doesn't differentiate between unique alignment and multiple alignment.

Thanks in advance!

rsubread bioconductor • 2.3k views
ADD COMMENT
2
Entering edit mode
@james-w-macdonald-5106
Last seen 1 hour ago
United States

Yes, countBam in Rsamtools will do that sort of thing. See in particular ?ScanBamParam for filtering what you do and do not count.

ADD COMMENT
0
Entering edit mode

Thanks for the hints on countBam. However, I still cannot get the number of uniquely mapped reads.

I thought the following should return number of multiple mapped reads. However, it returns 0 when I know there are multiple mapped reads. bowtie2 shows ~14 millions reads

file='input.bam'

countBam(file,index=file,param=ScanBamParam(flag=scanBamFlag(isSecondaryAlignment=TRUE))

I was able to get the number of unmapped correctly using isUnmappedQuery=TRUE

 

 

ADD REPLY
1
Entering edit mode

Works for me:

samtools flagstat 173342.accepted_hits.merged.markeddups.recal.bam
82461241 + 0 in total (QC-passed reads + QC-failed reads)
24242590 + 0 secondary

> z <- countBam("173342.accepted_hits.merged.markeddups.recal.bam", param = ScanBamParam(flag = scanBamFlag(isSecondaryAlignment = TRUE)))

> z$records
[1] 24242590
ADD REPLY
1
Entering edit mode

Thanks for your help. I figured it out.

I ran bowtie retaining only unique mapped reads. Therefore, my bam file does not contain multiple mapped reads. Duh!

James, thanks again for your help.

ADD REPLY
1
Entering edit mode
@herve-pages-1542
Last seen 23 hours ago
Seattle, WA, United States

Hi,

The quickBamFlagSummary() function in Rsamtools will display various stats about the FLAG field. For example:

> quickBamFlagSummary(bamfile)
                                 |          |         | mean / max
                           group |    nb of |   nb of | records
                              of |  records |  unique | per unique
                         records | in group |  QNAMEs | QNAME
All records................... A |     3307 |    1699 | 1.95 / 2
 o template has single segment S |        0 |       0 |   NA / NA
 o template has multple sgmnts M |     3307 |    1699 | 1.95 / 2
     - first segment.......... F |     1654 |    1654 |    1 / 1
     - last segment........... L |     1653 |    1653 |    1 / 1
     - other segment.......... O |        0 |       0 |   NA / NA

Note that (S, M) is a partitioning of A, and (F, L, O) is a
partitioning of M. Indentation reflects this.

Details for group M:
  o record is mapped......... M1 |     3271 |    1699 | 1.93 / 2
      - primary alignment.... M2 |     3271 |    1699 | 1.93 / 2
      - secondary alignment.. M3 |        0 |       0 |   NA / NA
  o record is unmapped....... M4 |       36 |      36 |    1 / 1

Details for group F:
  o record is mapped......... F1 |     1641 |    1641 |    1 / 1
      - primary alignment.... F2 |     1641 |    1641 |    1 / 1
      - secondary alignment.. F3 |        0 |       0 |   NA / NA
  o record is unmapped....... F4 |       13 |      13 |    1 / 1

Details for group L:
  o record is mapped......... L1 |     1630 |    1630 |    1 / 1
      - primary alignment.... L2 |     1630 |    1630 |    1 / 1
      - secondary alignment.. L3 |        0 |       0 |   NA / NA
  o record is unmapped....... L4 |       23 |      23 |    1 / 1

This output shows that the reads are paired-end. This file does not contain multiple mapped reads because the number of secondary alignments is 0. This is confirmed by the max records per unique QNAME equal to 1 for the F and L groups.

H.

ADD COMMENT

Login before adding your answer.

Traffic: 747 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6