Question: How can I tell if an RNA-seq BAM file is paired-end or single-end?
stephen66 wrote (2.9 years ago):

Dear All,

I have some RNA-seq BAM files left by previous labmates. Is there a quick way to tell whether these BAM files are paired-end or not? Many thanks!

Stephen

Sonali Arora wrote (2.9 years ago):

Hi Stephen,

The devel version of Rsamtools has a new function, testPairedEndBam, which returns TRUE if the BAM file contains paired-end reads and FALSE otherwise.

library(Rsamtools)

fl <- system.file("extdata", "ex1.bam", package="Rsamtools")

testPairedEndBam(fl)
[1] TRUE

packageVersion("Rsamtools")

[1] ‘1.19.23’
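Since the question is about several files, here is a minimal sketch of applying the same check to a vector of paths. The helper name check_paired is made up for illustration, and the demo uses the example file shipped with Rsamtools, which has an index next to it; I have not checked how your own files behave if they lack a .bai index.

```r
library(Rsamtools)

# Hypothetical helper (not part of Rsamtools): run testPairedEndBam()
# on each path and return a named logical vector.
check_paired <- function(bam_paths) {
  vapply(bam_paths, testPairedEndBam, logical(1))
}

# Demo on the example file shipped with Rsamtools. For real data, build the
# vector of paths yourself, e.g. with
# list.files("your/bam/dir", pattern = "\\.bam$", full.names = TRUE)
fl <- system.file("extdata", "ex1.bam", package = "Rsamtools")
paired <- check_paired(fl)
paired
```

The result is named by file path, so it should be easy to see which files are paired-end.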

- Sonali

 

Martin Morgan wrote (2.9 years ago):

Load Rsamtools

library(Rsamtools)

Point to a file (a plain path is enough for quickBamFlagSummary)

## your file here!
fl = system.file(package="Rsamtools", "extdata", "ex1.bam")

And summarize

quickBamFlagSummary(fl)

My original, longer answer was to mark the file as a 'BamFile', query it for sequence ranges, and choose a not-too-large chromosome

bfl = BamFile(fl)
seq = as(seqinfo(bfl), "GRanges")[1]

Input the reads on that chromosome, specifying that you'd like the 'flag' field in addition to the standard coordinates

library(GenomicAlignments)
aln <- readGAlignments(bfl, param=ScanBamParam(which=seq[1], what="flag"))

Tally the different flags, looking for lots of properly paired reads

colSums(bamFlagAsBitMatrix(mcols(aln)$flag))
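To make it clearer what that tally is counting, here is a rough base R sketch that decodes the pairing-related bits of the SAM flag by hand. The bit values (0x1, 0x2, 0x40, 0x80) come from the SAM specification; the helper name tally_pairing_bits and the example flags are made up for illustration, and the element names are chosen to mirror the column names of bamFlagAsBitMatrix.

```r
# SAM flag bits relevant to pairing (values from the SAM specification)
pairing_bits <- c(isPaired = 0x1, isProperPair = 0x2,
                  isFirstMateRead = 0x40, isSecondMateRead = 0x80)

# Hypothetical helper: for each pairing bit, count the flags that have it set
tally_pairing_bits <- function(flags) {
  vapply(pairing_bits,
         function(bit) sum(bitwAnd(flags, bit) != 0),
         integer(1))
}

# Made-up flags: four from properly paired reads (99, 147, 83, 163)
# and two from unpaired reads (0, 16)
flags <- c(99L, 147L, 83L, 163L, 0L, 16L)
tally <- tally_pairing_bits(flags)
tally
```

In a single-end file the isPaired count would be zero; in a paired-end file it should be close to the total number of records.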
James W. MacDonald wrote (2.9 years ago):

Depends on the platform. For Illumina, it's usually a set of files like:

sample_1.1.fastq
sample_1.2.fastq

And if you look at the data inside you will see:

head -n 4 CTATCGCT.AGGATAGG_8.1.fastq
@C5NM5ACXX:8:1101:10000:16673/1
GTCACAGGTCTTGCATAGGTAAACTACTTGGAGGTCAGGGGCTACGTGGT
+
CCCFFFFFFFDHGIJIJIJIFIHJJJIIIIJFIDGHIIHGIIHJIJGIIC
head -n 4 CTATCGCT.AGGATAGG_8.2.fastq
@C5NM5ACXX:8:1101:10000:16673/2
CGGCTTTATTCTTCTATAGTAAGTCTCCCCTCTTTACTGGGGAGGGGGGG
+
CCCFFFFFHGHHGJIJJJJJHIIJJIJIGJJIJIJIJJIIIIHIIIIJDD

The @ header lines of the two files are identical except for the trailing /1 and /2, which mark the first and second read of each pair.
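If you would rather check this programmatically, a small base R sketch along these lines compares two header lines. The helper name same_template is made up for illustration, and it only handles the /1 and /2 suffix convention shown above; newer Illumina headers put the read number after a space instead, which this sketch does not cover.

```r
# Hypothetical helper: TRUE when two FASTQ header lines name the same
# template, i.e. they are identical once a trailing /1 or /2 is dropped,
# and the first ends in /1 while the second ends in /2.
same_template <- function(header1, header2) {
  strip_mate <- function(h) sub("/[12]$", "", h)
  strip_mate(header1) == strip_mate(header2) &
    grepl("/1$", header1) & grepl("/2$", header2)
}

# The two header lines from the files above
is_pair <- same_template("@C5NM5ACXX:8:1101:10000:16673/1",
                         "@C5NM5ACXX:8:1101:10000:16673/2")
is_pair
```

The function is vectorized, so it can be given the full vectors of header lines from both files (for example, every fourth line read with readLines).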

 

Hervé Pagès wrote (2.9 years ago):

Hi Stephen,

library(Rsamtools)
quickBamFlagSummary(bamfile, main.groups.only=TRUE)
                                group |    nb of |    nb of | mean / max
                                   of |  records |   unique | records per
                              records | in group |   QNAMEs | unique QNAME
All records........................ A |     3307 |     1699 | 1.95 / 2
  o template has single segment.... S |        0 |        0 |   NA / NA
  o template has multiple segments. M |     3307 |     1699 | 1.95 / 2
      - first segment.............. F |     1654 |     1654 |    1 / 1
      - last segment............... L |     1653 |     1653 |    1 / 1
      - other segment.............. O |        0 |        0 |   NA / NA

Note that (S, M) is a partitioning of A, and (F, L, O) is a partitioning of M.
Indentation reflects this.

Look at the number of records in groups S and M: 0 and 3307 here, so the file is paired-end.
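The decision rule can be written down as a tiny base R sketch. The helper name classify_layout and the labels it returns are made up for illustration; the two arguments are the record counts you read off the S and M rows of the summary.

```r
# Hypothetical helper: classify a BAM file from the number of records in
# group S (template has single segment) and group M (multiple segments).
classify_layout <- function(n_single, n_multi) {
  if (n_single == 0 && n_multi > 0) {
    "paired-end"
  } else if (n_single > 0 && n_multi == 0) {
    "single-end"
  } else if (n_single > 0 && n_multi > 0) {
    "mixed"
  } else {
    "empty"
  }
}

# Counts from the summary above: S = 0, M = 3307
layout <- classify_layout(0, 3307)
layout
```

A "mixed" result would mean the file contains both kinds of records and deserves a closer look.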

Cheers,

H.
