Question

how to set strandMode in summarizeOverlaps()?

0

Entering edit mode

lizd • 0

@lizd-22727

Last seen 4.3 years ago

i have strand-specific(dUTP) RNAseq reads, and i count it use GenomicAlignments::summarizeOverlaps() with the paramters below:

flag <- scanBamFlag(isSecondaryAlignment = FALSE, isNotPassingQualityControls = FALSE, isUnmappedQuery = FALSE) sbp <- ScanBamParam(flag=flag, mapqFilter = 255)

geneCnt <- summarizeOverlaps(features = ebg, mode = "Union", reads = SE1bamlist, ignore.strand = FALSE, inter.feature = FALSE, singleEnd = TRUE, strandMode = 2, param = sbp, preprocess.reads = NULL)

but i got an error: Error in FUN(bf, param = param, ...) : unused argument (strandMode = 2) Calls: summarizeOverlaps ... tryCatch -> tryCatchList -> tryCatchOne -> <anonymous>

so, can someone tell me how to set strand-specific parameters?

thank you!

software error strand-specific • 729 views

ADD COMMENT • link updated 4.3 years ago by Robert Castelo ★ 3.2k • written 4.3 years ago by lizd • 0

score 0 · Answer 1 · 2020-01-20

0

Entering edit mode

Robert Castelo ★ 3.2k

@rcastelo

Last seen 7 hours ago

Barcelona/Universitat Pompeu Fabra

hi,

you have not provided your session information but it may be that you're using an old version of Bioconductor and, more specficifically, of the GenomicAlignments package. As shown in this thread, a couple of years back it was not possible to pass the option strandMode to summarizeOverlaps(), but this was fixed and this is now possible since a couple of Bioconductor releases.

by the way, the comment by user ATpoint is wrong, not only about the fact that the strandMode option exists, but also about the fact that the infrastructure provided by the GenomicAlignments package and, more concretely, when using the function summarizeOverlaps(), does not require loading an entire BAM file into R.

let's say you have a number of BAM files in your current path from which you want to calculate your table of counts, you can create a BAMFileList object that will tell the engine reading each BAM file to stream over it file in chunks of size, let's say 1M reads:

files <- list.files(path=".", pattern="[0-9].bam$", full.names=TRUE)
bfl <- BamFileList(files, asMates=TRUE, yieldSize=1000000)

here i'm also indicating that i'm dealing with paired-end reads (asMates=TRUE). then you can pass the bfl object as reads argument to the summarizeOverlaps() function.

cheers,

robert.

ADD COMMENT • link 4.3 years ago Robert Castelo ★ 3.2k

1

Entering edit mode

To add to Robert's answer, the strandMode argument isn't actually an argument to summarizeOverlaps, but instead is an argument for readGAlignmentPairs, which is the underlying function that reads in the data, and is passed in via the ellipsis ... argument.

There isn't a strandMode argument for readAlignments, which is the function that is called when you specify singleEnd = TRUE, because it doesn't make sense in that context. From ?strandMode:

 strandMode(x) ,  strandMode(x) <- value : The _strand mode_ is a
          per-object switch on GAlignmentPairs objects that controls
          the behavior of the  strand  getter.  More precisely, it
          indicates how the strand of a pair should be inferred from
          the strand of the first and last alignments in the pair:

              0: strand of the pair is always *.

              1: strand of the pair is strand of its first alignment.
              This mode should be used when the paired-end data was
              generated using one of the following stranded protocols:
              Directional Illumina (Ligation), Standard SOLiD.

              2: strand of the pair is strand of its last alignment.
              This mode should be used when the paired-end data was
              generated using one of the following stranded protocols:
              dUTP, NSR, NNSR, Illumina stranded TruSeq PE protocol.

If you have single-end data, there is no need to infer the strand, because the strand is obvious, given the strand that the read aligns to.

ADD REPLY • link 4.2 years ago James W. MacDonald 65k

0

Entering edit mode

Thanks for your reply, but I still confused that whether strand-specific SINGLE-END reads can be counted like PAIRED reads by GenomicAlignments? There is no limitation of library layout(SINGLE/PAIRED) in Rsubread.

in ?strandMode: These modes are equivalent to strandSpecific equal 0, 1, and 2, respectively, for the featureCounts function defined in the Rsubread package.

in ?Rsubread: Indicate if strand-specific read counting should be performed. A single integer value (applied to all input files) or a string of comma-separated values (applied to each corresponding input file) should be provided. Possible values include: 0 (unstranded), 1 (stranded) and 2 (reversely stranded). Default value is 0 (ie. unstranded read counting carried out for all input files). For paired-end reads, strand of the first read is taken as the strand of the whole fragment. FLAG field is used to tell if a read is first or second read in a pair. Value of isStrandSpecific parameter in Rsubread featureCounts is a vector which has a length of either 1, or the same with the total number of input files provided.

PS: my BiocManager version "1.30.10"

Thank you!

ADD REPLY • link 4.2 years ago lizd • 0