Question

Rsubread subjunc reporting 255 kb introns with exon limits mapping to A-rich region

Cosmin Saveanu • 0

@aa83cbd1

Last seen 2.5 years ago

France

Hello. I'm using Rsubread to align short, stranded Illumina RNA-seq reads to the yeast (S. cerevisiae) genome. The BAM file generated by subjunc contains quite a large number of reads that are erroneously split across very long distances, with the gap reported as an intron. One of the mapped ends often falls in a stretch of "A"s. Is there an option to limit the size of detected introns, or to filter out reads that are partially mapped to low-complexity genomic regions?

Thank you very much.

The test command I used (with an index built from the Ensembl yeast genome and the corresponding annotation file):


subjunc(index = "yeast110",
        readfile1 = "RAW/my.fastq.gz",
        output_file = "BAM/my.bam",
        output_format = "BAM",
        nthreads = 8,
        sortReadsByCoordinates = TRUE,
        annot.ext = "Saccharomyces_cerevisiae.R64-1-1.110.gtf.gz",
        isGTF = TRUE,
        useAnnotation = TRUE)

EDIT: since STAR has a specific parameter for setting the maximum acceptable intron size, I'm switching back to it. I liked the idea of doing the whole workflow in R, but switching to a few bash commands is not a big problem.
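For anyone who prefers to stay in R, one possible workaround is to post-filter the alignments by the length of the longest skipped region ("N" operation) in each read's CIGAR string. The sketch below is base R only and works on a plain character vector of CIGAR strings; max_intron_length is a hypothetical helper, not an Rsubread function, and the 2000-base cutoff is an arbitrary assumption to be adjusted for your data. Reading the CIGAR strings from the BAM file and writing the filtered file would need an additional package and is not shown.

```r
# Hypothetical helper (not part of Rsubread): length in bases of the longest
# 'N' (skipped region) operation in each CIGAR string; 0 for unspliced reads.
max_intron_length <- function(cigar) {
  vapply(cigar, function(cg) {
    # Unmapped reads may carry a missing CIGAR
    if (is.na(cg)) return(NA_real_)
    # Extract every "<length>N" token, e.g. "255000N"
    tokens <- regmatches(cg, gregexpr("[0-9]+N", cg))[[1]]
    if (length(tokens) == 0) return(0)
    max(as.numeric(sub("N", "", tokens, fixed = TRUE)))
  }, numeric(1), USE.NAMES = FALSE)
}

# Toy CIGAR strings: unspliced, one 255 kb "intron", two short introns
cigars <- c("50M", "20M255000N30M", "10M300N20M90N20M")
max_intron <- max_intron_length(cigars)

# Assumed cutoff (arbitrary, adjust as needed): keep reads whose longest
# skipped region is at most 2000 bases
max_allowed <- 2000
keep <- !is.na(max_intron) & max_intron <= max_allowed
print(data.frame(cigar = cigars, max_intron = max_intron, keep = keep))
```

This only removes the long-range junction reads after alignment; it does not change how subjunc detects junctions, so junction counts reported by subjunc itself would still include them.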

Rsubread • 761 views

Updated 2.5 years ago by Gordon Smyth • written 2.5 years ago by Cosmin Saveanu