Question

featureCounts read summarization function

Wei Shi ★ 3.6k

Hi Sheng,

Please keep your post on the list.

This is a rather arbitrary choice. You may play with different cutoffs to see what difference they make for your data, but a cutoff of 5 or less seems quite low to me. I don't really think genes with only 5 reads or fewer are of interest.

Best wishes,
Wei

On Jul 6, 2013, at 7:38 AM, Sheng Zhao wrote:

> Hi Wei,
>
> I have one question about the case study at http://bioinf.wehi.edu.au/RNAseqCaseStudy/.
>
> In this example, you filtered genes with less than 10 reads per million mapped reads. Is there any special reason for this setting? Or why not 5, or 2?
>
> Thank you for your help and time.
>
> Regards,
> Sheng
>
> On Tue, Jul 2, 2013 at 2:02 AM, Wei Shi <shi@wehi.edu.au> wrote:
> Dear All,
>
> I would like to formally introduce to you the featureCounts function included in the Rsubread package. featureCounts is an R function designed for summarizing sequencing reads to genomic features such as genes, exons and promoters. It is a light-weight general-purpose read counting program (essentially written in C), and it has the following features:
>
> (1) It performs precise read assignments by taking care of indels, junctions and fusions in the reads.
> (2) It takes less than 4 minutes to summarize 20 million pairs of reads to 26k RefSeq genes using one thread, and only uses 40MB of memory (you can even run it on a Mac laptop).
> (3) It supports multi-threaded running.
> (4) It supports GTF format annotation and SAM/BAM read data.
> (5) It supports strand-specific read summarization.
> (6) It can perform read summarization at both feature level (e.g. exons) and meta-feature level (e.g. genes).
> (7) It allows users to specify whether reads overlapping with more than one feature should be counted or not.
> (8) It gives users full control over the summarization of paired-end reads, including allowing them to check if both ends are mapped and/or if the paired-end distances satisfy the distance criteria.
> (9) It distinguishes features overlapped by both ends of the same fragment from those overlapped by only one end, so as to get more fragments counted.
> (10) It allows users to specify whether chimeric fragments should be counted.
> (11) It can exclude multi-mapping reads and reads with low mapping quality scores from summarization.
>
> To use this function, make sure you are using the latest version of Rsubread (1.10.5 in the release branch).
>
> A technical report for featureCounts can be found at http://arxiv.org/abs/1305.3347. You may also refer to the Rsubread users guide for some details about this function (type 'RsubreadUsersGuide()' in your R session).
>
> To see how featureCounts can be used in an RNA-seq analysis pipeline, you may have a look at this case study: http://bioinf.wehi.edu.au/RNAseqCaseStudy. This case study will also be used in a workshop at the upcoming Bioc2013 meeting.
>
> Hope you find it useful.
>
> Best wishes,
>
> Wei
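
The suggestion above to try different cutoffs can be sketched in base R along the following lines. This is only an illustration, not the code from the case study: the count matrix is made up, `keep_genes` is a hypothetical helper, and the rule that a gene must exceed the cutoff in at least two libraries is an assumption made for this sketch.

```r
# Toy counts (made-up numbers): rows are genes, columns are libraries.
genes <- rbind(
  geneA = c(  3,   0,   1,   4),
  geneB = c(  4,   6,   3,  12),
  geneC = c(  8,  11,  12,   9),
  geneD = c(150, 140, 160, 155)
)
colnames(genes) <- paste0("lib", 1:4)

# Pool all remaining reads into one placeholder row so that every
# library has exactly one million mapped reads.
counts <- rbind(genes, other = 1e6 - colSums(genes))

# Counts per million: divide each column by its library size.
lib.size <- colSums(counts)
cpm <- t(t(counts) / lib.size) * 1e6

# Hypothetical helper: names of the rows whose CPM exceeds `cutoff`
# in at least `min.libs` libraries (min.libs = 2 is an assumption).
keep_genes <- function(cpm, cutoff, min.libs = 2) {
  rownames(cpm)[rowSums(cpm > cutoff) >= min.libs]
}

# Compare how many rows survive each candidate cutoff.
cutoffs <- c(2, 5, 10)
kept <- lapply(cutoffs, function(k) keep_genes(cpm, k))
names(kept) <- paste0("cpm>", cutoffs)
n.kept <- sapply(kept, length)
print(n.kept)
```

On real data, `counts` would be the count matrix returned by featureCounts. Comparing the sets in `kept` shows which genes are gained or lost as the cutoff moves, which is one way to judge whether a lower threshold adds anything of interest.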

Sequencing Annotation Rsubread
