featureCounts read summarization function
Wei Shi ★ 3.6k
@wei-shi-2183
Australia/Melbourne
Hi Sheng,

Please keep your post on the list.

This is a rather arbitrary choice. You may play with different cutoffs to see what difference it makes for your data, but a cutoff of 5 or less seems quite low to me. I don't really think genes with only 5 reads or fewer are of interest.

Best wishes,
Wei

On Jul 6, 2013, at 7:38 AM, Sheng Zhao wrote:

> Hi Wei,
>
> I have one question about the case study at http://bioinf.wehi.edu.au/RNAseqCaseStudy/.
>
> In this example, you filtered out genes with fewer than 10 reads per million mapped reads. Is there any special reason for this setting? Why not 5, or 2?
>
> Thank you for your help and time.
>
> Regards,
> Sheng
>
> On Tue, Jul 2, 2013 at 2:02 AM, Wei Shi <shi@wehi.edu.au> wrote:
> Dear All,
>
> I would like to formally introduce to you the featureCounts function included in the Rsubread package. featureCounts is an R function designed for summarizing sequencing reads to genomic features such as genes, exons and promoters. It is a light-weight general-purpose read counting program (essentially written in C), and it has the following features:
> (1) It performs precise read assignments by taking care of indels, junctions and fusions in the reads.
> (2) It takes less than 4 minutes to summarize 20 million pairs of reads to 26k RefSeq genes using one thread, and only uses 40MB of memory (you can even run it on a Mac laptop).
> (3) It supports multi-threaded running.
> (4) It supports GTF format annotation and SAM/BAM read data.
> (5) It supports strand-specific read summarization.
> (6) It can perform read summarization at both feature level (e.g. exons) and meta-feature level (e.g. genes).
> (7) It allows users to specify whether reads overlapping with more than one feature should be counted or not.
> (8) It gives users full control over the summarization of paired-end reads, including allowing them to check if both ends are mapped and/or if the paired-end distances satisfy the distance criteria.
> (9) It distinguishes features that are overlapped by both ends of the same fragment from those overlapped by only one end, so that more fragments can be counted.
> (10) It allows users to specify whether chimeric fragments should be counted.
> (11) It can exclude multi-mapping reads and reads with low mapping quality scores from summarization.
>
> To use this function, make sure you are using the latest version of Rsubread (1.10.5 in the release branch).
>
> A technical report for featureCounts can be found here - http://arxiv.org/abs/1305.3347. You may also refer to the Rsubread users guide for some details about this function (type 'RsubreadUsersGuide()' in your R session).
>
> To see how featureCounts can be used in an RNA-seq analysis pipeline, you may have a look at this case study - http://bioinf.wehi.edu.au/RNAseqCaseStudy . This case study will also be used in a workshop at the upcoming Bioc2013 meeting.
>
> Hope you find it useful.
>
> Best wishes,
>
> Wei
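
To make the cutoff question at the top of this thread concrete, here is a minimal sketch of a reads-per-million filter. It is written in plain Python purely for illustration and is not the code from the case study. The function name `filter_by_cpm`, the `min_samples` parameter, and the toy counts are all assumptions introduced here. Trying a few values of `min_cpm` on your own count table is one way to see how much the choice matters.

```python
def filter_by_cpm(counts, library_sizes, min_cpm=10.0, min_samples=1):
    """Keep genes whose counts-per-million reach min_cpm in enough samples.

    counts        : dict mapping gene id -> list of raw read counts (one per sample)
    library_sizes : list of total mapped reads per sample, same order as the counts
    min_cpm       : cutoff in reads per million mapped reads (10 in the question above)
    min_samples   : how many samples must reach the cutoff (hypothetical parameter)
    """
    kept = {}
    for gene, row in counts.items():
        # Convert each raw count to counts-per-million for its own sample.
        cpm_row = [c * 1e6 / n for c, n in zip(row, library_sizes)]
        n_pass = sum(1 for value in cpm_row if value >= min_cpm)
        if n_pass >= min_samples:
            kept[gene] = row
    return kept


# Toy data (made up for illustration): three genes, two samples.
toy_counts = {
    "geneA": [500, 600],
    "geneB": [3, 4],
    "geneC": [100, 5],
}
toy_library_sizes = [20_000_000, 25_000_000]

strict = filter_by_cpm(toy_counts, toy_library_sizes, min_cpm=10.0)
lenient = filter_by_cpm(toy_counts, toy_library_sizes, min_cpm=2.0)
print(sorted(strict), sorted(lenient))
```

With these toy numbers, lowering the cutoff from 10 to 2 lets one extra gene through, and that gene is supported by only 100 reads in a library of 20 million. This is the kind of comparison the reply suggests making on real data.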
Sequencing Annotation Rsubread