Hi Ryan,
Sorry for my late reply. We have added the following options to
featureCounts so that users can extend reads and also control
the overlap length between reads and features:
readExtension5
readExtension3
minReadOverlap
The featureCounts help page describes the meaning of these parameters
and how to use them.
Changes have been committed to bioc devel and they should be available
to you in a couple of days (Rsubread 1.15.1).
Wei
On Apr 8, 2014, at 4:58 PM, Ryan wrote:
> Hi Wei,
>
>> I'm not entirely sure what you are trying to do. But would
extending the genomic regions you use in your summarization achieve
the same effect?
>
> No, that would effectively extend both ends of each read
symmetrically. I want to keep the 5-prime position of the read the
same, but change the length. So if the effective fragment length was
set to 150, then a 100-bp read mapped in the forward direction at
position 500 would overlap a peak that starts at 625, but it would not
overlap a peak that ends at 475.
>
>> For your second request, maybe you can do a filtering after you get
the read counts, which is pretty straightforward to do?
>
> I think you've misunderstood what I'm asking here. It's kind of hard
to explain in words. I mean that currently, if there is even 1 bp of
overlap between a read and a feature, featureCounts will count it. I'm
saying that it would be nice to be able to be more stringent by
requiring more than 1 bp of overlap. E.g. require 50 bp of overlap for
a 100bp read to count it, or even count only reads that fall
completely within a feature (i.e. 100% overlap).
>
> Now that I think about it, I could implement the first request and
part of the second one if I could provide the reads in e.g. a GRanges
object or a text file that just has columns for chromosome, start,
end, and strand (or a bed file, etc.). Then I could pre-process my
reads to adjust the fragment lengths however I want. However, the
featureCounts help indicates that bam (or sam) is the only acceptable
input format. Is this correct, or is there another way to provide the
input reads?
>
> -Ryan
>
>> On Apr 8, 2014, at 11:19 AM, Ryan C. Thompson wrote:
>>
>>> Hello,
>>>
>>> I would like to request a simple feature for Rsubread's
featureCounts function that would make it more useful for ChIP-Seq
applications. I want to use featureCounts to count the number of reads
falling in each of my called peaks. However, each read represents a
DNA fragment of a specific length, which can be estimated by cross-
strand correlation analysis or known a priori. In my case, it is the
length of one nucleosome, i.e. 147 bp. So I would like to treat each
read as being 147 bp long for the purpose of computing overlaps, since
the number of bp sequenced is not representative of the fragment
length. Would it be possible to add a parameter to featureCounts to
allow this adjustment? Also, an additional feature that would be nice
to have, but is less important, would be the ability to require that a
certain percentage of a read overlaps a feature before counting it.
>>>
>>> Thanks for listening,
>>>
>>> -Ryan Thompson
>>
>>
______________________________________________________________________
>> The information in this email is confidential and intended solely
for the addressee.
>> You must not disclose, forward, print or use it without the
permission of the sender.
>>
______________________________________________________________________