Entering edit mode
                    Hi,
        I am using countOverlaps function in GenomicFeatures package
to count how many sequencing reads are mapped to each genomic feature
(e.g. promoter region, gene region etc.). However I found that if a
read was mapped to more than one features, then the count of each
feature will be increased by 1, i.e. this read was counted more than
one times although it can only originate from one genomic location.
Below is a simple example to show this:
-------------------------
features <- GRanges(seqnames="chr1",
ranges=IRanges(c(100,500),c(1000,1500)) ,strand=c("+","+"))
features
GRanges with 2 ranges and 0 elementMetadata values
    seqnames      ranges strand |
       <rle>   <iranges>  <rle> |
[1]     chr1 [100, 1000]      + |
[2]     chr1 [500, 1500]      + |
seqlengths
 chr1
   NA
reads <- GRanges(seqnames="chr1",ranges=IRanges(c(150,600),c(250,700))
,strand=c("+","+"))
GRanges with 2 ranges and 0 elementMetadata values
    seqnames     ranges strand |
       <rle>  <iranges>  <rle> |
[1]     chr1 [150, 250]      + |
[2]     chr1 [600, 700]      + |
seqlengths
 chr1
   NA
countOverlaps(features,reads)
[1] 2 1
--------------------
I think It will be a better to have an argument in countOverlaps
function which specifies whether a single hit (eg. a random one) or
multiple hits should be returned for each query sequence.
Below is my session info:
sessionInfo()
R version 2.12.0 (2010-10-15)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base
other attached packages:
[1] GenomicFeatures_1.2.3 GenomicRanges_1.2.3   IRanges_1.8.9
loaded via a namespace (and not attached):
 [1] Biobase_2.10.0     biomaRt_2.5.1      Biostrings_2.18.2
BSgenome_1.18.3    DBI_0.2-5          RCurl_1.4-3
 [7] RSQLite_0.9-2      rtracklayer_1.10.6 tools_2.12.0
XML_3.2-0
Cheers,
Wei
______________________________________________________________________
The information in this email is confidential and
intend...{{dropped:6}}
                    
                
                