Normalization in Coverage
rohan bareja ▴ 200
@rohan-bareja-4905
Hi everyone,

I am looking at coverage in certain contiguous intervals of a gene, and I have built the Views object shown below. Is there a way to normalize the reads in this object? I need that before doing any further analysis.

    ir <- IRanges(c(90645249, 90645349, 90645449, 90645549, 90645649, 90645749,
                    90645849, 90645949, 90646049, 90646149, 90646249, 90646349,
                    90646449, 90646549, 90646649, 90646749, 90646849, 90646949,
                    90647049, 90647149, 90647249, 90647349, 90647449, 90647549,
                    90647649, 90647749), width = 100)
    gr <- GRanges(seqnames = "chr4", ranges = ir, strand = "-")
    grl <- as(gr, "RangesList")
    cov4 <- cover[29]  ## subsetting coverage for chr4
    vl <- Views(cov4, grl)
    vl

    SimpleRleViewsList of length 1
    $chr4
    Views on a 191154276-length Rle subject

    views:
            start      end width
     [1] 90645249 90645348   100 [  2  10  33  41  52  81  99 117 132 157 223 ...]
     [2] 90645349 90645448   100 [1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...]
     [3] 90645449 90645548   100 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...]
     [4] 90645549 90645648   100 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...]
     [5] 90645649 90645748   100 [1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 ...]
     [6] 90645749 90645848   100 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...]
     [7] 90645849 90645948   100 [1 2 2 2 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 ...]
     [8] 90645949 90646048   100 [4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 ...]
     [9] 90646049 90646148   100 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...]
     ...      ...      ...   ... ...
    [18] 90646949 90647048   100 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...]
    [19] 90647049 90647148   100 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...]
    [20] 90647149 90647248   100 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...]
    [21] 90647249 90647348   100 [433 433 433 433 433 433 433 433 433 433 433 ...]
    [22] 90647349 90647448   100 [40 40 40 40 40 40 40 40 40 40 37 35 34 28 ...]
    [23] 90647449 90647548   100 [4 4 4 4 4 4 3 3 2 1 0 0 0 0 0 0 0 0 0 0 0 0 ...]
    [24] 90647549 90647648   100 [27 27 27 27 27 27 27 27 26 25 24 24 24 23 ...]
    [25] 90647649 90647748   100 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...]
    [26] 90647749 90647848   100 [2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 0 0 0 0 0 0 ...]

I would really appreciate any ideas.

Thanks,
Rohan
@valerie-obenchain-4275
It sounds like you want to apply a normalization function to each of the coverage views. This can be done with sapply (or lapply). If you have a function called 'normalize' that you want to apply to the coverage:

    sapply(vl, normalize)

See ?sapply, or try a simple example such as

    sapply(vl, sum)

to get an idea of how it works.

Valerie

On 11/02/2011 12:24 PM, rohan bareja wrote:
> Hi everyone,
>
> I am looking at coverage in certain continuous intervals of a gene. I have got this Views object.
> However Could I do normalization of the reads in this object which is very necessary in doing any analysis further?
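The apply-a-function-per-window pattern suggested in this answer can be tried without any real data. The following is a minimal base-R sketch, not the poster's code: a plain list of numeric vectors stands in for the coverage views, and the `normalize` function, the window values, and `total_reads` are all made up for illustration. The scaling shown (depth per million mapped reads) is only one possible choice of normalization.

```r
# Toy per-base coverage for three 10-bp windows (made-up values);
# a plain list stands in for the Views object from the question.
windows <- list(
  w1 = c(rep(0, 5), rep(2, 5)),
  w2 = c(rep(2, 5), rep(4, 5)),
  w3 = rep(1, 10)
)

# Hypothetical normalizer: rescale depth to "per million mapped reads".
normalize <- function(x, total_reads) x / (total_reads / 1e6)

total_reads <- 2e6  # assumed library size, not taken from the thread

# sapply() returns one summary value per window.
window_sums <- sapply(windows, sum)

# lapply() keeps the per-base vectors, so the normalized coverage
# can still be summarized afterwards.
normalized <- lapply(windows, normalize, total_reads = total_reads)
normalized_means <- sapply(normalized, mean)

print(window_sums)
print(normalized_means)
```

On real RleViews objects, the IRanges package also provides `viewSums()` and `viewMeans()` for per-window summaries, which may be more convenient than writing the loop by hand.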
rohan bareja ▴ 200
@rohan-bareja-4905
Hi Valerie,

That's what I wanted to do, but I don't know how to write a normalization function for coverage. Would it be something like calculating RPKMs? If yes, how should I do that?

Thanks,
Rohan
How best to normalize your data depends on the particulars of your experiment (number of replicates, high or low counts) and on what you are trying to achieve (for example, differential expression). I don't believe there is a universally accepted best way, and it is a topic of much discussion.

If you are able to work with counts instead of coverage, you could look at DESeq and edgeR, both of which offer normalization methods. The package vignettes explain the methods and assumptions in detail.

Here is a recent discussion thread on this topic. There are others.

https://stat.ethz.ch/pipermail/bioconductor/2011-May/039235.html

Valerie

On 11/09/2011 12:09 PM, rohan bareja wrote:
> Hi Valerie,
>
> Thats what I wanted to do but I dont know how to write a normalization function for Coverage..
> Would it be somewhat calculating RPKMs ??If yes how should I do that?
>
> Thanks,
> Rohan
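For the RPKM part of the question, the usual definition is the number of reads in a region divided by the region length in kilobases and by the total number of mapped reads in millions. Below is a minimal base-R sketch of that arithmetic, assuming you already have a read count per window and the library size. The function name `rpkm_from_counts` and every number in it are hypothetical. Note that summed per-base coverage is not a read count, because each read adds one to every base it covers, so the counts would need to come from counting reads that overlap each window.

```r
# RPKM = reads * 1e9 / (region width in bp * total mapped reads).
# Hypothetical helper for illustration; not part of any package.
rpkm_from_counts <- function(counts, widths, total_reads) {
  stopifnot(length(counts) == length(widths), total_reads > 0)
  counts * 1e9 / (widths * total_reads)
}

counts      <- c(50, 0, 433)  # made-up reads per window
widths      <- rep(100, 3)    # 100-bp windows, as in the question
total_reads <- 2e7            # assumed number of mapped reads

rpkm_values <- rpkm_from_counts(counts, widths, total_reads)
print(rpkm_values)
```

This only rescales for region length and sequencing depth. It does not replace the count-based normalization in DESeq or edgeR mentioned above, which may be the better route for differential expression.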