countOverlaps() only count positive strand (from GenomicRanges example)
2
0
Entering edit mode
Song Li ▴ 60
@song-li-4383
Last seen 9.7 years ago
Hi All, I was following the example in the "GenomicRanges Use Cases", section 3, "Simple RNA-seq Analysis", by "copy-paste" command from the document from page 7 to page 8: What I found out is that countOverlap() only count reads aligned to positive strand. There are 28 reads map to region GRList[6645], 13 on positive strand, 15 on negative strand. The "count[6645]" is 13. Is there a way to count both strand? Thanks, Song Li Here are all the code: #-----> Start with code from the package description<-------# > library(Rsamtools) > testFile <- system.file("bam", "isowt5_13e.bam", package = "leeBamViews") > aligns <- readBamGappedAlignments(testFile) > library(GenomicFeatures) > txdb <- makeTranscriptDbFromUCSC(genome = "sacCer2", tablename = "sgdGene") > exonRanges <- exonsBy(txdb, "tx") > length(exonRanges) > levels(rname(aligns)) <- c(paste("chr", as.roman(1:16), + sep = ""), "chrM") > counts <- countOverlaps(exonRanges, aligns) #-----> find alignments that map to GRList$6645<-------# > aligns[705:732] GappedAlignments of length 28 rname strand cigar qwidth start end width ngap [1] chrXIII + 36M 36 804443 804478 36 0 [2] chrXIII + 36M 36 804466 804501 36 0 [3] chrXIII - 36M 36 804473 804508 36 0 [4] chrXIII + 36M 36 804476 804511 36 0 [5] chrXIII - 36M 36 804493 804528 36 0 [6] chrXIII + 36M 36 804516 804551 36 0 [7] chrXIII + 36M 36 804562 804597 36 0 [8] chrXIII + 36M 36 804562 804597 36 0 [9] chrXIII - 36M 36 804574 804609 36 0 ... ... ... ... ... ... ... ... ... [20] chrXIII + 36M 36 804764 804799 36 0 [21] chrXIII - 36M 36 804798 804833 36 0 [22] chrXIII + 36M 36 804799 804834 36 0 [23] chrXIII + 36M 36 804799 804834 36 0 [24] chrXIII - 36M 36 804816 804851 36 0 [25] chrXIII - 36M 36 804947 804982 36 0 [26] chrXIII - 36M 36 804953 804988 36 0 [27] chrXIII - 36M 36 804955 804990 36 0 [28] chrXIII - 36M 36 804974 805009 36 0 > exonRanges[6646] GRangesList of length 1 $6646 GRanges with 1 range and 3 elementMetadata values seqnames ranges strand | exon_id exon_name exon_rank <rle> <iranges> <rle> | <integer> <character> <integer> [1] chrXIII [804455, 805090] + | 7011 NA 1 seqlengths chrIV chrXV chrVII chrXII chrXVI ... chrVI chrI chrM 2micron 1531919 1091289 1090947 1078175 948062 ... 270148 230208 85779 6318 > counts[6646] [1] 13 > table(strand(aligns[705:732])) + - 13 15 > sessionInfo() R version 2.12.0 (2010-10-15) Platform: x86_64-unknown-linux-gnu (64-bit) locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8 [7] LC_PAPER=en_US.UTF-8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] Rsamtools_1.2.1 Biostrings_2.18.0 GenomicFeatures_1.2.3 [4] GenomicRanges_1.2.1 IRanges_1.8.2 loaded via a namespace (and not attached): [1] Biobase_2.10.0 biomaRt_2.6.0 BSgenome_1.18.1 DBI_0.2-5 [5] RCurl_1.4-3 RSQLite_0.9-3 rtracklayer_1.10.5 tools_2.12.0 [9] XML_3.2-0 Song Li -- Postdoctoral Associate Institute for Genome Sciences and Policy Duke University
• 1.7k views
ADD COMMENT
0
Entering edit mode
@vincent-j-carey-jr-4
Last seen 1 day ago
United States
On Mon, Dec 13, 2010 at 12:06 PM, Song Li <songli116 at="" gmail.com=""> wrote: > Hi All, > > I was following the example in the "GenomicRanges Use Cases", section > 3, "Simple RNA-seq Analysis", by "copy-paste" command from the > document from page 7 to page 8: > > What I found out is that countOverlap() only count reads aligned to > positive strand. ?There are 28 reads map to region GRList[6645], 13 on > positive strand, 15 on negative strand. ?The "count[6645]" is 13. > > Is there a way to count both strand? If you want to disregard strand of exon range, set it to "*". This does not appear to be a one-liner. > > Thanks, > Song Li > > Here are all the code: > #-----> Start with code from the package description<-------# >> library(Rsamtools) >> testFile <- system.file("bam", "isowt5_13e.bam", package = "leeBamViews") >> aligns <- readBamGappedAlignments(testFile) >> library(GenomicFeatures) >> txdb <- makeTranscriptDbFromUCSC(genome = "sacCer2", tablename = "sgdGene") >> exonRanges <- exonsBy(txdb, "tx") >> length(exonRanges) >> levels(rname(aligns)) <- c(paste("chr", as.roman(1:16), > + ? ? sep = ""), "chrM") >> counts <- countOverlaps(exonRanges, aligns) > > #-----> find alignments that map to GRList$6645<-------# >> aligns[705:732] > GappedAlignments of length 28 > ? ? ? rname strand cigar qwidth ?start ? ?end width ngap > [1] ?chrXIII ? ? ?+ ? 36M ? ? 36 804443 804478 ? ?36 ? ?0 > [2] ?chrXIII ? ? ?+ ? 36M ? ? 36 804466 804501 ? ?36 ? ?0 > [3] ?chrXIII ? ? ?- ? 36M ? ? 36 804473 804508 ? ?36 ? ?0 > [4] ?chrXIII ? ? ?+ ? 36M ? ? 36 804476 804511 ? ?36 ? ?0 > [5] ?chrXIII ? ? ?- ? 36M ? ? 36 804493 804528 ? ?36 ? ?0 > [6] ?chrXIII ? ? ?+ ? 36M ? ? 36 804516 804551 ? ?36 ? ?0 > [7] ?chrXIII ? ? ?+ ? 36M ? ? 36 804562 804597 ? ?36 ? ?0 > [8] ?chrXIII ? ? ?+ ? 36M ? ? 36 804562 804597 ? ?36 ? ?0 > [9] ?chrXIII ? ? ?- ? 36M ? ? 36 804574 804609 ? ?36 ? ?0 > ... ? ? ?... ? ?... ? ... ? ?... ? ?... ? ?... ? ... ?... > [20] chrXIII ? ? ?+ ? 36M ? ? 36 804764 804799 ? ?36 ? ?0 > [21] chrXIII ? ? ?- ? 36M ? ? 36 804798 804833 ? ?36 ? ?0 > [22] chrXIII ? ? ?+ ? 36M ? ? 36 804799 804834 ? ?36 ? ?0 > [23] chrXIII ? ? ?+ ? 36M ? ? 36 804799 804834 ? ?36 ? ?0 > [24] chrXIII ? ? ?- ? 36M ? ? 36 804816 804851 ? ?36 ? ?0 > [25] chrXIII ? ? ?- ? 36M ? ? 36 804947 804982 ? ?36 ? ?0 > [26] chrXIII ? ? ?- ? 36M ? ? 36 804953 804988 ? ?36 ? ?0 > [27] chrXIII ? ? ?- ? 36M ? ? 36 804955 804990 ? ?36 ? ?0 > [28] chrXIII ? ? ?- ? 36M ? ? 36 804974 805009 ? ?36 ? ?0 >> exonRanges[6646] > GRangesList of length 1 > $6646 > GRanges with 1 range and 3 elementMetadata values > ? ?seqnames ? ? ? ? ? ranges strand | ? exon_id ? exon_name exon_rank > ? ? ? <rle> ? ? ? ?<iranges> ?<rle> | <integer> <character> <integer> > [1] ?chrXIII [804455, 805090] ? ? ?+ | ? ? ?7011 ? ? ? ? ?NA ? ? ? ? 1 > > > seqlengths > ? chrIV ? chrXV ?chrVII ?chrXII ?chrXVI ... ? chrVI ? ?chrI ? ?chrM 2micron > ?1531919 1091289 1090947 1078175 ?948062 ... ?270148 ?230208 ? 85779 ? ?6318 >> counts[6646] > [1] 13 >> table(strand(aligns[705:732])) > > ?+ ?- > 13 15 > > >> sessionInfo() > R version 2.12.0 (2010-10-15) > Platform: x86_64-unknown-linux-gnu (64-bit) > > locale: > ?[1] LC_CTYPE=en_US.UTF-8 ? ? ? LC_NUMERIC=C > ?[3] LC_TIME=en_US.UTF-8 ? ? ? ?LC_COLLATE=en_US.UTF-8 > ?[5] LC_MONETARY=C ? ? ? ? ? ? ?LC_MESSAGES=en_US.UTF-8 > ?[7] LC_PAPER=en_US.UTF-8 ? ? ? LC_NAME=C > ?[9] LC_ADDRESS=C ? ? ? ? ? ? ? LC_TELEPHONE=C > [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C > > attached base packages: > [1] stats ? ? graphics ?grDevices utils ? ? datasets ?methods ? base > > other attached packages: > [1] Rsamtools_1.2.1 ? ? ? Biostrings_2.18.0 ? ? GenomicFeatures_1.2.3 > [4] GenomicRanges_1.2.1 ? IRanges_1.8.2 > > loaded via a namespace (and not attached): > [1] Biobase_2.10.0 ? ? biomaRt_2.6.0 ? ? ?BSgenome_1.18.1 ? ?DBI_0.2-5 > [5] RCurl_1.4-3 ? ? ? ?RSQLite_0.9-3 ? ? ?rtracklayer_1.10.5 tools_2.12.0 > [9] XML_3.2-0 > > Song Li > -- > Postdoctoral Associate > Institute for Genome Sciences and Policy > Duke University > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor >
ADD COMMENT
0
Entering edit mode
On 12/13/2010 09:30 AM, Vincent Carey wrote: > On Mon, Dec 13, 2010 at 12:06 PM, Song Li <songli116 at="" gmail.com=""> wrote: >> Hi All, >> >> I was following the example in the "GenomicRanges Use Cases", section >> 3, "Simple RNA-seq Analysis", by "copy-paste" command from the >> document from page 7 to page 8: >> >> What I found out is that countOverlap() only count reads aligned to >> positive strand. There are 28 reads map to region GRList[6645], 13 on >> positive strand, 15 on negative strand. The "count[6645]" is 13. >> >> Is there a way to count both strand? > > If you want to disregard strand of exon range, set it to "*". This > does not appear to be a one-liner. it's tricky to set strand() on a GappedAlignment (because the CIGAR has to be adjusted appropriately) but the I believe that the converse strand(exonRanges) <- "*" accomplishes the same goal -- reads are counted without regard to strand of alignment. Martin > >> >> Thanks, >> Song Li >> >> Here are all the code: >> #-----> Start with code from the package description<-------# >>> library(Rsamtools) >>> testFile <- system.file("bam", "isowt5_13e.bam", package = "leeBamViews") >>> aligns <- readBamGappedAlignments(testFile) >>> library(GenomicFeatures) >>> txdb <- makeTranscriptDbFromUCSC(genome = "sacCer2", tablename = "sgdGene") >>> exonRanges <- exonsBy(txdb, "tx") >>> length(exonRanges) >>> levels(rname(aligns)) <- c(paste("chr", as.roman(1:16), >> + sep = ""), "chrM") >>> counts <- countOverlaps(exonRanges, aligns) >> >> #-----> find alignments that map to GRList$6645<-------# >>> aligns[705:732] >> GappedAlignments of length 28 >> rname strand cigar qwidth start end width ngap >> [1] chrXIII + 36M 36 804443 804478 36 0 >> [2] chrXIII + 36M 36 804466 804501 36 0 >> [3] chrXIII - 36M 36 804473 804508 36 0 >> [4] chrXIII + 36M 36 804476 804511 36 0 >> [5] chrXIII - 36M 36 804493 804528 36 0 >> [6] chrXIII + 36M 36 804516 804551 36 0 >> [7] chrXIII + 36M 36 804562 804597 36 0 >> [8] chrXIII + 36M 36 804562 804597 36 0 >> [9] chrXIII - 36M 36 804574 804609 36 0 >> ... ... ... ... ... ... ... ... ... >> [20] chrXIII + 36M 36 804764 804799 36 0 >> [21] chrXIII - 36M 36 804798 804833 36 0 >> [22] chrXIII + 36M 36 804799 804834 36 0 >> [23] chrXIII + 36M 36 804799 804834 36 0 >> [24] chrXIII - 36M 36 804816 804851 36 0 >> [25] chrXIII - 36M 36 804947 804982 36 0 >> [26] chrXIII - 36M 36 804953 804988 36 0 >> [27] chrXIII - 36M 36 804955 804990 36 0 >> [28] chrXIII - 36M 36 804974 805009 36 0 >>> exonRanges[6646] >> GRangesList of length 1 >> $6646 >> GRanges with 1 range and 3 elementMetadata values >> seqnames ranges strand | exon_id exon_name exon_rank >> <rle> <iranges> <rle> | <integer> <character> <integer> >> [1] chrXIII [804455, 805090] + | 7011 NA 1 >> >> >> seqlengths >> chrIV chrXV chrVII chrXII chrXVI ... chrVI chrI chrM 2micron >> 1531919 1091289 1090947 1078175 948062 ... 270148 230208 85779 6318 >>> counts[6646] >> [1] 13 >>> table(strand(aligns[705:732])) >> >> + - >> 13 15 >> >> >>> sessionInfo() >> R version 2.12.0 (2010-10-15) >> Platform: x86_64-unknown-linux-gnu (64-bit) >> >> locale: >> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C >> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 >> [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8 >> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C >> [9] LC_ADDRESS=C LC_TELEPHONE=C >> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C >> >> attached base packages: >> [1] stats graphics grDevices utils datasets methods base >> >> other attached packages: >> [1] Rsamtools_1.2.1 Biostrings_2.18.0 GenomicFeatures_1.2.3 >> [4] GenomicRanges_1.2.1 IRanges_1.8.2 >> >> loaded via a namespace (and not attached): >> [1] Biobase_2.10.0 biomaRt_2.6.0 BSgenome_1.18.1 DBI_0.2-5 >> [5] RCurl_1.4-3 RSQLite_0.9-3 rtracklayer_1.10.5 tools_2.12.0 >> [9] XML_3.2-0 >> >> Song Li >> -- >> Postdoctoral Associate >> Institute for Genome Sciences and Policy >> Duke University >> >> _______________________________________________ >> Bioconductor mailing list >> Bioconductor at r-project.org >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor >> > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor -- Computational Biology Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: M1-B861 Telephone: 206 667-2793
ADD REPLY
0
Entering edit mode
Hi Martin and Vincent, Thank you for your replies, here is my code: > strand(exonRanges)<-"*" Error in `strand<-`(`*tmp*`, value = "*") : replacement 'value' is not an AtomicList with the same elementLengths as 'x' > levels(strand(aligns))<-c('*','*','*') Error in function (classes, fdef, mtable) : unable to find an inherited method for function "strand<-", for signature "GappedAlignments" It does not seem to be that straightforward. I have been searching for ways to make modification to the GappedAlignments and GRangesList objects. Song On Mon, Dec 13, 2010 at 12:39 PM, Martin Morgan <mtmorgan at="" fhcrc.org=""> wrote: > On 12/13/2010 09:30 AM, Vincent Carey wrote: >> On Mon, Dec 13, 2010 at 12:06 PM, Song Li <songli116 at="" gmail.com=""> wrote: >>> Hi All, >>> >>> I was following the example in the "GenomicRanges Use Cases", section >>> 3, "Simple RNA-seq Analysis", by "copy-paste" command from the >>> document from page 7 to page 8: >>> >>> What I found out is that countOverlap() only count reads aligned to >>> positive strand. ?There are 28 reads map to region GRList[6645], 13 on >>> positive strand, 15 on negative strand. ?The "count[6645]" is 13. >>> >>> Is there a way to count both strand? >> >> If you want to disregard strand of exon range, set it to "*". ?This >> does not appear to be a one-liner. > > it's tricky to set strand() on a GappedAlignment (because the CIGAR has > to be adjusted appropriately) but the I believe that the converse > > ?strand(exonRanges) <- "*" > > accomplishes the same goal -- reads are counted without regard to strand > of alignment. > > Martin > >> >>> >>> Thanks, >>> Song Li >>> >>> Here are all the code: >>> #-----> Start with code from the package description<-------# >>>> library(Rsamtools) >>>> testFile <- system.file("bam", "isowt5_13e.bam", package = "leeBamViews") >>>> aligns <- readBamGappedAlignments(testFile) >>>> library(GenomicFeatures) >>>> txdb <- makeTranscriptDbFromUCSC(genome = "sacCer2", tablename = "sgdGene") >>>> exonRanges <- exonsBy(txdb, "tx") >>>> length(exonRanges) >>>> levels(rname(aligns)) <- c(paste("chr", as.roman(1:16), >>> + ? ? sep = ""), "chrM") >>>> counts <- countOverlaps(exonRanges, aligns) >>> >>> #-----> find alignments that map to GRList$6645<-------# >>>> aligns[705:732] >>> GappedAlignments of length 28 >>> ? ? ? rname strand cigar qwidth ?start ? ?end width ngap >>> [1] ?chrXIII ? ? ?+ ? 36M ? ? 36 804443 804478 ? ?36 ? ?0 >>> [2] ?chrXIII ? ? ?+ ? 36M ? ? 36 804466 804501 ? ?36 ? ?0 >>> [3] ?chrXIII ? ? ?- ? 36M ? ? 36 804473 804508 ? ?36 ? ?0 >>> [4] ?chrXIII ? ? ?+ ? 36M ? ? 36 804476 804511 ? ?36 ? ?0 >>> [5] ?chrXIII ? ? ?- ? 36M ? ? 36 804493 804528 ? ?36 ? ?0 >>> [6] ?chrXIII ? ? ?+ ? 36M ? ? 36 804516 804551 ? ?36 ? ?0 >>> [7] ?chrXIII ? ? ?+ ? 36M ? ? 36 804562 804597 ? ?36 ? ?0 >>> [8] ?chrXIII ? ? ?+ ? 36M ? ? 36 804562 804597 ? ?36 ? ?0 >>> [9] ?chrXIII ? ? ?- ? 36M ? ? 36 804574 804609 ? ?36 ? ?0 >>> ... ? ? ?... ? ?... ? ... ? ?... ? ?... ? ?... ? ... ?... >>> [20] chrXIII ? ? ?+ ? 36M ? ? 36 804764 804799 ? ?36 ? ?0 >>> [21] chrXIII ? ? ?- ? 36M ? ? 36 804798 804833 ? ?36 ? ?0 >>> [22] chrXIII ? ? ?+ ? 36M ? ? 36 804799 804834 ? ?36 ? ?0 >>> [23] chrXIII ? ? ?+ ? 36M ? ? 36 804799 804834 ? ?36 ? ?0 >>> [24] chrXIII ? ? ?- ? 36M ? ? 36 804816 804851 ? ?36 ? ?0 >>> [25] chrXIII ? ? ?- ? 36M ? ? 36 804947 804982 ? ?36 ? ?0 >>> [26] chrXIII ? ? ?- ? 36M ? ? 36 804953 804988 ? ?36 ? ?0 >>> [27] chrXIII ? ? ?- ? 36M ? ? 36 804955 804990 ? ?36 ? ?0 >>> [28] chrXIII ? ? ?- ? 36M ? ? 36 804974 805009 ? ?36 ? ?0 >>>> exonRanges[6646] >>> GRangesList of length 1 >>> $6646 >>> GRanges with 1 range and 3 elementMetadata values >>> ? ?seqnames ? ? ? ? ? ranges strand | ? exon_id ? exon_name exon_rank >>> ? ? ? <rle> ? ? ? ?<iranges> ?<rle> | <integer> <character> <integer> >>> [1] ?chrXIII [804455, 805090] ? ? ?+ | ? ? ?7011 ? ? ? ? ?NA ? ? ? ? 1 >>> >>> >>> seqlengths >>> ? chrIV ? chrXV ?chrVII ?chrXII ?chrXVI ... ? chrVI ? ?chrI ? ?chrM 2micron >>> ?1531919 1091289 1090947 1078175 ?948062 ... ?270148 ?230208 ? 85779 ? ?6318 >>>> counts[6646] >>> [1] 13 >>>> table(strand(aligns[705:732])) >>> >>> ?+ ?- >>> 13 15 >>> >>> >>>> sessionInfo() >>> R version 2.12.0 (2010-10-15) >>> Platform: x86_64-unknown-linux-gnu (64-bit) >>> >>> locale: >>> ?[1] LC_CTYPE=en_US.UTF-8 ? ? ? LC_NUMERIC=C >>> ?[3] LC_TIME=en_US.UTF-8 ? ? ? ?LC_COLLATE=en_US.UTF-8 >>> ?[5] LC_MONETARY=C ? ? ? ? ? ? ?LC_MESSAGES=en_US.UTF-8 >>> ?[7] LC_PAPER=en_US.UTF-8 ? ? ? LC_NAME=C >>> ?[9] LC_ADDRESS=C ? ? ? ? ? ? ? LC_TELEPHONE=C >>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C >>> >>> attached base packages: >>> [1] stats ? ? graphics ?grDevices utils ? ? datasets ?methods ? base >>> >>> other attached packages: >>> [1] Rsamtools_1.2.1 ? ? ? Biostrings_2.18.0 ? ? GenomicFeatures_1.2.3 >>> [4] GenomicRanges_1.2.1 ? IRanges_1.8.2 >>> >>> loaded via a namespace (and not attached): >>> [1] Biobase_2.10.0 ? ? biomaRt_2.6.0 ? ? ?BSgenome_1.18.1 ? ?DBI_0.2-5 >>> [5] RCurl_1.4-3 ? ? ? ?RSQLite_0.9-3 ? ? ?rtracklayer_1.10.5 tools_2.12.0 >>> [9] XML_3.2-0 >>> >>> Song Li >>> -- >>> Postdoctoral Associate >>> Institute for Genome Sciences and Policy >>> Duke University >>> >>> _______________________________________________ >>> Bioconductor mailing list >>> Bioconductor at r-project.org >>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor >>> >> >> _______________________________________________ >> Bioconductor mailing list >> Bioconductor at r-project.org >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor > > > -- > Computational Biology > Fred Hutchinson Cancer Research Center > 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 > > Location: M1-B861 > Telephone: 206 667-2793 > -- Postdoctoral Associate Institute for Genome Sciences and Policy Duke University
ADD REPLY
0
Entering edit mode
On 12/13/2010 10:09 AM, Song Li wrote: > Hi Martin and Vincent, > > Thank you for your replies, here is my code: > >> strand(exonRanges)<-"*" > Error in `strand<-`(`*tmp*`, value = "*") : > replacement 'value' is not an AtomicList with the same elementLengths as 'x' create a list of character vectors, each of the appropriate length value <- lapply(elementLengths(exonRanges), rep, x="*") convert this list to a class derived from 'AtomicList', and do the replacement strand(exonRanges) <- CharacterList(value) Sometimes it's easier / faster to work with exons(txdb, column="tx_id") or similar. Martin > >> levels(strand(aligns))<-c('*','*','*') > Error in function (classes, fdef, mtable) : > unable to find an inherited method for function "strand<-", for > signature "GappedAlignments" > > It does not seem to be that straightforward. > > I have been searching for ways to make modification to the > GappedAlignments and GRangesList objects. > > Song > > On Mon, Dec 13, 2010 at 12:39 PM, Martin Morgan <mtmorgan at="" fhcrc.org=""> wrote: >> On 12/13/2010 09:30 AM, Vincent Carey wrote: >>> On Mon, Dec 13, 2010 at 12:06 PM, Song Li <songli116 at="" gmail.com=""> wrote: >>>> Hi All, >>>> >>>> I was following the example in the "GenomicRanges Use Cases", section >>>> 3, "Simple RNA-seq Analysis", by "copy-paste" command from the >>>> document from page 7 to page 8: >>>> >>>> What I found out is that countOverlap() only count reads aligned to >>>> positive strand. There are 28 reads map to region GRList[6645], 13 on >>>> positive strand, 15 on negative strand. The "count[6645]" is 13. >>>> >>>> Is there a way to count both strand? >>> >>> If you want to disregard strand of exon range, set it to "*". This >>> does not appear to be a one-liner. >> >> it's tricky to set strand() on a GappedAlignment (because the CIGAR has >> to be adjusted appropriately) but the I believe that the converse >> >> strand(exonRanges) <- "*" >> >> accomplishes the same goal -- reads are counted without regard to strand >> of alignment. >> >> Martin >> >>> >>>> >>>> Thanks, >>>> Song Li >>>> >>>> Here are all the code: >>>> #-----> Start with code from the package description<-------# >>>>> library(Rsamtools) >>>>> testFile <- system.file("bam", "isowt5_13e.bam", package = "leeBamViews") >>>>> aligns <- readBamGappedAlignments(testFile) >>>>> library(GenomicFeatures) >>>>> txdb <- makeTranscriptDbFromUCSC(genome = "sacCer2", tablename = "sgdGene") >>>>> exonRanges <- exonsBy(txdb, "tx") >>>>> length(exonRanges) >>>>> levels(rname(aligns)) <- c(paste("chr", as.roman(1:16), >>>> + sep = ""), "chrM") >>>>> counts <- countOverlaps(exonRanges, aligns) >>>> >>>> #-----> find alignments that map to GRList$6645<-------# >>>>> aligns[705:732] >>>> GappedAlignments of length 28 >>>> rname strand cigar qwidth start end width ngap >>>> [1] chrXIII + 36M 36 804443 804478 36 0 >>>> [2] chrXIII + 36M 36 804466 804501 36 0 >>>> [3] chrXIII - 36M 36 804473 804508 36 0 >>>> [4] chrXIII + 36M 36 804476 804511 36 0 >>>> [5] chrXIII - 36M 36 804493 804528 36 0 >>>> [6] chrXIII + 36M 36 804516 804551 36 0 >>>> [7] chrXIII + 36M 36 804562 804597 36 0 >>>> [8] chrXIII + 36M 36 804562 804597 36 0 >>>> [9] chrXIII - 36M 36 804574 804609 36 0 >>>> ... ... ... ... ... ... ... ... ... >>>> [20] chrXIII + 36M 36 804764 804799 36 0 >>>> [21] chrXIII - 36M 36 804798 804833 36 0 >>>> [22] chrXIII + 36M 36 804799 804834 36 0 >>>> [23] chrXIII + 36M 36 804799 804834 36 0 >>>> [24] chrXIII - 36M 36 804816 804851 36 0 >>>> [25] chrXIII - 36M 36 804947 804982 36 0 >>>> [26] chrXIII - 36M 36 804953 804988 36 0 >>>> [27] chrXIII - 36M 36 804955 804990 36 0 >>>> [28] chrXIII - 36M 36 804974 805009 36 0 >>>>> exonRanges[6646] >>>> GRangesList of length 1 >>>> $6646 >>>> GRanges with 1 range and 3 elementMetadata values >>>> seqnames ranges strand | exon_id exon_name exon_rank >>>> <rle> <iranges> <rle> | <integer> <character> <integer> >>>> [1] chrXIII [804455, 805090] + | 7011 NA 1 >>>> >>>> >>>> seqlengths >>>> chrIV chrXV chrVII chrXII chrXVI ... chrVI chrI chrM 2micron >>>> 1531919 1091289 1090947 1078175 948062 ... 270148 230208 85779 6318 >>>>> counts[6646] >>>> [1] 13 >>>>> table(strand(aligns[705:732])) >>>> >>>> + - >>>> 13 15 >>>> >>>> >>>>> sessionInfo() >>>> R version 2.12.0 (2010-10-15) >>>> Platform: x86_64-unknown-linux-gnu (64-bit) >>>> >>>> locale: >>>> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C >>>> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 >>>> [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8 >>>> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C >>>> [9] LC_ADDRESS=C LC_TELEPHONE=C >>>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C >>>> >>>> attached base packages: >>>> [1] stats graphics grDevices utils datasets methods base >>>> >>>> other attached packages: >>>> [1] Rsamtools_1.2.1 Biostrings_2.18.0 GenomicFeatures_1.2.3 >>>> [4] GenomicRanges_1.2.1 IRanges_1.8.2 >>>> >>>> loaded via a namespace (and not attached): >>>> [1] Biobase_2.10.0 biomaRt_2.6.0 BSgenome_1.18.1 DBI_0.2-5 >>>> [5] RCurl_1.4-3 RSQLite_0.9-3 rtracklayer_1.10.5 tools_2.12.0 >>>> [9] XML_3.2-0 >>>> >>>> Song Li >>>> -- >>>> Postdoctoral Associate >>>> Institute for Genome Sciences and Policy >>>> Duke University >>>> >>>> _______________________________________________ >>>> Bioconductor mailing list >>>> Bioconductor at r-project.org >>>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor >>>> >>> >>> _______________________________________________ >>> Bioconductor mailing list >>> Bioconductor at r-project.org >>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor >> >> >> -- >> Computational Biology >> Fred Hutchinson Cancer Research Center >> 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 >> >> Location: M1-B861 >> Telephone: 206 667-2793 >> > > > -- Computational Biology Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: M1-B861 Telephone: 206 667-2793
ADD REPLY
0
Entering edit mode
Hi Martin and Vincent, Thank you for the replies and both methods work, Cheers, Song On Mon, Dec 13, 2010 at 2:19 PM, Martin Morgan <mtmorgan at="" fhcrc.org=""> wrote: > On 12/13/2010 10:09 AM, Song Li wrote: >> Hi Martin and Vincent, >> >> Thank you for your replies, here is my code: >> >>> strand(exonRanges)<-"*" >> Error in `strand<-`(`*tmp*`, value = "*") : >> ? replacement 'value' is not an AtomicList with the same elementLengths as 'x' > > create a list of character vectors, each of the appropriate length > > ?value <- lapply(elementLengths(exonRanges), rep, x="*") > > convert this list to a class derived from 'AtomicList', and do the > replacement > > ?strand(exonRanges) <- CharacterList(value) > > Sometimes it's easier / faster to work with exons(txdb, column="tx_id") > or similar. > > Martin > > >> >>> levels(strand(aligns))<-c('*','*','*') >> Error in function (classes, fdef, mtable) ?: >> ? unable to find an inherited method for function "strand<-", for >> signature "GappedAlignments" >> >> It does not seem to be that straightforward. >> >> I have been searching for ways to make modification to the >> GappedAlignments and GRangesList ?objects. >> >> Song >> >> On Mon, Dec 13, 2010 at 12:39 PM, Martin Morgan <mtmorgan at="" fhcrc.org=""> wrote: >>> On 12/13/2010 09:30 AM, Vincent Carey wrote: >>>> On Mon, Dec 13, 2010 at 12:06 PM, Song Li <songli116 at="" gmail.com=""> wrote: >>>>> Hi All, >>>>> >>>>> I was following the example in the "GenomicRanges Use Cases", section >>>>> 3, "Simple RNA-seq Analysis", by "copy-paste" command from the >>>>> document from page 7 to page 8: >>>>> >>>>> What I found out is that countOverlap() only count reads aligned to >>>>> positive strand. ?There are 28 reads map to region GRList[6645], 13 on >>>>> positive strand, 15 on negative strand. ?The "count[6645]" is 13. >>>>> >>>>> Is there a way to count both strand? >>>> >>>> If you want to disregard strand of exon range, set it to "*". ?This >>>> does not appear to be a one-liner. >>> >>> it's tricky to set strand() on a GappedAlignment (because the CIGAR has >>> to be adjusted appropriately) but the I believe that the converse >>> >>> ?strand(exonRanges) <- "*" >>> >>> accomplishes the same goal -- reads are counted without regard to strand >>> of alignment. >>> >>> Martin >>> >>>> >>>>> >>>>> Thanks, >>>>> Song Li >>>>> >>>>> Here are all the code: >>>>> #-----> Start with code from the package description<-------# >>>>>> library(Rsamtools) >>>>>> testFile <- system.file("bam", "isowt5_13e.bam", package = "leeBamViews") >>>>>> aligns <- readBamGappedAlignments(testFile) >>>>>> library(GenomicFeatures) >>>>>> txdb <- makeTranscriptDbFromUCSC(genome = "sacCer2", tablename = "sgdGene") >>>>>> exonRanges <- exonsBy(txdb, "tx") >>>>>> length(exonRanges) >>>>>> levels(rname(aligns)) <- c(paste("chr", as.roman(1:16), >>>>> + ? ? sep = ""), "chrM") >>>>>> counts <- countOverlaps(exonRanges, aligns) >>>>> >>>>> #-----> find alignments that map to GRList$6645<-------# >>>>>> aligns[705:732] >>>>> GappedAlignments of length 28 >>>>> ? ? ? rname strand cigar qwidth ?start ? ?end width ngap >>>>> [1] ?chrXIII ? ? ?+ ? 36M ? ? 36 804443 804478 ? ?36 ? ?0 >>>>> [2] ?chrXIII ? ? ?+ ? 36M ? ? 36 804466 804501 ? ?36 ? ?0 >>>>> [3] ?chrXIII ? ? ?- ? 36M ? ? 36 804473 804508 ? ?36 ? ?0 >>>>> [4] ?chrXIII ? ? ?+ ? 36M ? ? 36 804476 804511 ? ?36 ? ?0 >>>>> [5] ?chrXIII ? ? ?- ? 36M ? ? 36 804493 804528 ? ?36 ? ?0 >>>>> [6] ?chrXIII ? ? ?+ ? 36M ? ? 36 804516 804551 ? ?36 ? ?0 >>>>> [7] ?chrXIII ? ? ?+ ? 36M ? ? 36 804562 804597 ? ?36 ? ?0 >>>>> [8] ?chrXIII ? ? ?+ ? 36M ? ? 36 804562 804597 ? ?36 ? ?0 >>>>> [9] ?chrXIII ? ? ?- ? 36M ? ? 36 804574 804609 ? ?36 ? ?0 >>>>> ... ? ? ?... ? ?... ? ... ? ?... ? ?... ? ?... ? ... ?... >>>>> [20] chrXIII ? ? ?+ ? 36M ? ? 36 804764 804799 ? ?36 ? ?0 >>>>> [21] chrXIII ? ? ?- ? 36M ? ? 36 804798 804833 ? ?36 ? ?0 >>>>> [22] chrXIII ? ? ?+ ? 36M ? ? 36 804799 804834 ? ?36 ? ?0 >>>>> [23] chrXIII ? ? ?+ ? 36M ? ? 36 804799 804834 ? ?36 ? ?0 >>>>> [24] chrXIII ? ? ?- ? 36M ? ? 36 804816 804851 ? ?36 ? ?0 >>>>> [25] chrXIII ? ? ?- ? 36M ? ? 36 804947 804982 ? ?36 ? ?0 >>>>> [26] chrXIII ? ? ?- ? 36M ? ? 36 804953 804988 ? ?36 ? ?0 >>>>> [27] chrXIII ? ? ?- ? 36M ? ? 36 804955 804990 ? ?36 ? ?0 >>>>> [28] chrXIII ? ? ?- ? 36M ? ? 36 804974 805009 ? ?36 ? ?0 >>>>>> exonRanges[6646] >>>>> GRangesList of length 1 >>>>> $6646 >>>>> GRanges with 1 range and 3 elementMetadata values >>>>> ? ?seqnames ? ? ? ? ? ranges strand | ? exon_id ? exon_name exon_rank >>>>> ? ? ? <rle> ? ? ? ?<iranges> ?<rle> | <integer> <character> <integer> >>>>> [1] ?chrXIII [804455, 805090] ? ? ?+ | ? ? ?7011 ? ? ? ? ?NA ? ? ? ? 1 >>>>> >>>>> >>>>> seqlengths >>>>> ? chrIV ? chrXV ?chrVII ?chrXII ?chrXVI ... ? chrVI ? ?chrI ? ?chrM 2micron >>>>> ?1531919 1091289 1090947 1078175 ?948062 ... ?270148 ?230208 ? 85779 ? ?6318 >>>>>> counts[6646] >>>>> [1] 13 >>>>>> table(strand(aligns[705:732])) >>>>> >>>>> ?+ ?- >>>>> 13 15 >>>>> >>>>> >>>>>> sessionInfo() >>>>> R version 2.12.0 (2010-10-15) >>>>> Platform: x86_64-unknown-linux-gnu (64-bit) >>>>> >>>>> locale: >>>>> ?[1] LC_CTYPE=en_US.UTF-8 ? ? ? LC_NUMERIC=C >>>>> ?[3] LC_TIME=en_US.UTF-8 ? ? ? ?LC_COLLATE=en_US.UTF-8 >>>>> ?[5] LC_MONETARY=C ? ? ? ? ? ? ?LC_MESSAGES=en_US.UTF-8 >>>>> ?[7] LC_PAPER=en_US.UTF-8 ? ? ? LC_NAME=C >>>>> ?[9] LC_ADDRESS=C ? ? ? ? ? ? ? LC_TELEPHONE=C >>>>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C >>>>> >>>>> attached base packages: >>>>> [1] stats ? ? graphics ?grDevices utils ? ? datasets ?methods ? base >>>>> >>>>> other attached packages: >>>>> [1] Rsamtools_1.2.1 ? ? ? Biostrings_2.18.0 ? ? GenomicFeatures_1.2.3 >>>>> [4] GenomicRanges_1.2.1 ? IRanges_1.8.2 >>>>> >>>>> loaded via a namespace (and not attached): >>>>> [1] Biobase_2.10.0 ? ? biomaRt_2.6.0 ? ? ?BSgenome_1.18.1 ? ?DBI_0.2-5 >>>>> [5] RCurl_1.4-3 ? ? ? ?RSQLite_0.9-3 ? ? ?rtracklayer_1.10.5 tools_2.12.0 >>>>> [9] XML_3.2-0 >>>>> >>>>> Song Li >>>>> -- >>>>> Postdoctoral Associate >>>>> Institute for Genome Sciences and Policy >>>>> Duke University >>>>> >>>>> _______________________________________________ >>>>> Bioconductor mailing list >>>>> Bioconductor at r-project.org >>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>>>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor >>>>> >>>> >>>> _______________________________________________ >>>> Bioconductor mailing list >>>> Bioconductor at r-project.org >>>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor >>> >>> >>> -- >>> Computational Biology >>> Fred Hutchinson Cancer Research Center >>> 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 >>> >>> Location: M1-B861 >>> Telephone: 206 667-2793 >>> >> >> >> > > > -- > Computational Biology > Fred Hutchinson Cancer Research Center > 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 > > Location: M1-B861 > Telephone: 206 667-2793 > -- Postdoctoral Associate Institute for Genome Sciences and Policy Duke University
ADD REPLY
0
Entering edit mode
On Mon, Dec 13, 2010 at 11:19 AM, Martin Morgan <mtmorgan@fhcrc.org> wrote: > On 12/13/2010 10:09 AM, Song Li wrote: > > Hi Martin and Vincent, > > > > Thank you for your replies, here is my code: > > > >> strand(exonRanges)<-"*" > > Error in `strand<-`(`*tmp*`, value = "*") : > > replacement 'value' is not an AtomicList with the same elementLengths > as 'x' > > create a list of character vectors, each of the appropriate length > > value <- lapply(elementLengths(exonRanges), rep, x="*") > > Just for the record, seqapply is useful here, as it automatically does the casting to CharacterList: value <- seqapply(elementLengths(exonRanges), rep, x="*") Even though rep is a primitive, it's generally faster but messier to do this with rep/seq_len, i.e., el <- elementLengths(exonRanges) strand(exonRanges) <- seqsplit(rep("*", sum(el)), rep(seq_len(length(exonRanges)), el)) Note that 'seqsplit' also does the automatic coercion. I'll soon check in an optimization to create a CompressedCharacterList without actually splitting anything. This is more efficient in time and space. convert this list to a class derived from 'AtomicList', and do the > replacement > > strand(exonRanges) <- CharacterList(value) > > Sometimes it's easier / faster to work with exons(txdb, column="tx_id") > or similar. > > Martin > > > > > >> levels(strand(aligns))<-c('*','*','*') > > Error in function (classes, fdef, mtable) : > > unable to find an inherited method for function "strand<-", for > > signature "GappedAlignments" > > > > It does not seem to be that straightforward. > > > > I have been searching for ways to make modification to the > > GappedAlignments and GRangesList objects. > > > > Song > > > > On Mon, Dec 13, 2010 at 12:39 PM, Martin Morgan <mtmorgan@fhcrc.org> > wrote: > >> On 12/13/2010 09:30 AM, Vincent Carey wrote: > >>> On Mon, Dec 13, 2010 at 12:06 PM, Song Li <songli116@gmail.com> wrote: > >>>> Hi All, > >>>> > >>>> I was following the example in the "GenomicRanges Use Cases", section > >>>> 3, "Simple RNA-seq Analysis", by "copy-paste" command from the > >>>> document from page 7 to page 8: > >>>> > >>>> What I found out is that countOverlap() only count reads aligned to > >>>> positive strand. There are 28 reads map to region GRList[6645], 13 on > >>>> positive strand, 15 on negative strand. The "count[6645]" is 13. > >>>> > >>>> Is there a way to count both strand? > >>> > >>> If you want to disregard strand of exon range, set it to "*". This > >>> does not appear to be a one-liner. > >> > >> it's tricky to set strand() on a GappedAlignment (because the CIGAR has > >> to be adjusted appropriately) but the I believe that the converse > >> > >> strand(exonRanges) <- "*" > >> > >> accomplishes the same goal -- reads are counted without regard to strand > >> of alignment. > >> > >> Martin > >> > >>> > >>>> > >>>> Thanks, > >>>> Song Li > >>>> > >>>> Here are all the code: > >>>> #-----> Start with code from the package description<-------# > >>>>> library(Rsamtools) > >>>>> testFile <- system.file("bam", "isowt5_13e.bam", package = > "leeBamViews") > >>>>> aligns <- readBamGappedAlignments(testFile) > >>>>> library(GenomicFeatures) > >>>>> txdb <- makeTranscriptDbFromUCSC(genome = "sacCer2", tablename = > "sgdGene") > >>>>> exonRanges <- exonsBy(txdb, "tx") > >>>>> length(exonRanges) > >>>>> levels(rname(aligns)) <- c(paste("chr", as.roman(1:16), > >>>> + sep = ""), "chrM") > >>>>> counts <- countOverlaps(exonRanges, aligns) > >>>> > >>>> #-----> find alignments that map to GRList$6645<-------# > >>>>> aligns[705:732] > >>>> GappedAlignments of length 28 > >>>> rname strand cigar qwidth start end width ngap > >>>> [1] chrXIII + 36M 36 804443 804478 36 0 > >>>> [2] chrXIII + 36M 36 804466 804501 36 0 > >>>> [3] chrXIII - 36M 36 804473 804508 36 0 > >>>> [4] chrXIII + 36M 36 804476 804511 36 0 > >>>> [5] chrXIII - 36M 36 804493 804528 36 0 > >>>> [6] chrXIII + 36M 36 804516 804551 36 0 > >>>> [7] chrXIII + 36M 36 804562 804597 36 0 > >>>> [8] chrXIII + 36M 36 804562 804597 36 0 > >>>> [9] chrXIII - 36M 36 804574 804609 36 0 > >>>> ... ... ... ... ... ... ... ... ... > >>>> [20] chrXIII + 36M 36 804764 804799 36 0 > >>>> [21] chrXIII - 36M 36 804798 804833 36 0 > >>>> [22] chrXIII + 36M 36 804799 804834 36 0 > >>>> [23] chrXIII + 36M 36 804799 804834 36 0 > >>>> [24] chrXIII - 36M 36 804816 804851 36 0 > >>>> [25] chrXIII - 36M 36 804947 804982 36 0 > >>>> [26] chrXIII - 36M 36 804953 804988 36 0 > >>>> [27] chrXIII - 36M 36 804955 804990 36 0 > >>>> [28] chrXIII - 36M 36 804974 805009 36 0 > >>>>> exonRanges[6646] > >>>> GRangesList of length 1 > >>>> $6646 > >>>> GRanges with 1 range and 3 elementMetadata values > >>>> seqnames ranges strand | exon_id exon_name exon_rank > >>>> <rle> <iranges> <rle> | <integer> <character> <integer> > >>>> [1] chrXIII [804455, 805090] + | 7011 NA 1 > >>>> > >>>> > >>>> seqlengths > >>>> chrIV chrXV chrVII chrXII chrXVI ... chrVI chrI chrM > 2micron > >>>> 1531919 1091289 1090947 1078175 948062 ... 270148 230208 85779 > 6318 > >>>>> counts[6646] > >>>> [1] 13 > >>>>> table(strand(aligns[705:732])) > >>>> > >>>> + - > >>>> 13 15 > >>>> > >>>> > >>>>> sessionInfo() > >>>> R version 2.12.0 (2010-10-15) > >>>> Platform: x86_64-unknown-linux-gnu (64-bit) > >>>> > >>>> locale: > >>>> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C > >>>> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 > >>>> [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8 > >>>> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C > >>>> [9] LC_ADDRESS=C LC_TELEPHONE=C > >>>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C > >>>> > >>>> attached base packages: > >>>> [1] stats graphics grDevices utils datasets methods base > >>>> > >>>> other attached packages: > >>>> [1] Rsamtools_1.2.1 Biostrings_2.18.0 GenomicFeatures_1.2.3 > >>>> [4] GenomicRanges_1.2.1 IRanges_1.8.2 > >>>> > >>>> loaded via a namespace (and not attached): > >>>> [1] Biobase_2.10.0 biomaRt_2.6.0 BSgenome_1.18.1 DBI_0.2-5 > >>>> [5] RCurl_1.4-3 RSQLite_0.9-3 rtracklayer_1.10.5 > tools_2.12.0 > >>>> [9] XML_3.2-0 > >>>> > >>>> Song Li > >>>> -- > >>>> Postdoctoral Associate > >>>> Institute for Genome Sciences and Policy > >>>> Duke University > >>>> > >>>> _______________________________________________ > >>>> Bioconductor mailing list > >>>> Bioconductor@r-project.org > >>>> https://stat.ethz.ch/mailman/listinfo/bioconductor > >>>> Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor > >>>> > >>> > >>> _______________________________________________ > >>> Bioconductor mailing list > >>> Bioconductor@r-project.org > >>> https://stat.ethz.ch/mailman/listinfo/bioconductor > >>> Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor > >> > >> > >> -- > >> Computational Biology > >> Fred Hutchinson Cancer Research Center > >> 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 > >> > >> Location: M1-B861 > >> Telephone: 206 667-2793 > >> > > > > > > > > > -- > Computational Biology > Fred Hutchinson Cancer Research Center > 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 > > Location: M1-B861 > Telephone: 206 667-2793 > > _______________________________________________ > Bioconductor mailing list > Bioconductor@r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor > [[alternative HTML version deleted]]
ADD REPLY
0
Entering edit mode
For the nonce, the solution is to make the AtomicList with the desired content and reassign ser = strand(exonRanges) ss = endoapply(ser, function(x){x[] = "*"; x}) # overkill? slow... strand(exonRanges) = ss cc = countOverlaps(exonRanges, aligns) cc[6646] On Mon, Dec 13, 2010 at 1:09 PM, Song Li <songli116 at="" gmail.com=""> wrote: > Hi Martin and Vincent, > > Thank you for your replies, here is my code: > >> strand(exonRanges)<-"*" > Error in `strand<-`(`*tmp*`, value = "*") : > ?replacement 'value' is not an AtomicList with the same elementLengths as 'x' > >> levels(strand(aligns))<-c('*','*','*') > Error in function (classes, fdef, mtable) ?: > ?unable to find an inherited method for function "strand<-", for > signature "GappedAlignments" > > It does not seem to be that straightforward. > > I have been searching for ways to make modification to the > GappedAlignments and GRangesList ?objects. > > Song > > On Mon, Dec 13, 2010 at 12:39 PM, Martin Morgan <mtmorgan at="" fhcrc.org=""> wrote: >> On 12/13/2010 09:30 AM, Vincent Carey wrote: >>> On Mon, Dec 13, 2010 at 12:06 PM, Song Li <songli116 at="" gmail.com=""> wrote: >>>> Hi All, >>>> >>>> I was following the example in the "GenomicRanges Use Cases", section >>>> 3, "Simple RNA-seq Analysis", by "copy-paste" command from the >>>> document from page 7 to page 8: >>>> >>>> What I found out is that countOverlap() only count reads aligned to >>>> positive strand. ?There are 28 reads map to region GRList[6645], 13 on >>>> positive strand, 15 on negative strand. ?The "count[6645]" is 13. >>>> >>>> Is there a way to count both strand? >>> >>> If you want to disregard strand of exon range, set it to "*". ?This >>> does not appear to be a one-liner. >> >> it's tricky to set strand() on a GappedAlignment (because the CIGAR has >> to be adjusted appropriately) but the I believe that the converse >> >> ?strand(exonRanges) <- "*" >> >> accomplishes the same goal -- reads are counted without regard to strand >> of alignment. >> >> Martin >> >>> >>>> >>>> Thanks, >>>> Song Li >>>> >>>> Here are all the code: >>>> #-----> Start with code from the package description<-------# >>>>> library(Rsamtools) >>>>> testFile <- system.file("bam", "isowt5_13e.bam", package = "leeBamViews") >>>>> aligns <- readBamGappedAlignments(testFile) >>>>> library(GenomicFeatures) >>>>> txdb <- makeTranscriptDbFromUCSC(genome = "sacCer2", tablename = "sgdGene") >>>>> exonRanges <- exonsBy(txdb, "tx") >>>>> length(exonRanges) >>>>> levels(rname(aligns)) <- c(paste("chr", as.roman(1:16), >>>> + ? ? sep = ""), "chrM") >>>>> counts <- countOverlaps(exonRanges, aligns) >>>> >>>> #-----> find alignments that map to GRList$6645<-------# >>>>> aligns[705:732] >>>> GappedAlignments of length 28 >>>> ? ? ? rname strand cigar qwidth ?start ? ?end width ngap >>>> [1] ?chrXIII ? ? ?+ ? 36M ? ? 36 804443 804478 ? ?36 ? ?0 >>>> [2] ?chrXIII ? ? ?+ ? 36M ? ? 36 804466 804501 ? ?36 ? ?0 >>>> [3] ?chrXIII ? ? ?- ? 36M ? ? 36 804473 804508 ? ?36 ? ?0 >>>> [4] ?chrXIII ? ? ?+ ? 36M ? ? 36 804476 804511 ? ?36 ? ?0 >>>> [5] ?chrXIII ? ? ?- ? 36M ? ? 36 804493 804528 ? ?36 ? ?0 >>>> [6] ?chrXIII ? ? ?+ ? 36M ? ? 36 804516 804551 ? ?36 ? ?0 >>>> [7] ?chrXIII ? ? ?+ ? 36M ? ? 36 804562 804597 ? ?36 ? ?0 >>>> [8] ?chrXIII ? ? ?+ ? 36M ? ? 36 804562 804597 ? ?36 ? ?0 >>>> [9] ?chrXIII ? ? ?- ? 36M ? ? 36 804574 804609 ? ?36 ? ?0 >>>> ... ? ? ?... ? ?... ? ... ? ?... ? ?... ? ?... ? ... ?... >>>> [20] chrXIII ? ? ?+ ? 36M ? ? 36 804764 804799 ? ?36 ? ?0 >>>> [21] chrXIII ? ? ?- ? 36M ? ? 36 804798 804833 ? ?36 ? ?0 >>>> [22] chrXIII ? ? ?+ ? 36M ? ? 36 804799 804834 ? ?36 ? ?0 >>>> [23] chrXIII ? ? ?+ ? 36M ? ? 36 804799 804834 ? ?36 ? ?0 >>>> [24] chrXIII ? ? ?- ? 36M ? ? 36 804816 804851 ? ?36 ? ?0 >>>> [25] chrXIII ? ? ?- ? 36M ? ? 36 804947 804982 ? ?36 ? ?0 >>>> [26] chrXIII ? ? ?- ? 36M ? ? 36 804953 804988 ? ?36 ? ?0 >>>> [27] chrXIII ? ? ?- ? 36M ? ? 36 804955 804990 ? ?36 ? ?0 >>>> [28] chrXIII ? ? ?- ? 36M ? ? 36 804974 805009 ? ?36 ? ?0 >>>>> exonRanges[6646] >>>> GRangesList of length 1 >>>> $6646 >>>> GRanges with 1 range and 3 elementMetadata values >>>> ? ?seqnames ? ? ? ? ? ranges strand | ? exon_id ? exon_name exon_rank >>>> ? ? ? <rle> ? ? ? ?<iranges> ?<rle> | <integer> <character> <integer> >>>> [1] ?chrXIII [804455, 805090] ? ? ?+ | ? ? ?7011 ? ? ? ? ?NA ? ? ? ? 1 >>>> >>>> >>>> seqlengths >>>> ? chrIV ? chrXV ?chrVII ?chrXII ?chrXVI ... ? chrVI ? ?chrI ? ?chrM 2micron >>>> ?1531919 1091289 1090947 1078175 ?948062 ... ?270148 ?230208 ? 85779 ? ?6318 >>>>> counts[6646] >>>> [1] 13 >>>>> table(strand(aligns[705:732])) >>>> >>>> ?+ ?- >>>> 13 15 >>>> >>>> >>>>> sessionInfo() >>>> R version 2.12.0 (2010-10-15) >>>> Platform: x86_64-unknown-linux-gnu (64-bit) >>>> >>>> locale: >>>> ?[1] LC_CTYPE=en_US.UTF-8 ? ? ? LC_NUMERIC=C >>>> ?[3] LC_TIME=en_US.UTF-8 ? ? ? ?LC_COLLATE=en_US.UTF-8 >>>> ?[5] LC_MONETARY=C ? ? ? ? ? ? ?LC_MESSAGES=en_US.UTF-8 >>>> ?[7] LC_PAPER=en_US.UTF-8 ? ? ? LC_NAME=C >>>> ?[9] LC_ADDRESS=C ? ? ? ? ? ? ? LC_TELEPHONE=C >>>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C >>>> >>>> attached base packages: >>>> [1] stats ? ? graphics ?grDevices utils ? ? datasets ?methods ? base >>>> >>>> other attached packages: >>>> [1] Rsamtools_1.2.1 ? ? ? Biostrings_2.18.0 ? ? GenomicFeatures_1.2.3 >>>> [4] GenomicRanges_1.2.1 ? IRanges_1.8.2 >>>> >>>> loaded via a namespace (and not attached): >>>> [1] Biobase_2.10.0 ? ? biomaRt_2.6.0 ? ? ?BSgenome_1.18.1 ? ?DBI_0.2-5 >>>> [5] RCurl_1.4-3 ? ? ? ?RSQLite_0.9-3 ? ? ?rtracklayer_1.10.5 tools_2.12.0 >>>> [9] XML_3.2-0 >>>> >>>> Song Li >>>> -- >>>> Postdoctoral Associate >>>> Institute for Genome Sciences and Policy >>>> Duke University >>>> >>>> _______________________________________________ >>>> Bioconductor mailing list >>>> Bioconductor at r-project.org >>>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor >>>> >>> >>> _______________________________________________ >>> Bioconductor mailing list >>> Bioconductor at r-project.org >>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor >> >> >> -- >> Computational Biology >> Fred Hutchinson Cancer Research Center >> 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 >> >> Location: M1-B861 >> Telephone: 206 667-2793 >> > > > > -- > Postdoctoral Associate > Institute for Genome Sciences and Policy > Duke University >
ADD REPLY
0
Entering edit mode
jason0701 ▴ 190
@jason0701-3921
Last seen 4.5 years ago
Song Li <songli116 at="" ...=""> writes: > > Hi Martin and Vincent, > > Thank you for your replies, here is my code: > > > strand(exonRanges)<-"*" > Error in `strand<-`(`*tmp*`, value = "*") : > replacement 'value' is not an AtomicList with the same elementLengths as 'x' > > > levels(strand(aligns))<-c('*','*','*') > Error in function (classes, fdef, mtable) : > unable to find an inherited method for function "strand<-", for > signature "GappedAlignments" > > It does not seem to be that straightforward. > > I have been searching for ways to make modification to the > GappedAlignments and GRangesList objects. > > Song > I just encountered the same problem. What I did is this aln <- grg(readBamGappedAlignments(bam)) strand(aln) = "*" countOverlaps(exonRanges, aln) Jason
ADD COMMENT
0
Entering edit mode
Hi Song, Vince, Martin, Jason, On 12/13/2010 11:22 AM, Jason wrote: [...] > > I just encountered the same problem. > What I did is this > > aln<- grg(readBamGappedAlignments(bam)) > strand(aln) = "*" > countOverlaps(exonRanges, aln) FWIW I just added a "strand<-" method for GappedAlignments objects to the devel version of the GenomicRanges package (1.3.7). From the (updated) man page for GappedAlignments: > subject GRanges with 1 range and 0 elementMetadata values seqnames ranges strand | <rle> <iranges> <rle> | [1] seq1 [1, 36] + | > galn[8:10] GappedAlignments of length 3 rname strand cigar qwidth start end width ngap [1] seq1 + 35M 35 15 49 35 0 [2] seq1 - 35M 35 18 52 35 0 [3] seq1 + 35M 35 22 56 35 0 > countOverlaps(galn[8:10], subject) [1] 1 0 1 > strand(galn) <- "*" > galn[8:10] GappedAlignments of length 3 rname strand cigar qwidth start end width ngap [1] seq1 * 35M 35 15 49 35 0 [2] seq1 * 35M 35 18 52 35 0 [3] seq1 * 35M 35 22 56 35 0 > countOverlaps(galn[8:10], subject) [1] 1 1 1 Just to clarify: because GappedAlignments objects follow the Samtools convention to provide CIGARs that are *always* relative to the + strand (even for alignments that are on the - strand), modifying the strand of a GappedAlignments object doesn't alter its cigar field (or any other field). Cheers, H. > sessionInfo() R version 2.13.0 Under development (unstable) (2010-11-28 r53696) Platform: x86_64-unknown-linux-gnu (64-bit) locale: [1] LC_CTYPE=en_US.utf8 LC_NUMERIC=C [3] LC_TIME=en_US.utf8 LC_COLLATE=en_US.utf8 [5] LC_MONETARY=C LC_MESSAGES=en_US.utf8 [7] LC_PAPER=en_US.utf8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] Rsamtools_1.3.11 Biostrings_2.19.2 GenomicRanges_1.3.7 [4] IRanges_1.9.17 loaded via a namespace (and not attached): [1] Biobase_2.11.6 tools_2.13.0 > > > Jason > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor -- Hervé Pagès Program in Computational Biology Division of Public Health Sciences Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N, M2-B876 P.O. Box 19024 Seattle, WA 98109-1024 E-mail: hpages at fhcrc.org Phone: (206) 667-5791 Fax: (206) 667-1319
ADD REPLY

Login before adding your answer.

Traffic: 703 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6