Search
Question: Some more rtracklayer questions (re: WIG and .VSTEP files)
0
gravatar for Tim Triche
5.4 years ago by
Tim Triche4.1k
United States
Tim Triche4.1k wrote:
not entirely related, but not wholly unrelated; some things in rtracklayer I've been wondering about: 1) I try and import a WIG file and it takes over 24 hours. This seems like a Bad Thing, what am I doing wrong? 2) How should I import .vstep tracks? 3) Sometimes a BED file throws an error, what is the best way to debug this? import.ucsc(filename, subformat='bed')? and a GenomicRanges question: 1) suppose I have a large pile of segmented copy number calls. what's the best way to get a smoothed "pileup" of them? thanks much! for the tools and for the guidance. On Wed, Nov 23, 2011 at 8:38 AM, Michael Lawrence <lawrence.michael@gene.com> wrote: > On Wed, Nov 23, 2011 at 7:18 AM, Jason Lu <jasonlu68@gmail.com> wrote: > > > I want to thank Michael and Martin for being helpful. > > > > My apology for not showing the full script last email. The purpose > > here is to get a figure showing read coverage in the exon regions of > > my gene of interest (one gene). I tried some additional steps here and > > get an error below. > > > > # > > nm = "NM_000997" > > txs = exonsBy(txdb,by="tx",use.name=TRUE) > > toshow = txs[[nm]] > > ss <- > > readBamGappedAlignments("myalign.bam",param=ScanBamParam(which=tos how)) > > strand(ss) = "*" > > cvg = coverage(ss) # cvg: SimpleRleList > > > > cvg.view <- Views(cvg, as(toshow, "RangesList")) > > Error in .Method(..., FUN = FUN, MoreArgs = MoreArgs, SIMPLIFY = > SIMPLIFY, > > : > > all object lengths must be multiple of longest object length > > > > > The problem here is likely that "toshow" is on a wider universe (set of > sequences) than your coverage vector from the BAM file. You can use > keepSeqlevels(toshow, seqlevels(ss)). You want to make sure the seqlevels > are in the same order; not sure if keepSeqlevels() does this -- perhaps it > should? > > > > Also, I have been unsuccessful in exporting the RleViewsList to a file > > (wig or bedGraph format). Could you give a hint on this as well? > > Thanks again. > > > > > Now that I think about it, Views are a bit of an overkill for this specific > case. You could just do a > > cvg.select <- seqselect(cvg, as(toshow, "RangesList")) > > And then export that: > > export(cvg.select, "coverage.bedGraph") > > Jason > > > > > > On Wed, Nov 23, 2011 at 8:20 AM, Michael Lawrence > > <lawrence.michael@gene.com> wrote: > > > > > > > > > On Tue, Nov 22, 2011 at 9:39 AM, Jason Lu <jasonlu68@gmail.com> wrote: > > >> > > >> Hi list, > > >> > > >> I have a question and would like to get your help. > > >> What I would like to get is to show the read coverage in the exon > > >> regions (only) in the ucsc genome browser. My question is how to > > >> export the Views so I can load into ucsc browser. > > >> Here is what I have: > > >> > > >> # > > >> >cvg = coverage(ranges(ss)) > > > > > > Careful here: you are not distinguishing chromosomes when calculating > > this > > > coverage. It's all going into one vector. Instead, you want to call > > coverage > > > directly on your GRanges 'ss'. > > > > > > cvg <- coverage(ss) > > > > > >> > > >> >cvg.view = Views(cvg,rangesexon.gr)) # exon.gr is a GRanges > > > > > > Since 'cvg' is now an RleList (with one element per chromosome), you > > want a > > > RangesList to parallel it in your RleViewsList. > > > > > > cvg.view <- Views(cvg, asexon.gr, "RangesList")) > > > > > > I think we should try to make that easier and have Views() take a > > GRanges. > > > Will make a note. > > > > > >> > > >> >cvg.view > > >> Views on a 22109556-length Rle subject > > >> > > >> views: > > >> start end width > > >> [1] 22109317 22109688 372 [5 5 7 5 6 6 6 5 5 5 5 5 4 4 5 8 8 8 9 9 > 9 > > 9 > > >> ...] > > >> [2] 22084156 22084276 121 [4 4 4 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 > 4 > > 4 > > >> ...] > > >> [3] 22083039 22083195 157 [2 1 1 0 0 0 1 1 1 1 2 2 2 2 2 2 2 2 2 3 > 3 > > 4 > > >> ...] > > >> [4] 22079485 22079612 128 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 2 2 2 2 > 2 > > 2 > > >> ...] > > >> [5] 22079020 22079144 125 [6 6 6 6 7 7 7 7 7 7 7 7 7 7 6 6 6 6 6 6 > 5 > > 4 > > >> ...] > > >> > > >> I don't know how to modify this view such as adding chromosome info > > >> and export this 'cvg.view' in rtracklayer. > > > > > > > > > You can now export this RleViewsList directly to a file for upload. For > > > upload to the UCSC browser, how about: > > > > > > session <- browserSession() > > > session$coverage <- cvg.view > > > browserView(session, exon.gr[1]) > > > > > > That should show you the first exon. > > > > > >> Thanks in advance. > > >> Jason > > >> > > >> > sessionInfo() > > >> R version 2.14.0 (2011-10-31) > > >> Platform: x86_64-unknown-linux-gnu (64-bit) > > >> > > >> locale: > > >> [1] C > > >> > > >> attached base packages: > > >> [1] stats graphics grDevices utils datasets methods base > > >> > > >> other attached packages: > > >> [1] rtracklayer_1.14.3 RCurl_1.7-0 bitops_1.0-4.1 > > >> [4] Rsamtools_1.6.2 Biostrings_2.22.0 GenomicFeatures_1.6.4 > > >> [7] AnnotationDbi_1.16.4 Biobase_2.14.0 GenomicRanges_1.6.3 > > >> [10] IRanges_1.12.2 BiocInstaller_1.2.1 > > >> > > >> loaded via a namespace (and not attached): > > >> [1] BSgenome_1.22.0 DBI_0.2-5 RSQLite_0.10.0 XML_3.4-3 > > >> [5] biomaRt_2.10.0 tools_2.14.0 zlibbioc_1.0.0 > > >> > > > >> > > >> _______________________________________________ > > >> Bioconductor mailing list > > >> Bioconductor@r-project.org > > >> https://stat.ethz.ch/mailman/listinfo/bioconductor > > >> Search the archives: > > >> http://news.gmane.org/gmane.science.biology.informatics.conductor > > > > > > > > > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor@r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor > -- If people do not believe that mathematics is simple, it is only because they do not realize how complicated life is. John von Neumann<http: www-groups.dcs.st-="" and.ac.uk="" ~history="" biographies="" von_neumann.html=""> [[alternative HTML version deleted]]
ADD COMMENTlink modified 5.4 years ago by Michael Lawrence9.1k • written 5.4 years ago by Tim Triche4.1k
0
gravatar for Michael Lawrence
5.4 years ago by
United States
Michael Lawrence9.1k wrote:
On Wed, Nov 23, 2011 at 11:58 AM, Tim Triche, Jr. <tim.triche@gmail.com>wrote: > not entirely related, but not wholly unrelated; some things in rtracklayer > I've been wondering about: > > 1) I try and import a WIG file and it takes over 24 hours. This seems > like a Bad Thing, what am I doing wrong? > Does it eventually finish? How big is the WIG file? Could you point me one to test? Also, would you please provide your sesionInfo()? > 2) How should I import .vstep tracks? > Haven't heard of these. Are these just variable step WIG? If so, just use the WIG import. > 3) Sometimes a BED file throws an error, what is the best way to debug > this? import.ucsc(filename, subformat='bed')? > > Maybe trace(import.bed, browser, sig="connection"). > and a GenomicRanges question: > > 1) suppose I have a large pile of segmented copy number calls. what's the > best way to get a smoothed "pileup" of them? > > thanks much! for the tools and for the guidance. > > > > > > > On Wed, Nov 23, 2011 at 8:38 AM, Michael Lawrence < > lawrence.michael@gene.com> wrote: > >> On Wed, Nov 23, 2011 at 7:18 AM, Jason Lu <jasonlu68@gmail.com> wrote: >> >> > I want to thank Michael and Martin for being helpful. >> > >> > My apology for not showing the full script last email. The purpose >> > here is to get a figure showing read coverage in the exon regions of >> > my gene of interest (one gene). I tried some additional steps here and >> > get an error below. >> > >> > # >> > nm = "NM_000997" >> > txs = exonsBy(txdb,by="tx",use.name=TRUE) >> > toshow = txs[[nm]] >> > ss <- >> > readBamGappedAlignments("myalign.bam",param=ScanBamParam(which=to show)) >> > strand(ss) = "*" >> > cvg = coverage(ss) # cvg: SimpleRleList >> > >> > cvg.view <- Views(cvg, as(toshow, "RangesList")) >> > Error in .Method(..., FUN = FUN, MoreArgs = MoreArgs, SIMPLIFY = >> SIMPLIFY, >> > : >> > all object lengths must be multiple of longest object length >> > >> > >> The problem here is likely that "toshow" is on a wider universe (set of >> sequences) than your coverage vector from the BAM file. You can use >> keepSeqlevels(toshow, seqlevels(ss)). You want to make sure the seqlevels >> are in the same order; not sure if keepSeqlevels() does this -- perhaps it >> should? >> >> >> > Also, I have been unsuccessful in exporting the RleViewsList to a file >> > (wig or bedGraph format). Could you give a hint on this as well? >> > Thanks again. >> > >> > >> Now that I think about it, Views are a bit of an overkill for this >> specific >> case. You could just do a >> >> cvg.select <- seqselect(cvg, as(toshow, "RangesList")) >> >> And then export that: >> >> export(cvg.select, "coverage.bedGraph") >> >> Jason >> > >> > >> > On Wed, Nov 23, 2011 at 8:20 AM, Michael Lawrence >> > <lawrence.michael@gene.com> wrote: >> > > >> > > >> > > On Tue, Nov 22, 2011 at 9:39 AM, Jason Lu <jasonlu68@gmail.com> >> wrote: >> > >> >> > >> Hi list, >> > >> >> > >> I have a question and would like to get your help. >> > >> What I would like to get is to show the read coverage in the exon >> > >> regions (only) in the ucsc genome browser. My question is how to >> > >> export the Views so I can load into ucsc browser. >> > >> Here is what I have: >> > >> >> > >> # >> > >> >cvg = coverage(ranges(ss)) >> > > >> > > Careful here: you are not distinguishing chromosomes when calculating >> > this >> > > coverage. It's all going into one vector. Instead, you want to call >> > coverage >> > > directly on your GRanges 'ss'. >> > > >> > > cvg <- coverage(ss) >> > > >> > >> >> > >> >cvg.view = Views(cvg,rangesexon.gr)) # exon.gr is a GRanges >> > > >> > > Since 'cvg' is now an RleList (with one element per chromosome), you >> > want a >> > > RangesList to parallel it in your RleViewsList. >> > > >> > > cvg.view <- Views(cvg, asexon.gr, "RangesList")) >> > > >> > > I think we should try to make that easier and have Views() take a >> > GRanges. >> > > Will make a note. >> > > >> > >> >> > >> >cvg.view >> > >> Views on a 22109556-length Rle subject >> > >> >> > >> views: >> > >> start end width >> > >> [1] 22109317 22109688 372 [5 5 7 5 6 6 6 5 5 5 5 5 4 4 5 8 8 8 9 >> 9 9 >> > 9 >> > >> ...] >> > >> [2] 22084156 22084276 121 [4 4 4 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 >> 4 4 >> > 4 >> > >> ...] >> > >> [3] 22083039 22083195 157 [2 1 1 0 0 0 1 1 1 1 2 2 2 2 2 2 2 2 2 >> 3 3 >> > 4 >> > >> ...] >> > >> [4] 22079485 22079612 128 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 2 2 2 >> 2 2 >> > 2 >> > >> ...] >> > >> [5] 22079020 22079144 125 [6 6 6 6 7 7 7 7 7 7 7 7 7 7 6 6 6 6 6 >> 6 5 >> > 4 >> > >> ...] >> > >> >> > >> I don't know how to modify this view such as adding chromosome info >> > >> and export this 'cvg.view' in rtracklayer. >> > > >> > > >> > > You can now export this RleViewsList directly to a file for upload. >> For >> > > upload to the UCSC browser, how about: >> > > >> > > session <- browserSession() >> > > session$coverage <- cvg.view >> > > browserView(session, exon.gr[1]) >> > > >> > > That should show you the first exon. >> > > >> > >> Thanks in advance. >> > >> Jason >> > >> >> > >> > sessionInfo() >> > >> R version 2.14.0 (2011-10-31) >> > >> Platform: x86_64-unknown-linux-gnu (64-bit) >> > >> >> > >> locale: >> > >> [1] C >> > >> >> > >> attached base packages: >> > >> [1] stats graphics grDevices utils datasets methods base >> > >> >> > >> other attached packages: >> > >> [1] rtracklayer_1.14.3 RCurl_1.7-0 bitops_1.0-4.1 >> > >> [4] Rsamtools_1.6.2 Biostrings_2.22.0 >> GenomicFeatures_1.6.4 >> > >> [7] AnnotationDbi_1.16.4 Biobase_2.14.0 GenomicRanges_1.6.3 >> > >> [10] IRanges_1.12.2 BiocInstaller_1.2.1 >> > >> >> > >> loaded via a namespace (and not attached): >> > >> [1] BSgenome_1.22.0 DBI_0.2-5 RSQLite_0.10.0 XML_3.4-3 >> > >> [5] biomaRt_2.10.0 tools_2.14.0 zlibbioc_1.0.0 >> > >> > >> > >> >> > >> _______________________________________________ >> > >> Bioconductor mailing list >> > >> Bioconductor@r-project.org >> > >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> > >> Search the archives: >> > >> http://news.gmane.org/gmane.science.biology.informatics.conductor >> > > >> > > >> > >> >> [[alternative HTML version deleted]] >> >> _______________________________________________ >> Bioconductor mailing list >> Bioconductor@r-project.org >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> Search the archives: >> http://news.gmane.org/gmane.science.biology.informatics.conductor >> > > > > -- > If people do not believe that mathematics is simple, > it is only because they do not realize how complicated life is. John von > Neumann<http: www-groups.dcs.st-="" and.ac.uk="" %7ehistory="" biographies="" von_neumann.html=""> > [[alternative HTML version deleted]]
ADD COMMENTlink written 5.4 years ago by Michael Lawrence9.1k
On Wed, Nov 23, 2011 at 1:48 PM, Michael Lawrence <lawrence.michael@gene.com> wrote: > > > On Wed, Nov 23, 2011 at 11:58 AM, Tim Triche, Jr. <tim.triche@gmail.com>wrote: > >> not entirely related, but not wholly unrelated; some things in >> rtracklayer I've been wondering about: >> >> 1) I try and import a WIG file and it takes over 24 hours. This seems >> like a Bad Thing, what am I doing wrong? >> > > Does it eventually finish? How big is the WIG file? Could you point me one > to test? Also, would you please provide your sesionInfo()? > I killed the import after 36 hours and 51GB of RAM. Here's the file I was working on: ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/samples/GSM621nnn/GS M621665/GSM621665_BI.Mobilized_CD34_Primary_Cells.H3K4me3.UW_RO_01536. wig.gz Numerous other files of interest are up on the Epigenome Roadmap site: http://www.ncbi.nlm.nih.gov/geo/roadmap/epigenomics/ This seems to be where I obtain most of my woes. (My SessionInfo() is appended to the end of the email.) > >> 2) How should I import .vstep tracks? >> > > Haven't heard of these. Are these just variable step WIG? If so, just use > the WIG import. > They are, sort of, but they are supposedly BED format. Here are several: http://dir.nhlbi.nih.gov/papers/lmi/epigenomes/hghscmethylation.aspx > > >> 3) Sometimes a BED file throws an error, what is the best way to debug >> this? import.ucsc(filename, subformat='bed')? >> >> > Maybe trace(import.bed, browser, sig="connection"). > Will try this and post a reply. SessionInfo: > sessionInfo() R version 2.14.0 (2011-10-31) Platform: x86_64-unknown-linux-gnu (64-bit) locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 [7] LC_PAPER=C LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices datasets utils methods base other attached packages: [1] rtracklayer_1.14.2 RCurl_1.7-0 bitops_1.0-4.1 [4] gtools_2.6.2 BiocInstaller_1.2.1 reshape_0.8.4 [7] plyr_1.6 loaded via a namespace (and not attached): [1] Biostrings_2.22.0 BSgenome_1.22.0 GenomicRanges_1.6.2 [4] IRanges_1.12.1 tools_2.14.0 XML_3.2-0 [7] zlibbioc_1.0.0 > > >> and a GenomicRanges question: >> >> 1) suppose I have a large pile of segmented copy number calls. what's >> the best way to get a smoothed "pileup" of them? >> >> thanks much! for the tools and for the guidance. >> >> >> >> >> >> >> On Wed, Nov 23, 2011 at 8:38 AM, Michael Lawrence < >> lawrence.michael@gene.com> wrote: >> >>> On Wed, Nov 23, 2011 at 7:18 AM, Jason Lu <jasonlu68@gmail.com> wrote: >>> >>> > I want to thank Michael and Martin for being helpful. >>> > >>> > My apology for not showing the full script last email. The purpose >>> > here is to get a figure showing read coverage in the exon regions of >>> > my gene of interest (one gene). I tried some additional steps here and >>> > get an error below. >>> > >>> > # >>> > nm = "NM_000997" >>> > txs = exonsBy(txdb,by="tx",use.name=TRUE) >>> > toshow = txs[[nm]] >>> > ss <- >>> > readBamGappedAlignments("myalign.bam",param=ScanBamParam(which=t oshow)) >>> > strand(ss) = "*" >>> > cvg = coverage(ss) # cvg: SimpleRleList >>> > >>> > cvg.view <- Views(cvg, as(toshow, "RangesList")) >>> > Error in .Method(..., FUN = FUN, MoreArgs = MoreArgs, SIMPLIFY = >>> SIMPLIFY, >>> > : >>> > all object lengths must be multiple of longest object length >>> > >>> > >>> The problem here is likely that "toshow" is on a wider universe (set of >>> sequences) than your coverage vector from the BAM file. You can use >>> keepSeqlevels(toshow, seqlevels(ss)). You want to make sure the seqlevels >>> are in the same order; not sure if keepSeqlevels() does this -- perhaps >>> it >>> should? >>> >>> >>> > Also, I have been unsuccessful in exporting the RleViewsList to a file >>> > (wig or bedGraph format). Could you give a hint on this as well? >>> > Thanks again. >>> > >>> > >>> Now that I think about it, Views are a bit of an overkill for this >>> specific >>> case. You could just do a >>> >>> cvg.select <- seqselect(cvg, as(toshow, "RangesList")) >>> >>> And then export that: >>> >>> export(cvg.select, "coverage.bedGraph") >>> >>> Jason >>> > >>> > >>> > On Wed, Nov 23, 2011 at 8:20 AM, Michael Lawrence >>> > <lawrence.michael@gene.com> wrote: >>> > > >>> > > >>> > > On Tue, Nov 22, 2011 at 9:39 AM, Jason Lu <jasonlu68@gmail.com> >>> wrote: >>> > >> >>> > >> Hi list, >>> > >> >>> > >> I have a question and would like to get your help. >>> > >> What I would like to get is to show the read coverage in the exon >>> > >> regions (only) in the ucsc genome browser. My question is how to >>> > >> export the Views so I can load into ucsc browser. >>> > >> Here is what I have: >>> > >> >>> > >> # >>> > >> >cvg = coverage(ranges(ss)) >>> > > >>> > > Careful here: you are not distinguishing chromosomes when calculating >>> > this >>> > > coverage. It's all going into one vector. Instead, you want to call >>> > coverage >>> > > directly on your GRanges 'ss'. >>> > > >>> > > cvg <- coverage(ss) >>> > > >>> > >> >>> > >> >cvg.view = Views(cvg,rangesexon.gr)) # exon.gr is a GRanges >>> > > >>> > > Since 'cvg' is now an RleList (with one element per chromosome), you >>> > want a >>> > > RangesList to parallel it in your RleViewsList. >>> > > >>> > > cvg.view <- Views(cvg, asexon.gr, "RangesList")) >>> > > >>> > > I think we should try to make that easier and have Views() take a >>> > GRanges. >>> > > Will make a note. >>> > > >>> > >> >>> > >> >cvg.view >>> > >> Views on a 22109556-length Rle subject >>> > >> >>> > >> views: >>> > >> start end width >>> > >> [1] 22109317 22109688 372 [5 5 7 5 6 6 6 5 5 5 5 5 4 4 5 8 8 8 9 >>> 9 9 >>> > 9 >>> > >> ...] >>> > >> [2] 22084156 22084276 121 [4 4 4 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 >>> 4 4 >>> > 4 >>> > >> ...] >>> > >> [3] 22083039 22083195 157 [2 1 1 0 0 0 1 1 1 1 2 2 2 2 2 2 2 2 2 >>> 3 3 >>> > 4 >>> > >> ...] >>> > >> [4] 22079485 22079612 128 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 2 2 2 >>> 2 2 >>> > 2 >>> > >> ...] >>> > >> [5] 22079020 22079144 125 [6 6 6 6 7 7 7 7 7 7 7 7 7 7 6 6 6 6 6 >>> 6 5 >>> > 4 >>> > >> ...] >>> > >> >>> > >> I don't know how to modify this view such as adding chromosome info >>> > >> and export this 'cvg.view' in rtracklayer. >>> > > >>> > > >>> > > You can now export this RleViewsList directly to a file for upload. >>> For >>> > > upload to the UCSC browser, how about: >>> > > >>> > > session <- browserSession() >>> > > session$coverage <- cvg.view >>> > > browserView(session, exon.gr[1]) >>> > > >>> > > That should show you the first exon. >>> > > >>> > >> Thanks in advance. >>> > >> Jason >>> > >> >>> > >> > sessionInfo() >>> > >> R version 2.14.0 (2011-10-31) >>> > >> Platform: x86_64-unknown-linux-gnu (64-bit) >>> > >> >>> > >> locale: >>> > >> [1] C >>> > >> >>> > >> attached base packages: >>> > >> [1] stats graphics grDevices utils datasets methods base >>> > >> >>> > >> other attached packages: >>> > >> [1] rtracklayer_1.14.3 RCurl_1.7-0 bitops_1.0-4.1 >>> > >> [4] Rsamtools_1.6.2 Biostrings_2.22.0 >>> GenomicFeatures_1.6.4 >>> > >> [7] AnnotationDbi_1.16.4 Biobase_2.14.0 GenomicRanges_1.6.3 >>> > >> [10] IRanges_1.12.2 BiocInstaller_1.2.1 >>> > >> >>> > >> loaded via a namespace (and not attached): >>> > >> [1] BSgenome_1.22.0 DBI_0.2-5 RSQLite_0.10.0 XML_3.4-3 >>> > >> [5] biomaRt_2.10.0 tools_2.14.0 zlibbioc_1.0.0 >>> > >> > >>> > >> >>> > >> _______________________________________________ >>> > >> Bioconductor mailing list >>> > >> Bioconductor@r-project.org >>> > >> https://stat.ethz.ch/mailman/listinfo/bioconductor >>> > >> Search the archives: >>> > >> http://news.gmane.org/gmane.science.biology.informatics.conductor >>> > > >>> > > >>> > >>> >>> [[alternative HTML version deleted]] >>> >>> _______________________________________________ >>> Bioconductor mailing list >>> Bioconductor@r-project.org >>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>> Search the archives: >>> http://news.gmane.org/gmane.science.biology.informatics.conductor >>> >> >> >> >> -- >> If people do not believe that mathematics is simple, >> it is only because they do not realize how complicated life is. John von >> Neumann<http: www-groups.dcs.st-="" and.ac.uk="" %7ehistory="" biographies="" von_neumann.html=""> >> > > -- If people do not believe that mathematics is simple, it is only because they do not realize how complicated life is. John von Neumann<http: www-groups.dcs.st-="" and.ac.uk="" ~history="" biographies="" von_neumann.html=""> [[alternative HTML version deleted]]
ADD REPLYlink written 5.4 years ago by Tim Triche4.1k
On Wed, Nov 23, 2011 at 3:08 PM, Tim Triche, Jr. <tim.triche@gmail.com>wrote: > On Wed, Nov 23, 2011 at 1:48 PM, Michael Lawrence < > lawrence.michael@gene.com > > wrote: > > > > > > > On Wed, Nov 23, 2011 at 11:58 AM, Tim Triche, Jr. <tim.triche@gmail.com> >wrote: > > > >> not entirely related, but not wholly unrelated; some things in > >> rtracklayer I've been wondering about: > >> > >> 1) I try and import a WIG file and it takes over 24 hours. This seems > >> like a Bad Thing, what am I doing wrong? > >> > > > > Does it eventually finish? How big is the WIG file? Could you point me > one > > to test? Also, would you please provide your sesionInfo()? > > > > I killed the import after 36 hours and 51GB of RAM. Here's the file I was > working on: > > > ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/samples/GSM621nnn/ GSM621665/GSM621665_BI.Mobilized_CD34_Primary_Cells.H3K4me3.UW_RO_0153 6.wig.gz > > Numerous other files of interest are up on the Epigenome Roadmap site: > > http://www.ncbi.nlm.nih.gov/geo/roadmap/epigenomics/ > > This seems to be where I obtain most of my woes. > > (My SessionInfo() is appended to the end of the email.) > > Thanks, I will look into this. > > > > > >> 2) How should I import .vstep tracks? > >> > > > > Haven't heard of these. Are these just variable step WIG? If so, just use > > the WIG import. > > > > They are, sort of, but they are supposedly BED format. Here are several: > > http://dir.nhlbi.nih.gov/papers/lmi/epigenomes/hghscmethylation.aspx > > > I looked at them. They are WIG. > > > > > > >> 3) Sometimes a BED file throws an error, what is the best way to debug > >> this? import.ucsc(filename, subformat='bed')? > >> > >> > > Maybe trace(import.bed, browser, sig="connection"). > > > > Will try this and post a reply. > > SessionInfo: > > > > sessionInfo() > R version 2.14.0 (2011-10-31) > Platform: x86_64-unknown-linux-gnu (64-bit) > > locale: > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C > [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 > [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 > [7] LC_PAPER=C LC_NAME=C > [9] LC_ADDRESS=C LC_TELEPHONE=C > [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C > > attached base packages: > [1] stats graphics grDevices datasets utils methods base > > other attached packages: > [1] rtracklayer_1.14.2 RCurl_1.7-0 bitops_1.0-4.1 > [4] gtools_2.6.2 BiocInstaller_1.2.1 reshape_0.8.4 > [7] plyr_1.6 > > loaded via a namespace (and not attached): > [1] Biostrings_2.22.0 BSgenome_1.22.0 GenomicRanges_1.6.2 > [4] IRanges_1.12.1 tools_2.14.0 XML_3.2-0 > [7] zlibbioc_1.0.0 > > > > > > > > > > > >> and a GenomicRanges question: > >> > >> 1) suppose I have a large pile of segmented copy number calls. what's > >> the best way to get a smoothed "pileup" of them? > >> > >> thanks much! for the tools and for the guidance. > >> > >> > >> > >> > >> > >> > >> On Wed, Nov 23, 2011 at 8:38 AM, Michael Lawrence < > >> lawrence.michael@gene.com> wrote: > >> > >>> On Wed, Nov 23, 2011 at 7:18 AM, Jason Lu <jasonlu68@gmail.com> wrote: > >>> > >>> > I want to thank Michael and Martin for being helpful. > >>> > > >>> > My apology for not showing the full script last email. The purpose > >>> > here is to get a figure showing read coverage in the exon regions of > >>> > my gene of interest (one gene). I tried some additional steps here > and > >>> > get an error below. > >>> > > >>> > # > >>> > nm = "NM_000997" > >>> > txs = exonsBy(txdb,by="tx",use.name=TRUE) > >>> > toshow = txs[[nm]] > >>> > ss <- > >>> > > readBamGappedAlignments("myalign.bam",param=ScanBamParam(which=tosho w)) > >>> > strand(ss) = "*" > >>> > cvg = coverage(ss) # cvg: SimpleRleList > >>> > > >>> > cvg.view <- Views(cvg, as(toshow, "RangesList")) > >>> > Error in .Method(..., FUN = FUN, MoreArgs = MoreArgs, SIMPLIFY = > >>> SIMPLIFY, > >>> > : > >>> > all object lengths must be multiple of longest object length > >>> > > >>> > > >>> The problem here is likely that "toshow" is on a wider universe (set of > >>> sequences) than your coverage vector from the BAM file. You can use > >>> keepSeqlevels(toshow, seqlevels(ss)). You want to make sure the > seqlevels > >>> are in the same order; not sure if keepSeqlevels() does this -- perhaps > >>> it > >>> should? > >>> > >>> > >>> > Also, I have been unsuccessful in exporting the RleViewsList to a > file > >>> > (wig or bedGraph format). Could you give a hint on this as well? > >>> > Thanks again. > >>> > > >>> > > >>> Now that I think about it, Views are a bit of an overkill for this > >>> specific > >>> case. You could just do a > >>> > >>> cvg.select <- seqselect(cvg, as(toshow, "RangesList")) > >>> > >>> And then export that: > >>> > >>> export(cvg.select, "coverage.bedGraph") > >>> > >>> Jason > >>> > > >>> > > >>> > On Wed, Nov 23, 2011 at 8:20 AM, Michael Lawrence > >>> > <lawrence.michael@gene.com> wrote: > >>> > > > >>> > > > >>> > > On Tue, Nov 22, 2011 at 9:39 AM, Jason Lu <jasonlu68@gmail.com> > >>> wrote: > >>> > >> > >>> > >> Hi list, > >>> > >> > >>> > >> I have a question and would like to get your help. > >>> > >> What I would like to get is to show the read coverage in the exon > >>> > >> regions (only) in the ucsc genome browser. My question is how to > >>> > >> export the Views so I can load into ucsc browser. > >>> > >> Here is what I have: > >>> > >> > >>> > >> # > >>> > >> >cvg = coverage(ranges(ss)) > >>> > > > >>> > > Careful here: you are not distinguishing chromosomes when > calculating > >>> > this > >>> > > coverage. It's all going into one vector. Instead, you want to call > >>> > coverage > >>> > > directly on your GRanges 'ss'. > >>> > > > >>> > > cvg <- coverage(ss) > >>> > > > >>> > >> > >>> > >> >cvg.view = Views(cvg,rangesexon.gr)) # exon.gr is a GRanges > >>> > > > >>> > > Since 'cvg' is now an RleList (with one element per chromosome), > you > >>> > want a > >>> > > RangesList to parallel it in your RleViewsList. > >>> > > > >>> > > cvg.view <- Views(cvg, asexon.gr, "RangesList")) > >>> > > > >>> > > I think we should try to make that easier and have Views() take a > >>> > GRanges. > >>> > > Will make a note. > >>> > > > >>> > >> > >>> > >> >cvg.view > >>> > >> Views on a 22109556-length Rle subject > >>> > >> > >>> > >> views: > >>> > >> start end width > >>> > >> [1] 22109317 22109688 372 [5 5 7 5 6 6 6 5 5 5 5 5 4 4 5 8 8 8 > 9 > >>> 9 9 > >>> > 9 > >>> > >> ...] > >>> > >> [2] 22084156 22084276 121 [4 4 4 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 > 4 > >>> 4 4 > >>> > 4 > >>> > >> ...] > >>> > >> [3] 22083039 22083195 157 [2 1 1 0 0 0 1 1 1 1 2 2 2 2 2 2 2 2 > 2 > >>> 3 3 > >>> > 4 > >>> > >> ...] > >>> > >> [4] 22079485 22079612 128 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 2 2 > 2 > >>> 2 2 > >>> > 2 > >>> > >> ...] > >>> > >> [5] 22079020 22079144 125 [6 6 6 6 7 7 7 7 7 7 7 7 7 7 6 6 6 6 > 6 > >>> 6 5 > >>> > 4 > >>> > >> ...] > >>> > >> > >>> > >> I don't know how to modify this view such as adding chromosome > info > >>> > >> and export this 'cvg.view' in rtracklayer. > >>> > > > >>> > > > >>> > > You can now export this RleViewsList directly to a file for upload. > >>> For > >>> > > upload to the UCSC browser, how about: > >>> > > > >>> > > session <- browserSession() > >>> > > session$coverage <- cvg.view > >>> > > browserView(session, exon.gr[1]) > >>> > > > >>> > > That should show you the first exon. > >>> > > > >>> > >> Thanks in advance. > >>> > >> Jason > >>> > >> > >>> > >> > sessionInfo() > >>> > >> R version 2.14.0 (2011-10-31) > >>> > >> Platform: x86_64-unknown-linux-gnu (64-bit) > >>> > >> > >>> > >> locale: > >>> > >> [1] C > >>> > >> > >>> > >> attached base packages: > >>> > >> [1] stats graphics grDevices utils datasets methods > base > >>> > >> > >>> > >> other attached packages: > >>> > >> [1] rtracklayer_1.14.3 RCurl_1.7-0 bitops_1.0-4.1 > >>> > >> [4] Rsamtools_1.6.2 Biostrings_2.22.0 > >>> GenomicFeatures_1.6.4 > >>> > >> [7] AnnotationDbi_1.16.4 Biobase_2.14.0 > GenomicRanges_1.6.3 > >>> > >> [10] IRanges_1.12.2 BiocInstaller_1.2.1 > >>> > >> > >>> > >> loaded via a namespace (and not attached): > >>> > >> [1] BSgenome_1.22.0 DBI_0.2-5 RSQLite_0.10.0 XML_3.4-3 > >>> > >> [5] biomaRt_2.10.0 tools_2.14.0 zlibbioc_1.0.0 > >>> > >> > > >>> > >> > >>> > >> _______________________________________________ > >>> > >> Bioconductor mailing list > >>> > >> Bioconductor@r-project.org > >>> > >> https://stat.ethz.ch/mailman/listinfo/bioconductor > >>> > >> Search the archives: > >>> > >> http://news.gmane.org/gmane.science.biology.informatics.conductor > >>> > > > >>> > > > >>> > > >>> > >>> [[alternative HTML version deleted]] > >>> > >>> _______________________________________________ > >>> Bioconductor mailing list > >>> Bioconductor@r-project.org > >>> https://stat.ethz.ch/mailman/listinfo/bioconductor > >>> Search the archives: > >>> http://news.gmane.org/gmane.science.biology.informatics.conductor > >>> > >> > >> > >> > >> -- > >> If people do not believe that mathematics is simple, > >> it is only because they do not realize how complicated life is. John von > >> Neumann< > http://www-groups.dcs.st- and.ac.uk/%7Ehistory/Biographies/Von_Neumann.html > > > >> > > > > > > > -- > If people do not believe that mathematics is simple, > it is only because they do not realize how complicated life is. John von > Neumann< > http://www-groups.dcs.st- and.ac.uk/~history/Biographies/Von_Neumann.html> > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor@r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor > [[alternative HTML version deleted]]
ADD REPLYlink written 5.4 years ago by Michael Lawrence9.1k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 296 users visited in the last hour