Search
Question: Some more rtracklayer questions (re: WIG and .VSTEP files)
0
7.0 years ago by
Tim Triche4.2k
United States
Tim Triche4.2k wrote:
not entirely related, but not wholly unrelated; some things in rtracklayer I've been wondering about: 1) I try and import a WIG file and it takes over 24 hours. This seems like a Bad Thing, what am I doing wrong? 2) How should I import .vstep tracks? 3) Sometimes a BED file throws an error, what is the best way to debug this? import.ucsc(filename, subformat='bed')? and a GenomicRanges question: 1) suppose I have a large pile of segmented copy number calls. what's the best way to get a smoothed "pileup" of them? thanks much! for the tools and for the guidance. On Wed, Nov 23, 2011 at 8:38 AM, Michael Lawrence <lawrence.michael@gene.com> wrote: > On Wed, Nov 23, 2011 at 7:18 AM, Jason Lu <jasonlu68@gmail.com> wrote: > > > I want to thank Michael and Martin for being helpful. > > > > My apology for not showing the full script last email. The purpose > > here is to get a figure showing read coverage in the exon regions of > > my gene of interest (one gene). I tried some additional steps here and > > get an error below. > > > > # > > nm = "NM_000997" > > txs = exonsBy(txdb,by="tx",use.name=TRUE) > > toshow = txs[[nm]] > > ss <- > > readBamGappedAlignments("myalign.bam",param=ScanBamParam(which=tos how)) > > strand(ss) = "*" > > cvg = coverage(ss) # cvg: SimpleRleList > > > > cvg.view <- Views(cvg, as(toshow, "RangesList")) > > Error in .Method(..., FUN = FUN, MoreArgs = MoreArgs, SIMPLIFY = > SIMPLIFY, > > : > > all object lengths must be multiple of longest object length > > > > > The problem here is likely that "toshow" is on a wider universe (set of > sequences) than your coverage vector from the BAM file. You can use > keepSeqlevels(toshow, seqlevels(ss)). You want to make sure the seqlevels > are in the same order; not sure if keepSeqlevels() does this -- perhaps it > should? > > > > Also, I have been unsuccessful in exporting the RleViewsList to a file > > (wig or bedGraph format). Could you give a hint on this as well? > > Thanks again. > > > > > Now that I think about it, Views are a bit of an overkill for this specific > case. You could just do a > > cvg.select <- seqselect(cvg, as(toshow, "RangesList")) > > And then export that: > > export(cvg.select, "coverage.bedGraph") > > Jason > > > > > > On Wed, Nov 23, 2011 at 8:20 AM, Michael Lawrence > > <lawrence.michael@gene.com> wrote: > > > > > > > > > On Tue, Nov 22, 2011 at 9:39 AM, Jason Lu <jasonlu68@gmail.com> wrote: > > >> > > >> Hi list, > > >> > > >> I have a question and would like to get your help. > > >> What I would like to get is to show the read coverage in the exon > > >> regions (only) in the ucsc genome browser. My question is how to > > >> export the Views so I can load into ucsc browser. > > >> Here is what I have: > > >> > > >> # > > >> >cvg = coverage(ranges(ss)) > > > > > > Careful here: you are not distinguishing chromosomes when calculating > > this > > > coverage. It's all going into one vector. Instead, you want to call > > coverage > > > directly on your GRanges 'ss'. > > > > > > cvg <- coverage(ss) > > > > > >> > > >> >cvg.view = Views(cvg,rangesexon.gr)) # exon.gr is a GRanges > > > > > > Since 'cvg' is now an RleList (with one element per chromosome), you > > want a > > > RangesList to parallel it in your RleViewsList. > > > > > > cvg.view <- Views(cvg, asexon.gr, "RangesList")) > > > > > > I think we should try to make that easier and have Views() take a > > GRanges. > > > Will make a note. > > > > > >> > > >> >cvg.view > > >> Views on a 22109556-length Rle subject > > >> > > >> views: > > >> start end width > > >> [1] 22109317 22109688 372 [5 5 7 5 6 6 6 5 5 5 5 5 4 4 5 8 8 8 9 9 > 9 > > 9 > > >> ...] > > >> [2] 22084156 22084276 121 [4 4 4 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 > 4 > > 4 > > >> ...] > > >> [3] 22083039 22083195 157 [2 1 1 0 0 0 1 1 1 1 2 2 2 2 2 2 2 2 2 3 > 3 > > 4 > > >> ...] > > >> [4] 22079485 22079612 128 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 2 2 2 2 > 2 > > 2 > > >> ...] > > >> [5] 22079020 22079144 125 [6 6 6 6 7 7 7 7 7 7 7 7 7 7 6 6 6 6 6 6 > 5 > > 4 > > >> ...] > > >> > > >> I don't know how to modify this view such as adding chromosome info > > >> and export this 'cvg.view' in rtracklayer. > > > > > > > > > You can now export this RleViewsList directly to a file for upload. For > > > upload to the UCSC browser, how about: > > > > > > session <- browserSession() > > > sessioncoverage <- cvg.view > > > browserView(session, exon.gr[1]) > > > > > > That should show you the first exon. > > > > > >> Thanks in advance. > > >> Jason > > >> > > >> > sessionInfo() > > >> R version 2.14.0 (2011-10-31) > > >> Platform: x86_64-unknown-linux-gnu (64-bit) > > >> > > >> locale: > > >> [1] C > > >> > > >> attached base packages: > > >> [1] stats graphics grDevices utils datasets methods base > > >> > > >> other attached packages: > > >> [1] rtracklayer_1.14.3 RCurl_1.7-0 bitops_1.0-4.1 > > >> [4] Rsamtools_1.6.2 Biostrings_2.22.0 GenomicFeatures_1.6.4 > > >> [7] AnnotationDbi_1.16.4 Biobase_2.14.0 GenomicRanges_1.6.3 > > >> [10] IRanges_1.12.2 BiocInstaller_1.2.1 > > >> > > >> loaded via a namespace (and not attached): > > >> [1] BSgenome_1.22.0 DBI_0.2-5 RSQLite_0.10.0 XML_3.4-3 > > >> [5] biomaRt_2.10.0 tools_2.14.0 zlibbioc_1.0.0 > > >> > > > >> > > >> _______________________________________________ > > >> Bioconductor mailing list > > >> Bioconductor@r-project.org > > >> https://stat.ethz.ch/mailman/listinfo/bioconductor > > >> Search the archives: > > >> http://news.gmane.org/gmane.science.biology.informatics.conductor > > > > > > > > > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor@r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor > -- If people do not believe that mathematics is simple, it is only because they do not realize how complicated life is. John von Neumann<http: www-groups.dcs.st-="" and.ac.uk="" ~history="" biographies="" von_neumann.html=""> [[alternative HTML version deleted]] ADD COMMENTlink modified 7.0 years ago by Michael Lawrence10k • written 7.0 years ago by Tim Triche4.2k 0 7.0 years ago by United States Michael Lawrence10k wrote: On Wed, Nov 23, 2011 at 11:58 AM, Tim Triche, Jr. <tim.triche@gmail.com>wrote: > not entirely related, but not wholly unrelated; some things in rtracklayer > I've been wondering about: > > 1) I try and import a WIG file and it takes over 24 hours. This seems > like a Bad Thing, what am I doing wrong? > Does it eventually finish? How big is the WIG file? Could you point me one to test? Also, would you please provide your sesionInfo()? > 2) How should I import .vstep tracks? > Haven't heard of these. Are these just variable step WIG? If so, just use the WIG import. > 3) Sometimes a BED file throws an error, what is the best way to debug > this? import.ucsc(filename, subformat='bed')? > > Maybe trace(import.bed, browser, sig="connection"). > and a GenomicRanges question: > > 1) suppose I have a large pile of segmented copy number calls. what's the > best way to get a smoothed "pileup" of them? > > thanks much! for the tools and for the guidance. > > > > > > > On Wed, Nov 23, 2011 at 8:38 AM, Michael Lawrence < > lawrence.michael@gene.com> wrote: > >> On Wed, Nov 23, 2011 at 7:18 AM, Jason Lu <jasonlu68@gmail.com> wrote: >> >> > I want to thank Michael and Martin for being helpful. >> > >> > My apology for not showing the full script last email. The purpose >> > here is to get a figure showing read coverage in the exon regions of >> > my gene of interest (one gene). I tried some additional steps here and >> > get an error below. >> > >> > # >> > nm = "NM_000997" >> > txs = exonsBy(txdb,by="tx",use.name=TRUE) >> > toshow = txs[[nm]] >> > ss <- >> > readBamGappedAlignments("myalign.bam",param=ScanBamParam(which=to show)) >> > strand(ss) = "*" >> > cvg = coverage(ss) # cvg: SimpleRleList >> > >> > cvg.view <- Views(cvg, as(toshow, "RangesList")) >> > Error in .Method(..., FUN = FUN, MoreArgs = MoreArgs, SIMPLIFY = >> SIMPLIFY, >> > : >> > all object lengths must be multiple of longest object length >> > >> > >> The problem here is likely that "toshow" is on a wider universe (set of >> sequences) than your coverage vector from the BAM file. You can use >> keepSeqlevels(toshow, seqlevels(ss)). You want to make sure the seqlevels >> are in the same order; not sure if keepSeqlevels() does this -- perhaps it >> should? >> >> >> > Also, I have been unsuccessful in exporting the RleViewsList to a file >> > (wig or bedGraph format). Could you give a hint on this as well? >> > Thanks again. >> > >> > >> Now that I think about it, Views are a bit of an overkill for this >> specific >> case. You could just do a >> >> cvg.select <- seqselect(cvg, as(toshow, "RangesList")) >> >> And then export that: >> >> export(cvg.select, "coverage.bedGraph") >> >> Jason >> > >> > >> > On Wed, Nov 23, 2011 at 8:20 AM, Michael Lawrence >> > <lawrence.michael@gene.com> wrote: >> > > >> > > >> > > On Tue, Nov 22, 2011 at 9:39 AM, Jason Lu <jasonlu68@gmail.com> >> wrote: >> > >> >> > >> Hi list, >> > >> >> > >> I have a question and would like to get your help. >> > >> What I would like to get is to show the read coverage in the exon >> > >> regions (only) in the ucsc genome browser. My question is how to >> > >> export the Views so I can load into ucsc browser. >> > >> Here is what I have: >> > >> >> > >> # >> > >> >cvg = coverage(ranges(ss)) >> > > >> > > Careful here: you are not distinguishing chromosomes when calculating >> > this >> > > coverage. It's all going into one vector. Instead, you want to call >> > coverage >> > > directly on your GRanges 'ss'. >> > > >> > > cvg <- coverage(ss) >> > > >> > >> >> > >> >cvg.view = Views(cvg,rangesexon.gr)) # exon.gr is a GRanges >> > > >> > > Since 'cvg' is now an RleList (with one element per chromosome), you >> > want a >> > > RangesList to parallel it in your RleViewsList. >> > > >> > > cvg.view <- Views(cvg, asexon.gr, "RangesList")) >> > > >> > > I think we should try to make that easier and have Views() take a >> > GRanges. >> > > Will make a note. >> > > >> > >> >> > >> >cvg.view >> > >> Views on a 22109556-length Rle subject >> > >> >> > >> views: >> > >> start end width >> > >> [1] 22109317 22109688 372 [5 5 7 5 6 6 6 5 5 5 5 5 4 4 5 8 8 8 9 >> 9 9 >> > 9 >> > >> ...] >> > >> [2] 22084156 22084276 121 [4 4 4 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 >> 4 4 >> > 4 >> > >> ...] >> > >> [3] 22083039 22083195 157 [2 1 1 0 0 0 1 1 1 1 2 2 2 2 2 2 2 2 2 >> 3 3 >> > 4 >> > >> ...] >> > >> [4] 22079485 22079612 128 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 2 2 2 >> 2 2 >> > 2 >> > >> ...] >> > >> [5] 22079020 22079144 125 [6 6 6 6 7 7 7 7 7 7 7 7 7 7 6 6 6 6 6 >> 6 5 >> > 4 >> > >> ...] >> > >> >> > >> I don't know how to modify this view such as adding chromosome info >> > >> and export this 'cvg.view' in rtracklayer. >> > > >> > > >> > > You can now export this RleViewsList directly to a file for upload. >> For >> > > upload to the UCSC browser, how about: >> > > >> > > session <- browserSession() >> > > sessioncoverage <- cvg.view >> > > browserView(session, exon.gr[1]) >> > > >> > > That should show you the first exon. >> > > >> > >> Thanks in advance. >> > >> Jason >> > >> >> > >> > sessionInfo() >> > >> R version 2.14.0 (2011-10-31) >> > >> Platform: x86_64-unknown-linux-gnu (64-bit) >> > >> >> > >> locale: >> > >> [1] C >> > >> >> > >> attached base packages: >> > >> [1] stats graphics grDevices utils datasets methods base >> > >> >> > >> other attached packages: >> > >> [1] rtracklayer_1.14.3 RCurl_1.7-0 bitops_1.0-4.1 >> > >> [4] Rsamtools_1.6.2 Biostrings_2.22.0 >> GenomicFeatures_1.6.4 >> > >> [7] AnnotationDbi_1.16.4 Biobase_2.14.0 GenomicRanges_1.6.3 >> > >> [10] IRanges_1.12.2 BiocInstaller_1.2.1 >> > >> >> > >> loaded via a namespace (and not attached): >> > >> [1] BSgenome_1.22.0 DBI_0.2-5 RSQLite_0.10.0 XML_3.4-3 >> > >> [5] biomaRt_2.10.0 tools_2.14.0 zlibbioc_1.0.0 >> > >> > >> > >> >> > >> _______________________________________________ >> > >> Bioconductor mailing list >> > >> Bioconductor@r-project.org >> > >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> > >> Search the archives: >> > >> http://news.gmane.org/gmane.science.biology.informatics.conductor >> > > >> > > >> > >> >> [[alternative HTML version deleted]] >> >> _______________________________________________ >> Bioconductor mailing list >> Bioconductor@r-project.org >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> Search the archives: >> http://news.gmane.org/gmane.science.biology.informatics.conductor >> > > > > -- > If people do not believe that mathematics is simple, > it is only because they do not realize how complicated life is. John von > Neumann<http: www-groups.dcs.st-="" and.ac.uk="" %7ehistory="" biographies="" von_neumann.html=""> > [[alternative HTML version deleted]]
On Wed, Nov 23, 2011 at 1:48 PM, Michael Lawrence <lawrence.michael@gene.com> wrote: > > > On Wed, Nov 23, 2011 at 11:58 AM, Tim Triche, Jr. <tim.triche@gmail.com>wrote: > >> not entirely related, but not wholly unrelated; some things in >> rtracklayer I've been wondering about: >> >> 1) I try and import a WIG file and it takes over 24 hours. This seems >> like a Bad Thing, what am I doing wrong? >> > > Does it eventually finish? How big is the WIG file? Could you point me one > to test? Also, would you please provide your sesionInfo()? > I killed the import after 36 hours and 51GB of RAM. Here's the file I was working on: ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/samples/GSM621nnn/GS M621665/GSM621665_BI.Mobilized_CD34_Primary_Cells.H3K4me3.UW_RO_01536. wig.gz Numerous other files of interest are up on the Epigenome Roadmap site: http://www.ncbi.nlm.nih.gov/geo/roadmap/epigenomics/ This seems to be where I obtain most of my woes. (My SessionInfo() is appended to the end of the email.) > >> 2) How should I import .vstep tracks? >> > > Haven't heard of these. Are these just variable step WIG? If so, just use > the WIG import. > They are, sort of, but they are supposedly BED format. Here are several: http://dir.nhlbi.nih.gov/papers/lmi/epigenomes/hghscmethylation.aspx > > >> 3) Sometimes a BED file throws an error, what is the best way to debug >> this? import.ucsc(filename, subformat='bed')? >> >> > Maybe trace(import.bed, browser, sig="connection"). > Will try this and post a reply. SessionInfo: > sessionInfo() R version 2.14.0 (2011-10-31) Platform: x86_64-unknown-linux-gnu (64-bit) locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 [7] LC_PAPER=C LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices datasets utils methods base other attached packages: [1] rtracklayer_1.14.2 RCurl_1.7-0 bitops_1.0-4.1 [4] gtools_2.6.2 BiocInstaller_1.2.1 reshape_0.8.4 [7] plyr_1.6 loaded via a namespace (and not attached): [1] Biostrings_2.22.0 BSgenome_1.22.0 GenomicRanges_1.6.2 [4] IRanges_1.12.1 tools_2.14.0 XML_3.2-0 [7] zlibbioc_1.0.0 > > >> and a GenomicRanges question: >> >> 1) suppose I have a large pile of segmented copy number calls. what's >> the best way to get a smoothed "pileup" of them? >> >> thanks much! for the tools and for the guidance. >> >> >> >> >> >> >> On Wed, Nov 23, 2011 at 8:38 AM, Michael Lawrence < >> lawrence.michael@gene.com> wrote: >> >>> On Wed, Nov 23, 2011 at 7:18 AM, Jason Lu <jasonlu68@gmail.com> wrote: >>> >>> > I want to thank Michael and Martin for being helpful. >>> > >>> > My apology for not showing the full script last email. The purpose >>> > here is to get a figure showing read coverage in the exon regions of >>> > my gene of interest (one gene). I tried some additional steps here and >>> > get an error below. >>> > >>> > # >>> > nm = "NM_000997" >>> > txs = exonsBy(txdb,by="tx",use.name=TRUE) >>> > toshow = txs[[nm]] >>> > ss <- >>> > readBamGappedAlignments("myalign.bam",param=ScanBamParam(which=t oshow)) >>> > strand(ss) = "*" >>> > cvg = coverage(ss) # cvg: SimpleRleList >>> > >>> > cvg.view <- Views(cvg, as(toshow, "RangesList")) >>> > Error in .Method(..., FUN = FUN, MoreArgs = MoreArgs, SIMPLIFY = >>> SIMPLIFY, >>> > : >>> > all object lengths must be multiple of longest object length >>> > >>> > >>> The problem here is likely that "toshow" is on a wider universe (set of >>> sequences) than your coverage vector from the BAM file. You can use >>> keepSeqlevels(toshow, seqlevels(ss)). You want to make sure the seqlevels >>> are in the same order; not sure if keepSeqlevels() does this -- perhaps >>> it >>> should? >>> >>> >>> > Also, I have been unsuccessful in exporting the RleViewsList to a file >>> > (wig or bedGraph format). Could you give a hint on this as well? >>> > Thanks again. >>> > >>> > >>> Now that I think about it, Views are a bit of an overkill for this >>> specific >>> case. You could just do a >>> >>> cvg.select <- seqselect(cvg, as(toshow, "RangesList")) >>> >>> And then export that: >>> >>> export(cvg.select, "coverage.bedGraph") >>> >>> Jason >>> > >>> > >>> > On Wed, Nov 23, 2011 at 8:20 AM, Michael Lawrence >>> > <lawrence.michael@gene.com> wrote: >>> > > >>> > > >>> > > On Tue, Nov 22, 2011 at 9:39 AM, Jason Lu <jasonlu68@gmail.com> >>> wrote: >>> > >> >>> > >> Hi list, >>> > >> >>> > >> I have a question and would like to get your help. >>> > >> What I would like to get is to show the read coverage in the exon >>> > >> regions (only) in the ucsc genome browser. My question is how to >>> > >> export the Views so I can load into ucsc browser. >>> > >> Here is what I have: >>> > >> >>> > >> # >>> > >> >cvg = coverage(ranges(ss)) >>> > > >>> > > Careful here: you are not distinguishing chromosomes when calculating >>> > this >>> > > coverage. It's all going into one vector. Instead, you want to call >>> > coverage >>> > > directly on your GRanges 'ss'. >>> > > >>> > > cvg <- coverage(ss) >>> > > >>> > >> >>> > >> >cvg.view = Views(cvg,rangesexon.gr)) # exon.gr is a GRanges >>> > > >>> > > Since 'cvg' is now an RleList (with one element per chromosome), you >>> > want a >>> > > RangesList to parallel it in your RleViewsList. >>> > > >>> > > cvg.view <- Views(cvg, asexon.gr, "RangesList")) >>> > > >>> > > I think we should try to make that easier and have Views() take a >>> > GRanges. >>> > > Will make a note. >>> > > >>> > >> >>> > >> >cvg.view >>> > >> Views on a 22109556-length Rle subject >>> > >> >>> > >> views: >>> > >> start end width >>> > >> [1] 22109317 22109688 372 [5 5 7 5 6 6 6 5 5 5 5 5 4 4 5 8 8 8 9 >>> 9 9 >>> > 9 >>> > >> ...] >>> > >> [2] 22084156 22084276 121 [4 4 4 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 >>> 4 4 >>> > 4 >>> > >> ...] >>> > >> [3] 22083039 22083195 157 [2 1 1 0 0 0 1 1 1 1 2 2 2 2 2 2 2 2 2 >>> 3 3 >>> > 4 >>> > >> ...] >>> > >> [4] 22079485 22079612 128 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 2 2 2 >>> 2 2 >>> > 2 >>> > >> ...] >>> > >> [5] 22079020 22079144 125 [6 6 6 6 7 7 7 7 7 7 7 7 7 7 6 6 6 6 6 >>> 6 5 >>> > 4 >>> > >> ...] >>> > >> >>> > >> I don't know how to modify this view such as adding chromosome info >>> > >> and export this 'cvg.view' in rtracklayer. >>> > > >>> > > >>> > > You can now export this RleViewsList directly to a file for upload. >>> For >>> > > upload to the UCSC browser, how about: >>> > > >>> > > session <- browserSession() >>> > > sessioncoverage <- cvg.view >>> > > browserView(session, exon.gr[1]) >>> > > >>> > > That should show you the first exon. >>> > > >>> > >> Thanks in advance. >>> > >> Jason >>> > >> >>> > >> > sessionInfo() >>> > >> R version 2.14.0 (2011-10-31) >>> > >> Platform: x86_64-unknown-linux-gnu (64-bit) >>> > >> >>> > >> locale: >>> > >> [1] C >>> > >> >>> > >> attached base packages: >>> > >> [1] stats graphics grDevices utils datasets methods base >>> > >> >>> > >> other attached packages: >>> > >> [1] rtracklayer_1.14.3 RCurl_1.7-0 bitops_1.0-4.1 >>> > >> [4] Rsamtools_1.6.2 Biostrings_2.22.0 >>> GenomicFeatures_1.6.4 >>> > >> [7] AnnotationDbi_1.16.4 Biobase_2.14.0 GenomicRanges_1.6.3 >>> > >> [10] IRanges_1.12.2 BiocInstaller_1.2.1 >>> > >> >>> > >> loaded via a namespace (and not attached): >>> > >> [1] BSgenome_1.22.0 DBI_0.2-5 RSQLite_0.10.0 XML_3.4-3 >>> > >> [5] biomaRt_2.10.0 tools_2.14.0 zlibbioc_1.0.0 >>> > >> > >>> > >> >>> > >> _______________________________________________ >>> > >> Bioconductor mailing list >>> > >> Bioconductor@r-project.org >>> > >> https://stat.ethz.ch/mailman/listinfo/bioconductor >>> > >> Search the archives: >>> > >> http://news.gmane.org/gmane.science.biology.informatics.conductor >>> > > >>> > > >>> > >>> >>> [[alternative HTML version deleted]] >>> >>> _______________________________________________ >>> Bioconductor mailing list >>> Bioconductor@r-project.org >>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>> Search the archives: >>> http://news.gmane.org/gmane.science.biology.informatics.conductor >>> >> >> >> >> -- >> If people do not believe that mathematics is simple, >> it is only because they do not realize how complicated life is. John von >> Neumann<http: www-groups.dcs.st-="" and.ac.uk="" %7ehistory="" biographies="" von_neumann.html=""> >> > > -- If people do not believe that mathematics is simple, it is only because they do not realize how complicated life is. John von Neumann<http: www-groups.dcs.st-="" and.ac.uk="" ~history="" biographies="" von_neumann.html=""> [[alternative HTML version deleted]] ADD REPLYlink written 7.0 years ago by Tim Triche4.2k On Wed, Nov 23, 2011 at 3:08 PM, Tim Triche, Jr. <tim.triche@gmail.com>wrote: > On Wed, Nov 23, 2011 at 1:48 PM, Michael Lawrence < > lawrence.michael@gene.com > > wrote: > > > > > > > On Wed, Nov 23, 2011 at 11:58 AM, Tim Triche, Jr. <tim.triche@gmail.com> >wrote: > > > >> not entirely related, but not wholly unrelated; some things in > >> rtracklayer I've been wondering about: > >> > >> 1) I try and import a WIG file and it takes over 24 hours. This seems > >> like a Bad Thing, what am I doing wrong? > >> > > > > Does it eventually finish? How big is the WIG file? Could you point me > one > > to test? Also, would you please provide your sesionInfo()? > > > > I killed the import after 36 hours and 51GB of RAM. Here's the file I was > working on: > > > ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/samples/GSM621nnn/ GSM621665/GSM621665_BI.Mobilized_CD34_Primary_Cells.H3K4me3.UW_RO_0153 6.wig.gz > > Numerous other files of interest are up on the Epigenome Roadmap site: > > http://www.ncbi.nlm.nih.gov/geo/roadmap/epigenomics/ > > This seems to be where I obtain most of my woes. > > (My SessionInfo() is appended to the end of the email.) > > Thanks, I will look into this. > > > > > >> 2) How should I import .vstep tracks? > >> > > > > Haven't heard of these. Are these just variable step WIG? If so, just use > > the WIG import. > > > > They are, sort of, but they are supposedly BED format. Here are several: > > http://dir.nhlbi.nih.gov/papers/lmi/epigenomes/hghscmethylation.aspx > > > I looked at them. They are WIG. > > > > > > >> 3) Sometimes a BED file throws an error, what is the best way to debug > >> this? import.ucsc(filename, subformat='bed')? > >> > >> > > Maybe trace(import.bed, browser, sig="connection"). > > > > Will try this and post a reply. > > SessionInfo: > > > > sessionInfo() > R version 2.14.0 (2011-10-31) > Platform: x86_64-unknown-linux-gnu (64-bit) > > locale: > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C > [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 > [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 > [7] LC_PAPER=C LC_NAME=C > [9] LC_ADDRESS=C LC_TELEPHONE=C > [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C > > attached base packages: > [1] stats graphics grDevices datasets utils methods base > > other attached packages: > [1] rtracklayer_1.14.2 RCurl_1.7-0 bitops_1.0-4.1 > [4] gtools_2.6.2 BiocInstaller_1.2.1 reshape_0.8.4 > [7] plyr_1.6 > > loaded via a namespace (and not attached): > [1] Biostrings_2.22.0 BSgenome_1.22.0 GenomicRanges_1.6.2 > [4] IRanges_1.12.1 tools_2.14.0 XML_3.2-0 > [7] zlibbioc_1.0.0 > > > > > > > > > > > >> and a GenomicRanges question: > >> > >> 1) suppose I have a large pile of segmented copy number calls. what's > >> the best way to get a smoothed "pileup" of them? > >> > >> thanks much! for the tools and for the guidance. > >> > >> > >> > >> > >> > >> > >> On Wed, Nov 23, 2011 at 8:38 AM, Michael Lawrence < > >> lawrence.michael@gene.com> wrote: > >> > >>> On Wed, Nov 23, 2011 at 7:18 AM, Jason Lu <jasonlu68@gmail.com> wrote: > >>> > >>> > I want to thank Michael and Martin for being helpful. > >>> > > >>> > My apology for not showing the full script last email. The purpose > >>> > here is to get a figure showing read coverage in the exon regions of > >>> > my gene of interest (one gene). I tried some additional steps here > and > >>> > get an error below. > >>> > > >>> > # > >>> > nm = "NM_000997" > >>> > txs = exonsBy(txdb,by="tx",use.name=TRUE) > >>> > toshow = txs[[nm]] > >>> > ss <- > >>> > > readBamGappedAlignments("myalign.bam",param=ScanBamParam(which=tosho w)) > >>> > strand(ss) = "*" > >>> > cvg = coverage(ss) # cvg: SimpleRleList > >>> > > >>> > cvg.view <- Views(cvg, as(toshow, "RangesList")) > >>> > Error in .Method(..., FUN = FUN, MoreArgs = MoreArgs, SIMPLIFY = > >>> SIMPLIFY, > >>> > : > >>> > all object lengths must be multiple of longest object length > >>> > > >>> > > >>> The problem here is likely that "toshow" is on a wider universe (set of > >>> sequences) than your coverage vector from the BAM file. You can use > >>> keepSeqlevels(toshow, seqlevels(ss)). You want to make sure the > seqlevels > >>> are in the same order; not sure if keepSeqlevels() does this -- perhaps > >>> it > >>> should? > >>> > >>> > >>> > Also, I have been unsuccessful in exporting the RleViewsList to a > file > >>> > (wig or bedGraph format). Could you give a hint on this as well? > >>> > Thanks again. > >>> > > >>> > > >>> Now that I think about it, Views are a bit of an overkill for this > >>> specific > >>> case. You could just do a > >>> > >>> cvg.select <- seqselect(cvg, as(toshow, "RangesList")) > >>> > >>> And then export that: > >>> > >>> export(cvg.select, "coverage.bedGraph") > >>> > >>> Jason > >>> > > >>> > > >>> > On Wed, Nov 23, 2011 at 8:20 AM, Michael Lawrence > >>> > <lawrence.michael@gene.com> wrote: > >>> > > > >>> > > > >>> > > On Tue, Nov 22, 2011 at 9:39 AM, Jason Lu <jasonlu68@gmail.com> > >>> wrote: > >>> > >> > >>> > >> Hi list, > >>> > >> > >>> > >> I have a question and would like to get your help. > >>> > >> What I would like to get is to show the read coverage in the exon > >>> > >> regions (only) in the ucsc genome browser. My question is how to > >>> > >> export the Views so I can load into ucsc browser. > >>> > >> Here is what I have: > >>> > >> > >>> > >> # > >>> > >> >cvg = coverage(ranges(ss)) > >>> > > > >>> > > Careful here: you are not distinguishing chromosomes when > calculating > >>> > this > >>> > > coverage. It's all going into one vector. Instead, you want to call > >>> > coverage > >>> > > directly on your GRanges 'ss'. > >>> > > > >>> > > cvg <- coverage(ss) > >>> > > > >>> > >> > >>> > >> >cvg.view = Views(cvg,rangesexon.gr)) # exon.gr is a GRanges > >>> > > > >>> > > Since 'cvg' is now an RleList (with one element per chromosome), > you > >>> > want a > >>> > > RangesList to parallel it in your RleViewsList. > >>> > > > >>> > > cvg.view <- Views(cvg, asexon.gr, "RangesList")) > >>> > > > >>> > > I think we should try to make that easier and have Views() take a > >>> > GRanges. > >>> > > Will make a note. > >>> > > > >>> > >> > >>> > >> >cvg.view > >>> > >> Views on a 22109556-length Rle subject > >>> > >> > >>> > >> views: > >>> > >> start end width > >>> > >> [1] 22109317 22109688 372 [5 5 7 5 6 6 6 5 5 5 5 5 4 4 5 8 8 8 > 9 > >>> 9 9 > >>> > 9 > >>> > >> ...] > >>> > >> [2] 22084156 22084276 121 [4 4 4 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 > 4 > >>> 4 4 > >>> > 4 > >>> > >> ...] > >>> > >> [3] 22083039 22083195 157 [2 1 1 0 0 0 1 1 1 1 2 2 2 2 2 2 2 2 > 2 > >>> 3 3 > >>> > 4 > >>> > >> ...] > >>> > >> [4] 22079485 22079612 128 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 2 2 > 2 > >>> 2 2 > >>> > 2 > >>> > >> ...] > >>> > >> [5] 22079020 22079144 125 [6 6 6 6 7 7 7 7 7 7 7 7 7 7 6 6 6 6 > 6 > >>> 6 5 > >>> > 4 > >>> > >> ...] > >>> > >> > >>> > >> I don't know how to modify this view such as adding chromosome > info > >>> > >> and export this 'cvg.view' in rtracklayer. > >>> > > > >>> > > > >>> > > You can now export this RleViewsList directly to a file for upload. > >>> For > >>> > > upload to the UCSC browser, how about: > >>> > > > >>> > > session <- browserSession() > >>> > > sessioncoverage <- cvg.view > >>> > > browserView(session, exon.gr[1]) > >>> > > > >>> > > That should show you the first exon. > >>> > > > >>> > >> Thanks in advance. > >>> > >> Jason > >>> > >> > >>> > >> > sessionInfo() > >>> > >> R version 2.14.0 (2011-10-31) > >>> > >> Platform: x86_64-unknown-linux-gnu (64-bit) > >>> > >> > >>> > >> locale: > >>> > >> [1] C > >>> > >> > >>> > >> attached base packages: > >>> > >> [1] stats graphics grDevices utils datasets methods > base > >>> > >> > >>> > >> other attached packages: > >>> > >> [1] rtracklayer_1.14.3 RCurl_1.7-0 bitops_1.0-4.1 > >>> > >> [4] Rsamtools_1.6.2 Biostrings_2.22.0 > >>> GenomicFeatures_1.6.4 > >>> > >> [7] AnnotationDbi_1.16.4 Biobase_2.14.0 > GenomicRanges_1.6.3 > >>> > >> [10] IRanges_1.12.2 BiocInstaller_1.2.1 > >>> > >> > >>> > >> loaded via a namespace (and not attached): > >>> > >> [1] BSgenome_1.22.0 DBI_0.2-5 RSQLite_0.10.0 XML_3.4-3 > >>> > >> [5] biomaRt_2.10.0 tools_2.14.0 zlibbioc_1.0.0 > >>> > >> > > >>> > >> > >>> > >> _______________________________________________ > >>> > >> Bioconductor mailing list > >>> > >> Bioconductor@r-project.org > >>> > >> https://stat.ethz.ch/mailman/listinfo/bioconductor > >>> > >> Search the archives: > >>> > >> http://news.gmane.org/gmane.science.biology.informatics.conductor > >>> > > > >>> > > > >>> > > >>> > >>> [[alternative HTML version deleted]] > >>> > >>> _______________________________________________ > >>> Bioconductor mailing list > >>> Bioconductor@r-project.org > >>> https://stat.ethz.ch/mailman/listinfo/bioconductor > >>> Search the archives: > >>> http://news.gmane.org/gmane.science.biology.informatics.conductor > >>> > >> > >> > >> > >> -- > >> If people do not believe that mathematics is simple, > >> it is only because they do not realize how complicated life is. John von > >> Neumann< > http://www-groups.dcs.st- and.ac.uk/%7Ehistory/Biographies/Von_Neumann.html > > > >> > > > > > > > -- > If people do not believe that mathematics is simple, > it is only because they do not realize how complicated life is. John von > Neumann< > http://www-groups.dcs.st- and.ac.uk/~history/Biographies/Von_Neumann.html> > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor@r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor > [[alternative HTML version deleted]]