more stupid *Ranges questions...
Tim Triche ★ 4.2k
@tim-triche-3561
I have a GenomicRanges object built from interrogated sites and a RangedData object of human (allegedly canonical) transcription start sites, from Julie Zhu's ChIPpeakAnno package. I want to walk up and down each chromosome and find the nearest forward- and reverse-strand TSS and their distance from each site. This seems like it would work:

    > nearest(cpgranges, TSS.human.GRCh37)

But one of the objects isn't the right type:

    Error in function (classes, fdef, mtable) :
      unable to find an inherited method for function "nearest", for signature "GRanges", "RangedData"

What's the right way to solve this problem? I know about follow() and precede(), but those won't work either until I solve this :-)

thanks!
@michael-lawrence-3846
Easiest path is to convert the RangedData to a GRanges:

    as(TSS.human.GRCh37, "GRanges")

I might recommend, though, getting the TSSs from GenomicFeatures::transcripts.

Michael

On Thu, Sep 15, 2011 at 2:28 PM, Tim Triche, Jr. wrote:
> This seems like it would work:
>
>     > nearest(cpgranges, TSS.human.GRCh37)
>
> But one of the objects isn't the right type
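For readers following along, here is a minimal sketch of the per-strand lookup Tim describes, assuming both objects are already GRanges (i.e. the coercion above has been done). The toy coordinates and the names tss, tss.fwd and tss.rev are invented for illustration. It uses distanceToNearest() from GenomicRanges, which nobody in the thread mentions, to get the neighbour and the distance in one call:

```r
library(GenomicRanges)

# Toy stand-ins for the objects in the thread (coordinates are made up).
cpgranges <- GRanges("chr1", IRanges(start = c(100, 500, 900), width = 1))
tss <- GRanges("chr1",
               IRanges(start = c(50, 400, 650, 1000), width = 1),
               strand = c("+", "-", "+", "-"))

# Split the TSSs by strand so each strand can be searched separately.
tss.fwd <- tss[strand(tss) == "+"]
tss.rev <- tss[strand(tss) == "-"]

# ignore.strand = TRUE makes the search purely positional.
hits.fwd <- distanceToNearest(cpgranges, tss.fwd, ignore.strand = TRUE)
hits.rev <- distanceToNearest(cpgranges, tss.rev, ignore.strand = TRUE)

# Index of the nearest TSS on each strand, and the distance to it.
nearest.fwd <- tss.fwd[subjectHits(hits.fwd)]
nearest.rev <- tss.rev[subjectHits(hits.rev)]
dist.fwd <- mcols(hits.fwd)$distance
dist.rev <- mcols(hits.rev)$distance
```

Queries with no neighbour (for example, a site on a chromosome that has no TSS in the subject) are dropped from the result, so index back into cpgranges with queryHits() rather than assuming one hit per site.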
On 09/15/2011 03:02 PM, Michael Lawrence wrote:
> Easiest path is to convert the RangedData to a GRanges:

FWIW, I used

    showMethods(nearest)

to see what methods were defined (things need to be or extend Ranges, or GenomicRanges, but not both), then looked for ways to coerce between the types in hand:

    showMethods(coerce, classes=c("GRanges", "RangedData"))

admittedly requiring a little too much understanding about the class hierarchy.

Martin
OK, so I took your advice and used

    transcripts(TxDb.Hsapiens.UCSC.hg19.knownGene::Hsapiens_UCSC_hg19_knownGene_TxDb)

and indeed that is quite handy (got all my TSSes forward and reverse for all my probes in seconds, yay!). Now the question is, how do I use the associated EntrezGene IDs? e.g. the trusty org.Hs.eg.db says...

    > elementMetadata(foo)$tx_name[1]
    [1] "uc001aaa.3"
    > org.Hs.egSYMBOL[[ elementMetadata(foo)$tx_name[1] ]]
    NULL

Two steps forward, one step back.... eventually I will cram all of this into a genoset, though... :-)

thanks!

--t

On Thu, Sep 15, 2011 at 3:02 PM, Michael Lawrence wrote:
> I might recommend though to get the TSS's from GenomicFeatures::transcripts.
On Fri, Sep 16, 2011 at 11:28 PM, Tim Triche, Jr. wrote:
> Now the question is, how do I use the associated EntrezGene IDs?
>
>     > elementMetadata(foo)$tx_name[1]
>     [1] "uc001aaa.3"

These are "UCSC knownGene" tokens. You would use, e.g.,

    get("uc001aaa.3", revmap(org.Hs.egUCSCKG))

to get the Entrez Gene ID. But in org.Hs.eg.db 2.5.0 this particular token is not mapped. The symbol as of 2009 was evidently DDX11L1.
On Fri, Sep 16, 2011 at 8:28 PM, Tim Triche, Jr. wrote:
> Now the question is, how do I use the associated EntrezGene IDs?
>
>     > org.Hs.egSYMBOL[[ elementMetadata(foo)$tx_name[1] ]]
>     NULL

You do not need to use the org.* packages for this. Just ask transcripts() for the gene_id column. See ?transcripts.

Michael
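A rough sketch of what "ask transcripts() for the gene_id column" can look like. To stay small it assumes the sample hg19 knownGene database that ships in the GenomicFeatures extdata directory instead of the full TxDb.Hsapiens.UCSC.hg19.knownGene package; the variable names are made up, and the columns= argument is the spelling used by current GenomicFeatures releases, which may differ from the 2011 version discussed here:

```r
library(GenomicFeatures)

# Assumption: this small sample database stands in for the full hg19 TxDb.
txdb.file <- system.file("extdata", "hg19_knownGene_sample.sqlite",
                         package = "GenomicFeatures")
txdb <- loadDb(txdb.file)

# Request the Entrez Gene IDs alongside the UCSC transcript names.
tx <- transcripts(txdb, columns = c("tx_name", "gene_id"))

# gene_id is a list-like column: one (possibly empty) set of IDs per transcript.
head(mcols(tx))

# Strand-aware TSS: the first transcribed base of each transcript.
tss <- resize(tx, width = 1, fix = "start")
```

Because resize() with fix = "start" respects strand, minus-strand transcripts get their TSS at the right-hand end of the range.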
Yeah, I realized that on Friday and forgot to post it. One question, though -- some of the transcripts aren't annotated to a gene (I would also be happy with putative or confirmed ncRNAs, miRNAs, etc, even more so if I could keep them separate -- is there something like a "knownNcRna" track or table outside of UCSC that I should look into for this purpose?).

Should I just throw out all the transcripts without an EntrezGene ID for the time being, then circle back and revisit this when I find the appropriate resource for non-coding but annotated transcripts?

It seems odd that a table of KnownGene transcripts would lack gene IDs for some of the transcripts.

Thanks again for a very useful package,

--t

On Sep 18, 2011, at 10:33 AM, Michael Lawrence wrote:
> You do not need to use the org.* packages for this. Just ask transcripts() for the gene_id column. See ?transcripts.
Hi Tim,

See this email from the UCSC people for why not all UCSC Genes IDs are mapped to an Entrez Gene ID:

https://lists.soe.ucsc.edu/pipermail/genome/2011-April/025784.html

HTH,
H.

On 11-09-18 10:52 AM, Tim Triche, Jr. wrote:
> It seems odd that a table of KnownGene transcripts would lack gene IDs for some of the transcripts.

--
Hervé Pagès
Fred Hutchinson Cancer Research Center
You're awesome. I tried adapting some of your code to scan hg19 for expected (i.e., C*G) probabilities and observed CG probabilities. Let me know if I could do this more efficiently (i.e. seconds rather than minutes). I'm still pretty happy with it -- takes a few minutes on my laptop, but I bet some sort of not-very-secret BSapply() that I haven't discovered yet could make it fly.

    library(multicore)
    library(BSgenome.Hsapiens.UCSC.hg19)
    # library(SNPlocs.Hsapiens.dbSNP.20110815) # for picking off [CG] SNPs
    library(IlluminaHumanMethylation450kprobe) # will upload & release ASAP

    probe.oecg.by.chrom = mclapply(levels(IlluminaHumanMethylation450kprobe$CHR), function(i) {
      # a one-chromosome-per-core setup
      chrname = paste('chr', i, sep='')
      chr = Hsapiens[[chrname]]
      chrprobes = which(IlluminaHumanMethylation450kprobe$CHR == i)
      probecpgs = with(IlluminaHumanMethylation450kprobe[chrprobes,],
                       IRanges(start=MAPINFO, end=MAPINFO, names=Probe_ID))
      cpgwindows = resize(probecpgs, fix="center", width=500)
      chr.seqs = Views(chr, cpgwindows)
      ocg = sapply(chr.seqs, function(v) {
        dinucleotideFrequency(v,as.prob=T)
      })['CG',]
      c.g = sapply(chr.seqs, function(v) {
        alphabetFrequency(v,as.prob=T,baseOnly=T)
      })
      ecg = c.g['C',] * c.g['G',]
      ocg/ecg
    })

The plot of the CPUs on my laptop was fairly amusing to see. multicore seems to allocate it well.

2011/9/19 Hervé Pagès wrote:
> See this email from the UCSC people for why not all UCSC Genes IDs
> are mapped to an Entrez Gene ID
Hi Tim,

One way I think you can speed up your code is to nuke the sapply's since the alphabetFrequency and dinucleotideFrequency are "vectorized" over views, code below:

2011/9/20 Tim Triche, Jr. wrote:
> ocg = sapply(chr.seqs, function(v) {
>   dinucleotideFrequency(v,as.prob=T)
> })['CG',]
> c.g = sapply(chr.seqs, function(v) {
>   alphabetFrequency(v,as.prob=T,baseOnly=T)
> })

I just took 15000 random intervals on chr10 as the IRanges to construct cpgwindows and created chr.seqs from these. Consider:

    R> system.time(ocg <- sapply(chr.seqs, function(v) {
         dinucleotideFrequency(v, as.prob=TRUE)
       })['CG',])
       user  system elapsed
     18.583   0.053  19.13

vs.

    R> system.time(ocg2 <- dinucleotideFrequency(chr.seqs, as.prob=TRUE)[,'CG'])
       user  system elapsed
      0.033   0.000   0.033

    R> all.equal(ocg, ocg2)
    [1] TRUE

You can do the same trick with your c.g calc, e.g. from this:

    R> c.g = sapply(chr.seqs, function(v) {
         alphabetFrequency(v,as.prob=T,baseOnly=T)
       })

to this:

    c.g <- alphabetFrequency(chr.seqs, as.prob=TRUE, baseOnly=TRUE)

and you should see a nice boost in speed.

HTH,
-steve

--
Steve Lianoglou
Memorial Sloan-Kettering Cancer Center
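One detail worth spelling out when applying this: the vectorized calls return a matrix with one row per view, so the indexing flips from c.g['C',] to c.g[,'C']. A minimal sketch of the whole observed/expected calculation on a toy sequence (the sequence, window positions and the names di and mono are invented for illustration; a real run would use Views on a chromosome as in Tim's code):

```r
library(Biostrings)

# Toy sequence standing in for one chromosome (made up for illustration).
chr <- DNAString("ACGCGTTAGCCGCGATATCGGCGCGTACGTTTAACCGG")
cpgwindows <- IRanges(start = c(1, 11, 21), width = 16)
chr.seqs <- Views(chr, cpgwindows)

# One call per statistic; each returns a matrix with one row per view.
di <- dinucleotideFrequency(chr.seqs, as.prob = TRUE)
mono <- alphabetFrequency(chr.seqs, as.prob = TRUE, baseOnly = TRUE)

ocg <- di[, "CG"]                  # observed CG dinucleotide frequency
ecg <- mono[, "C"] * mono[, "G"]   # expected frequency if C and G were independent
oecg <- ocg / ecg                  # observed/expected ratio, one value per window
```

A window containing no C or no G makes ecg zero, so the ratio comes out as NaN or Inf; whether to drop or flag those windows is left to the caller.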
Forgot to post this to the list, but I updated the code on GitHub; it's faster, a lot faster. I think the time went from 10 minutes to 45 sec.

thanks!

On Tue, Sep 20, 2011 at 6:22 AM, Steve Lianoglou wrote:
> One way I think you can speed up your code is to nuke the sapply's
> since the alphabetFrequency and dinucleotideFrequency are "vectorized"
> over views
Hi Herve (and thank you),

Is there an idiomatic approach that will get me the nearest annotated TSS having an Entrez gene_id? Something along the lines of

nearest( cpgsites, txdb[ which(!is.na(elementMetadata(txdb)$gene_id)) ] )

Something like that, but which gives me the desired subset of transcripts (right now I can't get it).

I guess the deal with non-coding RNAs is that I should just use the closest transcript (period), but this (walk-upstream-and-get-the-nearest-EG-ID) seems like the sort of problem that you or someone else must have solved years ago. I'd love to take advantage of that if it's the case.

Thanks yet again,

--t

2011/9/19 Hervé Pagès <hpages@fhcrc.org>

> Hi Tim,
>
> See this email from the UCSC people for why not all UCSC Genes IDs
> are mapped to an Entrez Gene ID:
>
> https://lists.soe.ucsc.edu/pipermail/genome/2011-April/025784.html
>
> HTH,
> H.
>
> On 11-09-18 10:52 AM, Tim Triche, Jr. wrote:
>
>> Yeah, I realized that on Friday and forgot to post it. One question,
>> though -- some of the transcripts aren't annotated to a gene (I would also
>> be happy with putative or confirmed ncRNAs, miRNAs, etc, even more so if I
>> could keep them separate -- is there something like a "knownNcRna" track or
>> table outside of UCSC that I should look into for this purpose?).
>>
>> Should I just throw out all the transcripts without an EntrezGene ID for
>> the time being, then circle back and revisit this when I find the
>> appropriate resource for non-coding but annotated transcripts?
>>
>> It seems odd that a table of KnownGene transcripts would lack gene IDs for
>> some of the transcripts.
>>
>> Thanks again for a very useful package,
>>
>> --t
>>
>> On Sep 18, 2011, at 10:33 AM, Michael Lawrence <lawrence.michael@gene.com> wrote:
>>
>>> On Fri, Sep 16, 2011 at 8:28 PM, Tim Triche, Jr. <ttriche@usc.edu> wrote:
>>>
>>>> OK, so I took your advice and used
>>>>
>>>> transcripts(TxDb.Hsapiens.UCSC.hg19.knownGene::Hsapiens_UCSC_hg19_knownGene_TxDb)
>>>>
>>>> and indeed that is quite handy (got all my TSSes forward and reverse for
>>>> all my probes in seconds, yay!). Now the question is, how do I use the
>>>> associated EntrezGene IDs? e.g. the trusty org.Hs.eg.db says...
>>>>
>>>> > elementMetadata(foo)$tx_name[1]
>>>> [1] "uc001aaa.3"
>>>> > org.Hs.egSYMBOL[[ elementMetadata(foo)$tx_name[1] ]]
>>>> NULL
>>>>
>>>> Two steps forward, one step back.... eventually I will cram all of this
>>>> into a genoset, though... :-)
>>>
>>> You do not need to use the org.* packages for this. Just ask
>>> transcripts() for the gene_id column. See ?transcripts.
>>>
>>> Michael
>>>
>>>> thanks!
>>>>
>>>> --t
Hi, Tim.

If you really need Entrez Gene annotation, I'd suggest using a transcript db derived from refseq and not from UCSC known genes. Otherwise, the workflow will be the same, I believe.

Sean
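For reference, a rough sketch of the "nearest TSS that has a gene ID" lookup on toy data. Everything here is hypothetical: the coordinates, the tx_name and gene_id values, and the names annotated, hit, nearest.gene and nearest.dist. It assumes the TSS positions are already in a GRanges with a plain character gene_id column where NA means "no Entrez ID". With a real transcript db, the gene_id column requested from transcripts() typically comes back as a list-like column, so the NA test below would need to become a check for empty elements instead.

```r
library(GenomicRanges)

# Hypothetical query sites (unstranded) and TSS positions.
cpgsites <- GRanges("chr1", IRanges(start = c(150, 900, 2100), width = 1))
tss <- GRanges("chr1",
               IRanges(start = c(100, 1000, 2000, 2200), width = 1),
               strand = c("+", "-", "+", "+"),
               tx_name = c("txA", "txB", "txC", "txD"),
               gene_id = c("1", NA, "3", NA))

# Subset the GRanges (not the transcript db itself) to TSSs with a gene ID.
annotated <- tss[!is.na(mcols(tss)$gene_id)]

# Index of the nearest annotated TSS for each site; strand is ignored here
# because the query sites are unstranded.
hit <- nearest(cpgsites, annotated, ignore.strand = TRUE)

# Gene ID of, and distance to, that nearest annotated TSS.
nearest.gene <- mcols(annotated)$gene_id[hit]
nearest.dist <- distance(cpgsites, annotated[hit], ignore.strand = TRUE)
```

To get the nearest forward-strand and reverse-strand TSS separately, one option would be to split the TSS set by strand first and run the same lookup on each half.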