pd.hugene.1.0.st.v1
Mark Robinson ★ 1.1k
@mark-robinson-2171
Last seen 9.7 years ago
Hi Vince.

Thanks for the reply.

That's good to know. But, it only allows me to access the indices, not to actually compute gene-level summaries, right? Any way to do that without building the package from scratch?

Cheers,
Mark

On 31/07/2009, at 10:10 PM, Vincent Carey wrote:

> On Fri, Jul 31, 2009 at 12:48 AM, Mark Robinson <mrobinson at wehi.edu.au> wrote:
>> Hi all.
>>
>> I wonder if it makes more sense to have the *transcript* version of this
>> package, instead of the *probeset* version available when you install via:
>>
>
> This merits further discussion. Note that under the current approach
> you can obtain the transcript cluster indices for summarization using
> fData on the output of rma
>
> > class(tismix)
> [1] "GeneFeatureSet"
> attr(,"package")
> [1] "oligoClasses"
> > class(tismixRMA)
> [1] "ExpressionSet"
> attr(,"package")
> [1] "Biobase"
> > fData(tismixRMA)[1:4,]
>          fsetid  exon_id transcript_cluster_id level crosshyb_type chrom
> 7896737 7896737 96595542               7896736    NA             3     1
> 7896739 7896739 96595544               7896738    NA             3     1
> 7896741 7896741 96595546               7896740    NA             3     1
> 7896743 7896743 96595548               7896742    NA             3     1
>
>         accessions
> 7896737 <NA>
> 7896739 <NA>
> 7896741 BC136848,BC136907,ENST00000318050,ENST00000326183,ENST00000335137,NM_001004195,NM_001005240,NM_001005484
> 7896743 BC118988,ENST00000279067
>
> > dim(fData(tismixRMA))
> [1] 253002      7
> > dim(exprs(tismixRMA))
> [1] 253002     33
>
> annotation packages are available at both the probeset and
> transcript cluster level, thanks to folks at city of hope (e.g.,
> http://www.bioconductor.org/packages/release/data/annotation/html/hugene10sttranscriptcluster.db.html)
>
>> source("http://bioconductor.org/biocLite.R")
>> biocLite("pd.hugene.1.0.st.v1")
>>
>> It seems like as a default, more people would want gene-level summaries for
>> these arrays ... especially since ~200k (~80%) of the probesets have 3
>> probes or less.
>>
>> Of course I (and everyone around the world) could build this package locally
>> from scratch using the transcript CSV, but it seems like there would be
>> enough demand for this to make available direct from BioC. Just a thought.
>> Does anyone agree?
>>
>> Or, am I missing something that will allow me to do gene-level analysis from
>> this package?
>>
>> My session is below.
>>
>> Thanks in advance.
>> Mark
>>
>> ----------------------
>> mac1618:Desktop mrobinson$ wc -l HuGene-1_0-st-v1.na29.*.csv
>>   257449 HuGene-1_0-st-v1.na29.hg18.probeset.csv
>>    33317 HuGene-1_0-st-v1.na29.hg18.transcript.csv
>> ----------------------
>>
>> ----------------------
>> > library(oligo)
>> Loading required package: oligoClasses
>> Loading required package: Biobase
>>
>> Welcome to Bioconductor
>>
>>   Vignettes contain introductory material. To view, type
>>   'openVignette()'. To cite Bioconductor, see
>>   'citation("Biobase")' and for packages 'citation(pkgname)'.
>>
>> Loading required package: preprocessCore
>> Welcome to oligo version 1.8.1
>> > cf <- dir(celPath,"CEL")
>> > fs <- read.celfiles( file.path(celPath,cf) )
>> Loading required package: pd.hugene.1.0.st.v1
>> Loading required package: RSQLite
>> Loading required package: DBI
>> Platform design info loaded.
>> Reading in : rawData/cell_line/HuGene-1_0-st-v1//cancer1.CEL
>> Reading in : rawData/cell_line/HuGene-1_0-st-v1//cancer2.CEL
>> Reading in : rawData/cell_line/HuGene-1_0-st-v1//normal1.CEL
>> Reading in : rawData/cell_line/HuGene-1_0-st-v1//normal2.CEL
>> > rmaOligo <- oligo::rma(fs)
>> Background correcting
>> Normalizing
>> Calculating Expression
>> dmOligo <- exprs(rmaOligo)
>> dim(rmaOligo)
>> > dmOligo <- exprs(rmaOligo)
>> > dim(rmaOligo)
>> Features  Samples
>>   253002        4
>> > sessionInfo()
>> R version 2.9.0 (2009-04-17)
>> i386-apple-darwin8.11.1
>>
>> locale:
>> en_AU.UTF-8/en_AU.UTF-8/C/C/en_AU.UTF-8/en_AU.UTF-8
>>
>> attached base packages:
>> [1] stats graphics grDevices utils datasets methods base
>>
>> other attached packages:
>> [1] pd.hugene.1.0.st.v1_2.4.1 RSQLite_0.7-1
>> [3] DBI_0.2-4                 oligo_1.8.1
>> [5] preprocessCore_1.6.0      oligoClasses_1.6.0
>> [7] Biobase_2.4.1
>>
>> loaded via a namespace (and not attached):
>> [1] affxparser_1.15.6 affyio_1.12.0 Biostrings_2.12.1 IRanges_1.2.2
>> [5] splines_2.9.0
>> ----------------------
>>
>> ------------------------------
>> Mark Robinson, PhD (Melb)
>> Epigenetics Laboratory, Garvan
>> Bioinformatics Division, WEHI
>> e: m.robinson at garvan.org.au
>> e: mrobinson at wehi.edu.au
>> p: +61 (0)3 9345 2628
>> f: +61 (0)3 9347 0852
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>
> --
> Vincent Carey, PhD
> Biostatistics, Channing Lab
> 617 525 2265

------------------------------
Mark Robinson, PhD (Melb)
Epigenetics Laboratory, Garvan
Bioinformatics Division, WEHI
e: m.robinson at garvan.org.au
e: mrobinson at wehi.edu.au
p: +61 (0)3 9345 2628
f: +61 (0)3 9347 0852
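One rough way to use the transcript cluster indices that Vince points to is to collapse the probeset-level values to one row per cluster. The sketch below uses base R only and made-up numbers in place of exprs(rmaOligo) and fData(rmaOligo)$transcript_cluster_id; the helper name summarize_by_cluster and the choice of the median are assumptions for illustration, not something oligo provides. Note that this is not equivalent to running RMA's summarization step over all probes in a transcript cluster; it only averages values that were already summarized per probeset.

```r
# Toy stand-in for exprs(rmaOligo): 6 probesets x 2 samples (made-up values).
probeset_expr <- matrix(
  c(6,  8,
    8, 10,
    5,  5,
    7,  9,
    9,  4,
    3,  3),
  ncol = 2, byrow = TRUE,
  dimnames = list(paste0("p", 1:6), c("s1", "s2"))
)

# Toy stand-in for fData(rmaOligo)$transcript_cluster_id.
# NA marks a probeset that is not assigned to any transcript cluster.
cluster_id <- c("7896736", "7896736", "7896738", "7896738", "7896738", NA)

# Hypothetical helper: collapse probeset rows to one row per transcript cluster.
summarize_by_cluster <- function(expr, cluster_id, fun = median) {
  keep <- !is.na(cluster_id)                       # drop unassigned probesets
  groups <- split(which(keep), cluster_id[keep])   # row indices per cluster
  do.call(rbind, lapply(groups, function(idx) {
    # apply 'fun' to each sample (column) over the rows of this cluster
    apply(expr[idx, , drop = FALSE], 2, fun)
  }))
}

gene_expr <- summarize_by_cluster(probeset_expr, cluster_id)
print(gene_expr)
```

With real data the same helper could be called as summarize_by_cluster(exprs(rmaOligo), fData(rmaOligo)$transcript_cluster_id), assuming the feature data carries a transcript_cluster_id column as in the fData output quoted above.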
Epigenetics Annotation oligo • 2.3k views
@benilton-carvalho-1375
Last seen 4.1 years ago
Brazil/Campinas/UNICAMP
Mark,

I'm planning on providing an updated version of the annotation pkgs that will allow gene-level summarization in about 1 week (maybe earlier).

b

--
Sent from my iPhone
cstrato ★ 3.9k
@cstrato-908
Last seen 5.6 years ago
Austria
Dear Mark,

I am not sure, but maybe you could use the old annotation package, which I believe was built for release 3 of the HuGene array, see:
http://www.bioconductor.org/packages/2.3/data/annotation/html/hugene10st.db.html

Alternatively, you could use package xps, which allows you to compute both gene-level summaries and probeset-level summaries.

Best regards
Christian
_._._._._._._._._._._._._._._._._._
C.h.r.i.s.t.i.a.n S.t.r.a.t.o.w.a
V.i.e.n.n.a A.u.s.t.r.i.a
e.m.a.i.l: cstrato at aon.at
_._._._._._._._._._._._._._._._._._