RMA in Bioconductor versus APT - missing probesets
Mark Cowley ▴ 910
@mark-cowley-2951
Last seen 10.3 years ago
Michal,

In just.rma and rma, it was assumed that each probe could be in at most one probeset: once a probe has been used, it cannot be reused. On the ST arrays, some probes can be in many probesets. So if you use rma, eventually all the probes in a probeset have already been used by the time the current probeset needs them, and you get NAs.

Mark

On 24/02/2011, at 8:40 AM, Michal Blazejczyk wrote:

> Dear Christian,
>
> I am aware of the existence of xps. However, we can't use it for our purposes,
> largely because it is too complicated to set up (or at least, that was the case
> the last time we looked at it). I would still like to know what's happening in
> just.rma() :)
>
> Best,
> Michał
>
> cstrato (cstrato at aon.at) wrote:
>> Dear Michal,
>>
>> As an alternative to just.rma() you could use the Bioconductor package
>> xps, which uses the Affymetrix PGF file as well as the Affymetrix
>> annotations, and thus should contain all probesets. xps also has a
>> vignette, "APTvsXPS.pdf", which compares the RMA results obtained
>> from APT and from xps for the HuGene 1.0 ST array.
>>
>> Best regards
>> Christian
>> _._._._._._._._._._._._._._._._._._
>> C.h.r.i.s.t.i.a.n S.t.r.a.t.o.w.a
>> V.i.e.n.n.a A.u.s.t.r.i.a
>> e.m.a.i.l: cstrato at aon.at
>> _._._._._._._._._._._._._._._._._._
>>
>> On 2/23/11 7:06 PM, Michal Blazejczyk wrote:
>>> Dear group,
>>>
>>> I have noticed that Bioconductor's just.rma() function returns fewer transcript-level
>>> probesets than RMA in APT for the Human Gene 1.0 ST array. To be specific, 819 probesets
>>> are missing, and most of them seem to be "real", i.e. they are annotated when I run them
>>> through NetAffx.
>>>
>>> I would like to know why this is happening, and whether it is to be expected or
>>> whether it is a bug.
>>>
>>> Best regards,
>>>
>>> Michał Błażejczyk
>>> FlexArray Lead Developer
>>> McGill University and Genome Quebec Innovation Centre
>>> http://www.gqinnovationcenter.com/services/bioinformatics/flexarray/index.aspx?l=e
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
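The "each probe is used at most once" behaviour Mark describes can be illustrated with a toy model in base R. This is only a sketch under stated assumptions, not the actual code in affy: the probe names, probeset names and intensities are made up, the helper functions `summarize_use_once` and `summarize_shared` are hypothetical, and a plain median stands in for RMA's median-polish summarization.

```r
# Toy model (an assumption, not affy's implementation): three probesets
# that share probes, as can happen on ST arrays.
probesets <- list(
  ps_A = c("p1", "p2", "p3"),
  ps_B = c("p3", "p4"),
  ps_C = c("p1", "p2")   # every probe here is also in ps_A
)
# Made-up log2 intensities for one array.
intensity <- c(p1 = 8.1, p2 = 7.9, p3 = 9.0, p4 = 6.5)

# Hypothetical summarizer in which a probe is consumed by the first
# probeset that claims it and cannot be reused afterwards.
summarize_use_once <- function(probesets, intensity) {
  used <- character(0)
  out <- setNames(rep(NA_real_, length(probesets)), names(probesets))
  for (id in names(probesets)) {
    free <- setdiff(probesets[[id]], used)
    if (length(free) > 0) out[[id]] <- median(intensity[free])
    used <- union(used, free)
  }
  out
}

# Hypothetical summarizer that lets probes belong to many probesets.
summarize_shared <- function(probesets, intensity) {
  vapply(probesets, function(p) median(intensity[p]), numeric(1))
}

use_once <- summarize_use_once(probesets, intensity)
shared   <- summarize_shared(probesets, intensity)
print(rbind(use_once, shared))
```

In this toy model `ps_C` comes out as NA under the use-once rule because both of its probes were already claimed by `ps_A`, while the shared-probe summarizer returns a value for every probeset. `ps_B` is also affected in a quieter way: it is summarized from `p4` alone, since `p3` had already been consumed.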
@michal-blazejczyk-2231
Last seen 10.3 years ago
Dear Mark,

Thank you for your answer.

Please correct me if I'm getting the wrong impression, but doesn't this mean that just.rma() and rma() are simply wrong in this case? And if so, should they be used for ST data at all? In previous versions of Bioconductor they simply did not work on these arrays (there was no cdf environment), but now that they do, users will be using them and generating results that are incomplete...

Best,
Michał
Consider the oligo stack.

> dir()[c(63,65,67)]
[1] "TisMix_mix8_03_v1_WTGene1.CEL" "TisMix_mix9_01_v1_WTGene1.CEL"
[3] "TisMix_mix9_02_v1_WTGene1.CEL"
> xo = oligo::read.celfiles(filenames=dir()[c(63,65,67)])
Platform design info loaded.
> xa = affy::ReadAffy(filenames=dir()[c(63,65,67)])
> dim(exprs(xo))
[1] 1102500       3
> dim(exprs(xa))
[1] 1102500       3
> rxo = rma(xo)
Error in function (classes, fdef, mtable) :
  unable to find an inherited method for function "probeNames", for signature "GeneFeatureSet"
> rxo = oligo::rma(xo)
Background correcting... OK
Normalizing... OK
Summarizing... OK
> rxa = affy::rma(xa)
Background correcting
Normalizing
Calculating Expression
> dim(exprs(rxo))
[1] 33297     3
> dim(exprs(rxa))
[1] 32321     3
> sessionInfo()
R version 2.13.0 Under development (unstable) (2011-03-01 r54628)
Platform: x86_64-apple-darwin10.4.0/x86_64 (64-bit)

locale:
[1] C

attached base packages:
[1] tools     stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
 [1] hugene10stv1cdf_2.7.1     affyio_1.19.2
 [3] affy_1.29.1               pd.hugene.1.0.st.v1_3.0.2
 [5] RSQLite_0.9-4             DBI_0.2-5
 [7] ff_2.2-1                  bit_1.1-6
 [9] oligo_1.15.1              oligoClasses_1.13.8
[11] Biobase_2.11.9

loaded via a namespace (and not attached):
[1] AnnotationDbi_1.13.13 Biostrings_2.19.11    IRanges_1.9.25
[4] affxparser_1.23.0     preprocessCore_1.13.3 splines_2.13.0
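In the session above the two results differ by 976 rows (33297 from oligo::rma versus 32321 from affy::rma). One way to see which probesets account for such a gap is to compare the row names of the two expression matrices. The sketch below is an illustration in base R on made-up matrices. The helper `missing_probesets` and the IDs `ps1` to `ps4` are hypothetical, and it assumes both results use the same probeset identifiers, which would need to be checked for a real oligo/affy comparison.

```r
# Hypothetical helper: report probesets that are absent from `partial`,
# and probesets that are present in it but NA in every sample.
missing_probesets <- function(full, partial) {
  stopifnot(!is.null(rownames(full)), !is.null(rownames(partial)))
  absent <- setdiff(rownames(full), rownames(partial))
  all_na <- rownames(partial)[rowSums(!is.na(partial)) == 0]
  list(absent = absent, all_na = intersect(all_na, rownames(full)))
}

# Made-up stand-ins for exprs(rxo) and exprs(rxa).
full <- matrix(
  seq_len(12) / 2, nrow = 4,
  dimnames = list(c("ps1", "ps2", "ps3", "ps4"), c("s1", "s2", "s3"))
)
partial <- full[c("ps1", "ps3", "ps4"), ]
partial["ps4", ] <- NA   # present, but never summarized

res <- missing_probesets(full, partial)
print(res)
```

With real data the call would look like `missing_probesets(exprs(rxo), exprs(rxa))`, using the objects from the session above.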
Hi Michal,

Yes, I think it's wrong to use rma/just.rma on ST data. Since working this out, I never do (except in existing array QC pipelines that rely on 'rma', where I don't care about a few missing or erroneous probesets).

Thus consider these excellent alternatives: oligo, XPS, or affymetrix-probeset-summarize.

I currently use oligo because I can write pure R code with no dependencies on ROOT, but I will probably switch to XPS because, once it's installed, the same pipeline can handle ST arrays and older genome arrays, and can calculate DABG calls on the ST arrays.

My 2 cents,
Mark
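For readers who want to try the oligo route Mark mentions, a minimal sketch might look like the following. It assumes the oligo package and the matching platform design package (pd.hugene.1.0.st.v1 in the session earlier in this thread) are installed. The function name `rma_gene_st` and the `cel_dir` argument are hypothetical, and `target = "core"` is assumed here to be the appropriate setting for transcript-level summaries on Gene ST arrays.

```r
# Sketch only: nothing from oligo is loaded until the function is called
# on a directory that actually contains CEL files.
rma_gene_st <- function(cel_dir) {
  cel_files <- list.files(cel_dir, pattern = "\\.CEL$",
                          full.names = TRUE, ignore.case = TRUE)
  if (length(cel_files) == 0) {
    stop("no CEL files found in ", cel_dir)
  }
  # Read the raw data; oligo picks the platform design package itself.
  raw <- oligo::read.celfiles(filenames = cel_files)
  # Background-correct, normalize and summarize.
  eset <- oligo::rma(raw, target = "core")
  # Return the probesets-by-samples expression matrix.
  Biobase::exprs(eset)
}
```

Calling `rma_gene_st("path/to/cel/files")` on a directory of HuGene 1.0 ST CEL files would be expected to return a matrix like `exprs(rxo)` in the earlier session.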