GCRMA: feature request
Guido Hooiveld ★ 4.1k
@guido-hooiveld-2020
Last seen 11 hours ago
Wageningen University, Wageningen, the …
Dear Jean,

Please allow me to put forward a feature request for the GCRMA package. As you may have noticed in the literature and on the BioC mailing list, there has been discussion about the median polish algorithm (used in RMA) for summarizing the signals of probes into a single probeset value, in relation to correlation artefacts. See e.g.:
http://www.biomedcentral.com/1471-2105/11/553
and
http://thread.gmane.org/gmane.science.biology.informatics.conductor/32255/focus=32259

Moreover, the above-mentioned BioC thread advocates using a robust regression M-estimation procedure, e.g. as available in the packages 'affyPLM' / 'preprocessCore', instead of applying median polish to the transposed data matrix (aka tRMA), as was suggested by the authors of the aforementioned paper.

The introduction of the fRMA paper also states that more statistically rigorous procedures, such as M-estimation techniques, could be used for summarization. This is one of the reasons fRMA by default uses affyPLM's default M-estimator (Huber) for summarization instead of median polish.
http://dx.doi.org/10.1093/biostatistics/kxp059

Since AFAIK GCRMA is equal to RMA, except of course for the background correction, I wondered whether it would be possible to add an option to GCRMA that lets the user select a robust M-estimation procedure (e.g. affyPLM's default one) instead of GCRMA's default median polish algorithm to summarize the probe data into probeset values. Thus something like:

    x.norm <- gcrma(affy.data, sum="median.polish")   [default]

or

    x.norm <- gcrma(affy.data, sum="affyPLM")

I would appreciate your opinion on this.

Regards,
Guido

In addition, I would like to remind you of another issue with GCRMA that my colleague Philip brought forward last December, which you may have missed (I copied his email below):

-----------------------------------------------------------------------------

I noticed the following problem when using gcRMA. The gcRMA library tries to install probe packages. This is fine, except when a probe package is already available (and local versions vs repository versions do not necessarily match)! This behaviour is triggered within the function getProbePackage:

    function (probepackage, lib = .libPaths()[1], verbose = TRUE)
    {
        options(show.error.messages = FALSE)
        attempt <- try(do.call(library, list(probepackage, lib.loc = lib)))
        options(show.error.messages = TRUE)

        ...
    }

In this particular example, .libPaths() is:

    > .libPaths()
    [1] "/local/home/guidoh/R/x86_64-unknown-linux-gnu-library/2.12"
    [2] "/geninf/prog64/R/R-2.12.0/lib64/R/library"

As you can see, .libPaths()[1] points to the local R directory of the user, whereas the R installation directory is in .libPaths()[2]. Because getProbePackage only looks in the first directory, it does not find a probe package that is installed in the second one, and gcRMA then installs (the wrong) probe library from BioC even though one is already available to the user! The issue is that in some cases we use custom, tailored libraries that are not identical to those in the repositories. Hence, we may run into unexpected problems! As a matter of fact, I would prefer to be able to disable the automatic installation of probe packages in gcRMA altogether (an option that is enabled by default, but can be disabled by the user)!

Anyway, there is no reason to limit yourself to the first library. As an example, the following command works without any problem:

    > attempt <- try(do.call(library, list("nugohs1a520180hsentrezgcdf", lib.loc = .libPaths())))
    > attempt
     [1] "gcrma"                      "nugohs1a520180hsentrezgcdf"
     [3] "affy"                       "Biobase"
     [5] "stats"                      "graphics"
     [7] "grDevices"                  "utils"
     [9] "datasets"                   "methods"
    [11] "base"

This way the function at least goes through all R library directories when checking whether a library is installed!

So I kindly ask for the following modifications:

1. An option in gcRMA to simply disable the automated installation of missing libraries [I need to control what happens! :)]
2. To simply use .libPaths() instead of .libPaths()[1], so that all R library directories are really searched.

Please let me know whether or not you agree. Making these two modifications is not very hard, so I can contribute them if you are interested.

Regards,
Philip
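
Editorial sketch: the two modifications requested for the probe package lookup could look roughly like the R function below. It is an illustration only and not the code in gcrma. The function name find_probe_package and the arguments auto.install and installer are hypothetical; only find.package() and .libPaths() are base R.

```r
# Hypothetical helper; a sketch of the requested behaviour, not gcrma's code.
find_probe_package <- function(probepackage, lib = .libPaths(),
                               auto.install = FALSE, installer = NULL,
                               verbose = TRUE) {
  # Look in every library directory in 'lib', not only the first one.
  path <- find.package(probepackage, lib.loc = lib, quiet = TRUE)
  if (length(path) > 0) {
    if (verbose) message("Using ", probepackage, " from ", dirname(path[1]))
    return(path[1])
  }
  # Not installed anywhere: only install when the user explicitly allows it.
  if (!auto.install || is.null(installer)) {
    stop("Probe package '", probepackage, "' was not found in: ",
         paste(lib, collapse = ", "), call. = FALSE)
  }
  installer(probepackage)  # caller-supplied installation function (assumption)
  find.package(probepackage, lib.loc = lib, quiet = FALSE)
}

# Usage with a package that ships with R:
find_probe_package("stats")
```

With auto.install left at FALSE, a missing package produces an error instead of a silent download, so a custom probe package installed in any library directory is never shadowed by a repository version.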
GO Regression probe gcrma frma • 1.5k views
Ben Bolstad ★ 1.2k
@ben-bolstad-1494
Last seen 7.2 years ago
At least in theory you should also be able to accomplish this via affyPLM's threestep() function, though I have not tested it against any modifications that might have occurred to GCRMA in the last 3-4 years. E.g.:

    eset <- threestep(abatch, background.method="GCRMA", normalize.method="quantile", summary.method="rlm")

fitPLM() should also accept the background.method="GCRMA" argument.
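
Editorial sketch: the two summarization approaches discussed in this thread can be compared on a small simulated probeset. The example uses medpolish() from base R and rlm() from the MASS package (Huber M-estimator by default) as stand-ins; it is not the affyPLM or gcrma implementation, and the simulated values, the injected outlier, and all object names are assumptions made for illustration.

```r
# Simulated log2 intensities for one probeset: 8 probes (rows) x 4 arrays (columns).
set.seed(1)
n_probe <- 8
n_array <- 4
probe_eff <- seq(-1.4, 1.4, length.out = n_probe)  # probe affinities, sum to zero
array_eff <- c(8, 9, 10, 11)                        # "true" expression per array
y <- outer(probe_eff, array_eff, "+") +
  matrix(rnorm(n_probe * n_array, sd = 0.1), n_probe, n_array)
y[3, 2] <- y[3, 2] + 4                              # one outlying probe measurement
dimnames(y) <- list(paste0("P", seq_len(n_probe)), paste0("A", seq_len(n_array)))

# Median polish: expression estimate = overall effect + column (array) effect.
mp <- medpolish(y, trace.iter = FALSE)
mp_est <- mp$overall + mp$col

# Robust regression: y ~ array + probe with probe effects constrained to sum to zero.
d <- data.frame(
  y     = as.vector(y),                             # column-major: array by array
  probe = factor(rep(rownames(y), times = n_array)),
  array = factor(rep(colnames(y), each = n_probe))
)
fit <- MASS::rlm(y ~ 0 + array + probe, data = d,
                 contrasts = list(probe = "contr.sum"))
rlm_est <- coef(fit)[paste0("array", colnames(y))]

# Side-by-side comparison of the per-array estimates.
print(cbind(median.polish = mp_est, rlm = rlm_est))
```

Both estimators are robust to the single outlier here, so the per-array values should be close; the correlation artefacts discussed above concern how the methods behave across many probesets and small sample sizes, which this toy example does not address.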
Björn Usadel ▴ 250
@bjorn-usadel-1492
Last seen 10.2 years ago
Dear Guido,

Just one word of caution: even when you apply transposed median polish, minute amounts of correlation might be left in small sample sizes when you use gcRMA background correction. gcRMA was what actually triggered the studies of Lim et al. (http://bioinformatics.oxfordjournals.org/content/23/13/i282.long), which we then extended. (Have a look at Additional file 7 for the permutations of bg/norm/sum in http://www.biomedcentral.com/1471-2105/11/553)

But having the option to choose would definitely be great.

Best Wishes,
Björn

--
Björn Usadel
Max Planck Institute for Molecular Plant Physiology
Am Mühlenberg 1
14476 Potsdam, Germany
Tel: 0331 5678153
www.tinyurl.com/IntegrativeCarbonBiology
www.gabipd.org
Hi Guido and Björn,

The suggestion from the Lim paper was adopted in 2008. Guido's suggestion is certainly useful; I just have not had time to do a test run with the changes. I will contact Guido and Philip in private, since they may be able to contribute.

Best,
Jean

--
Zhijin (Jean) Wu
Assistant Professor of Biostatistics
Brown University, Box G-S121
Providence, RI 02912
Tel: 401 863 1230
Fax: 401 863 9182
http://www.stat.brown.edu/zwu