affyPara: background correction & normalization only?
L L ▴ 30
@l-l-3730
Last seen 7 months ago
Finland
Dear affyPara package maintainers and BioC developer community,

The parallelized Affy preprocessing in the affyPara package provides essential tools for handling large array collections.

Before starting to hack on this myself, I would like to ask whether there are workarounds in affyPara to obtain a preprocessed and normalized (but not summarized) PM intensity matrix from CEL files. Alternatively, having access to a virtual AffyBatch (i.e. keeping it on the nodes without 'rebuild') would solve the problem. Is such functionality available?

I need the probe-level values (for PM probes) after preprocessing and normalization but without the probeset summarization step: ideally a probes x arrays matrix, and not necessarily any other information from the AffyBatch object. It seems that both background correction and normalization can be done in affyPara by reading in the CEL files directly. However, the output of these methods is itself an AffyBatch, which causes the memory problems the package is trying to solve.

Thanks once more for the relevant work.

with kind regards

Leo Lahti
Department of Information and Computer Science
Aalto University School of Science and Technology
Finland
Preprocessing affy affyPara • 1.3k views
@markus-schmidberger-2240
Last seen 10.2 years ago
Dear Leo,

good point.

I made some changes to preproPara(). Now there is a summarization method 'none' (summary.method='none') available. With it there is no summarization step and the result object eset is NULL. I submitted these changes and the documentation to the svn. Version 1.7.1 should be available at midnight:
http://www.bioconductor.org/packages/devel/bioc/html/affyPara.html

Now the background-corrected and normalized AffyBatches are available in the global environment on all nodes. It is very simple to run some code on them:

    # Define the function first, then send it to every node.
    FUN <- function()
    {
      require(affy)
      if (exists("AffyBatch", envir = .GlobalEnv)) {
        AffyBatch <- get("AffyBatch", envir = .GlobalEnv)
        # do anything you want on the AffyBatch.
      }
    }
    res <- clusterCall(cluster, FUN)

res will be a list of results from all nodes. You have to find a way to combine these results.

Best
Markus

--
Dr. rer. nat. Markus Schmidberger
Ludwig-Maximilians-Universität München
IBE - Institut für medizinische Informationsverarbeitung, Biometrie und Epidemiologie
Lehrstuhl für Biometrie und Bioinformatik
Marchioninistr. 15, D-81377 Muenchen
URL: http://www.ibe.med.uni-muenchen.de
Mail: Markus.Schmidberger [at] ibe.med.uni-muenchen.de
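The reply leaves open how to combine the per-node results. One possible approach, sketched here under assumptions, is to have each node return only its PM matrix and to bind the matrices column-wise on the master. The function names `get_node_pm` and `combine_pm` are hypothetical and not part of affyPara; the sketch assumes the processed object is stored under the name "AffyBatch" on each node (as described in the reply) and that every node uses the same CDF, so that the rows refer to the same probes.

```r
# Worker function: runs on each node. Per the reply above, the processed
# object is assumed to live under the name "AffyBatch" in the node's
# global environment.
get_node_pm <- function() {
  if (!exists("AffyBatch", envir = .GlobalEnv)) return(NULL)
  ab <- get("AffyBatch", envir = .GlobalEnv)
  affy::pm(ab)  # PM probes x arrays matrix for this node's arrays
}

# Master-side helper (hypothetical name): bind per-node matrices column-wise.
combine_pm <- function(res) {
  res <- Filter(Negate(is.null), res)      # drop nodes that held no data
  if (length(res) == 0L) return(NULL)
  n_rows <- vapply(res, nrow, integer(1))
  stopifnot(length(unique(n_rows)) == 1L)  # every node must hold the same probes
  do.call(cbind, res)
}

# Usage on a snow cluster `cl` (not run here):
# res    <- clusterCall(cl, get_node_pm)
# pm_mat <- combine_pm(res)
```

The combined matrix still has to fit into the master's memory, but it carries only the PM intensities rather than a full AffyBatch.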
Dear Markus,

Thanks for a quick reply.

I tested affyPara 1.7.1. To get the summarization method 'none' to work, I had to slightly modify the summary.method check in the preproPara R code ("if ((any( express.summary.stat.methods()==summary.method) || summary.method == "none") == 0) stop("Unknown Summarization-Method")"). After this change, summary.method "none" seems to be functional in the preproPara function.

However, I bumped into another problem, which seems to be related to the affyPara warnings in build/check at
http://bioconductor.org/checkResults/2.6/bioc-LATEST/
I have not been able to solve this issue so far. Any solutions/updates/feedback would be appreciated.

You can find the output of my running example below.

with kind regards
Leo Lahti
leo.lahti@iki.fi
http://www.cis.hut.fi/lmlahti

    > require(affyPara)
    > require(hgu133plus2cdf)
    Loading required package: hgu133plus2cdf
    > require(hgu133plus2hsentrezgcdf)
    Loading required package: hgu133plus2hsentrezgcdf
    > cl <- makeCluster(3, type = "SOCK")
    > cels <- list.celfiles("/my/CEL/path", full.names = TRUE)
    >
    > eset <- preproPara(cels,
    +     bgcorrect = TRUE, bgcorrect.method = "rma",
    +     normalize = TRUE, normalize.method = "quantiles",
    +     pmcorrect.method = "pmonly",
    +     summary.method = "none",
    +     cdfname = "HGU133Plus2_Hs_ENTREZG", cluster = cl, verbose = TRUE)
    Partition of object
    Error in gsub("^/?([^/]*/)*", "", unlist(object), extended = TRUE) :
      unused argument(s) (extended = TRUE)

    > sessionInfo()
    R version 2.11.0 Under development (unstable) (2010-02-15 r51142)
    x86_64-unknown-linux-gnu

    locale:
     [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
     [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
     [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
     [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
     [9] LC_ADDRESS=C               LC_TELEPHONE=C
    [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

    attached base packages:
    [1] tcltk stats graphics grDevices utils datasets methods
    [8] base

    other attached packages:
    [1] hgu133plus2hsentrezgcdf_12.1.0 hgu133plus2cdf_2.5.0
    [3] affyPara_1.7.1                 aplpack_1.2.2
    [5] vsn_3.15.1                     snow_0.3-3
    [7] affy_1.25.2                    Biobase_2.7.4

    loaded via a namespace (and not attached):
    [1] affyio_1.15.2        grid_2.11.0          lattice_0.18-3
    [4] limma_3.3.9          preprocessCore_1.9.0 tools_2.11.0
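The modified check quoted in the post can be written more readably with `%in%`. The following is only a sketch of that idea, not the code that ships in affyPara: `check_summary_method` is a hypothetical helper, and `known` stands in for the vector that `express.summary.stat.methods()` from affy would return (the method names listed here are assumed for illustration).

```r
# Hypothetical helper: accept every known summarization method plus "none".
# `known` stands in for the result of affy's express.summary.stat.methods().
check_summary_method <- function(summary.method, known) {
  allowed <- c(known, "none")
  if (!(summary.method %in% allowed))
    stop("Unknown Summarization-Method")
  invisible(TRUE)
}

# Stand-in list of method names (an assumption for this example).
known <- c("avgdiff", "liwong", "mas", "medianpolish", "playerout")

check_summary_method("none", known)          # passes silently
check_summary_method("medianpolish", known)  # passes silently
```

An unknown name such as "bogus" would stop with the same "Unknown Summarization-Method" error as the original check.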
Leo Lahti wrote:
> I tested affyPara 1.7.1. To get the summarization method 'none' to
> work, I had to slightly modify the summary.method check in preproPara
> R code.
>
> However, I bumped into another problem which seems to be related to
> the affyPara warnings in build/check at
> http://bioconductor.org/checkResults/2.6/bioc-LATEST/
> I could not so far solve this issue so far. Any
> solutions/updates/feedback would be appreciated.

Do not use the unstable R 2.11 version. Please use version 2.10. I will check for these warnings as soon as there is a code freeze for 2.11.

I will add 'none' to the summary.method list. Thanks!

Best
Markus
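For readers who hit the same gsub() error: the message reported earlier in the thread says that gsub() in that R build does not accept an `extended` argument. Current R releases have no such argument either, and extended regular expressions are the default, so the same pattern works once the argument is dropped. The sketch below is an illustration only, not the fix that went into affyPara; `strip_dirs` and the file names are hypothetical.

```r
# Hypothetical helper: strip the directory part of each path using the
# pattern from the error message, without the `extended` argument.
strip_dirs <- function(paths) {
  gsub("^/?([^/]*/)*", "", unlist(paths))
}

# Made-up CEL file paths for illustration.
cels <- list("/my/CEL/path/a.CEL", "/my/CEL/path/b.CEL")

strip_dirs(cels)        # file names without their directories
basename(unlist(cels))  # base R alternative for the same task
```

For plain file paths like these, `basename()` gives the same result and avoids the regular expression altogether.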
Fixed everything in 1.7.2. Thanks for using affyPara and providing feedback.

Best
Markus