xps - error in normalize - "Error: Length of non-varying units is zero."


Matthew Thornton ▴ 390

@matthew-thornton-5564


USA, Los Angeles, USC

Hello!

I am trying to optimize my data processing with xps. I am getting an error when using the normalize function. It could be due to improper switches. Here is the error:

    > data_norm <- normalize(data_bkgrd, "Normalize_Step2", filedir=outdir, tmpdir="", update = FALSE, select = "pmonly", exonlevel="all", method="mean", option = "transcript:all", logbase = "0", refindex = 0, refmethod = "mean", params = list(0.02, 0), verbose=TRUE)
    Opening file </data> in <read> mode...
    Creating new file </data>...
    Opening file </data> in <read> mode...
    Preprocessing data using method <preprocess>...
    Normalizing raw data...
    normalizing data using method <mean>...
    filling array <reference>...
    normalizing <ctr1_mix2_25apr14.int>...
    setting selector mask for typepm <16316>
    normalization <mean>: Scaling factor SF is <0.859736>
    normalizing <ctr2_mix2_25apr14.int>...
    setting selector mask for typepm <16316>
    Error: Length of non-varying units is zero.
    An error has occured: Need to abort current process.
    Error in .local(object, ...) : error in rwrapper function 'Normalize'

Here are the lines in my Rscript for piecewise processing. I am using the default settings, but it would be nice to know more about how to optimize them.

    # Background correct
    data_bkgrd <- bgcorrect(data_raw, "Background_Step1", filedir=outdir, tmpdir="",
                            method="sector", select="pmonly", option="correctbg",
                            params=c(0.02, 4, 4, 0), exonlevel="all", verbose=TRUE)

    png(file="Background_Correction_Density_Plot.png", width=600, height=600)
    par(mar=c(6,3,1,1));
    hist(data_bkgrd, add.legend=TRUE)
    dev.off()

    # Normalization
    data_norm <- normalize(data_bkgrd, "Normalize_Step2", filedir=outdir, tmpdir="",
                           update = FALSE, select = "all", exonlevel="all", method="mean",
                           option = "transcript:all", logbase = "0", refindex = 0,
                           refmethod = "mean", params = list(0.02, 0), verbose=TRUE)

    png(file="Normalization_Density_Plot.png", width=600, height=600)
    par(mar=c(6,3,1,1));
    hist(data_norm, add.legend=TRUE)
    dev.off()

    # Summarization
    data_sum <- summarize(data_norm, "Summary_Step3", filedir=outdir, tmpdir="",
                          update = FALSE, select="pmonly", method = "medianpolish",
                          option = "transcript", exonlevel="core+affx", verbose=TRUE)

    png(file="Summary_Density_Plot.png", width=600, height=600)
    par(mar=c(6,3,1,1));
    hist(data_sum, add.legend=TRUE)
    dev.off()

Any comments or advice are greatly appreciated!

Thanks!

Matt

matthew.thornton at med.usc.edu

Normalization PROcess xps • 1.7k views

updated 11.5 years ago by cstrato ★ 3.9k • written 11.6 years ago by Matthew Thornton ▴ 390


cstrato ★ 3.9k

@cstrato-908


Austria

Dear Matt,

When you do piecewise processing that does not follow the usual rma() or mas5() steps, it is important to read the vignette 'xpsPreprocess.pdf'. Even then, which option(s) you can use with which function depends on the type of array.

It is important to check the verbose output and to do some quality control after each step. For example, your script for background correction, i.e.:

    # Background correct
    data_bg_pm <- bgcorrect(data.genome, "Background_Step1", filedir=outdir, tmpdir="",
                            method="sector", select="pmonly", option="correctbg",
                            params=c(0.02, 4, 4, 0), exonlevel="all", verbose=TRUE)

results in the following output (I show only the most important part):

    background statistics:
    2598544 cells with minimal intensity 0
    2598544 cells with maximal intensity 0

This means that no background was subtracted. Creating an image of the background would have helped to spot this. In this case you need to use select="all":

    data_bkgrd <- bgcorrect(data.genome, "Background_All", filedir=outdir, tmpdir="",
                            method="sector", select="all", option="correctbg",
                            params=c(0.02, 4, 4, 0), exonlevel="all", verbose=TRUE)

Now you get the following output:

    background statistics:
    162409 cells with minimal intensity 25.1364
    162409 cells with maximal intensity 26.4606

Regarding the 'Probe-level Normalization' step, I am currently not sure what causes the error that you get. It does work for ivt-arrays, and 'quantile' normalization also works for whole genome arrays; I have just tested this again. However, for 'mean' normalization I can reproduce your error. I have to investigate; it may simply be a matter of finding the right parameters.

If you skip this step, you can do the Summarization as follows:

    # Summarization
    data_sum <- summarize(data_bkgrd, "Summary_Step3", filedir=outdir, tmpdir="", update = FALSE,
                          select="pmonly", method = "medianpolish", option = "transcript",
                          logbase="log2", params=c(10, 0.01, 1.0), exonlevel="core+affx", verbose=TRUE)

As a last step you could even do 'Probeset-level Normalization':

    data_norm <- normalize(data_sum, "Sum_Norm_Mean", filedir=outdir, tmpdir="", update = FALSE,
                           select = "separate", exonlevel="core+affx", method="mean", option = "transcript:all",
                           logbase = "0", refindex = 0, refmethod = "mean", params = list(0.02, 0), verbose=TRUE)

Best regards,
Christian

On 7/22/14 12:23 AM, Thornton, Matthew wrote:
> Hello!
>
> I am trying to optimize my data processing with xps. I am getting an error when using the normalize function. It could be due to improper switches.
> [...]
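The quality-control point in the answer above (background statistics of 0 for every cell mean that nothing was subtracted) can be checked programmatically. The following is a minimal sketch in base R, assuming the background values have already been pulled into a numeric matrix with one column per array, for example via the attachBgrd() and validBgrd() calls that appear in this thread. The helper name `background_is_empty` and the toy data are hypothetical and not part of xps.

```r
# Hypothetical helper (not part of xps): flags arrays whose estimated
# background is zero for every cell, i.e. nothing was subtracted.
background_is_empty <- function(bgrd) {
  bgrd <- as.matrix(bgrd)
  # One logical per column (array): TRUE if all background values are zero
  apply(bgrd, 2, function(x) all(x == 0, na.rm = TRUE))
}

# Toy data standing in for extracted background values (assumption):
# array1 mimics the select="pmonly" result, array2 the select="all" result.
bgrd <- cbind(
  array1 = c(0, 0, 0, 0),
  array2 = c(25.1364, 25.8, 26.2, 26.4606)
)

empty <- background_is_empty(bgrd)
print(empty)

if (any(empty)) {
  warning("No background was subtracted for: ",
          paste(names(empty)[empty], collapse = ", "))
}
```

Running a check like this right after bgcorrect() would catch the all-zero case before the normalization step, instead of relying on reading the verbose output by eye.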

written 11.6 years ago by cstrato ★ 3.9k


cstrato ★ 3.9k

@cstrato-908


Austria

Dear Matt,

Meanwhile I have investigated your problem with the 'Probe-level Normalization' step, and there was, sadly, a minor bug that allowed only the first array to be scaled and then caused an error. Interestingly, only whole genome arrays were affected, not ivt-arrays. Furthermore, mas5() was not affected either, since it does not include this step.

I have just uploaded version xps-1.24.1 to the BioC servers, so you should be able to download the new version on Monday.

Here is some demo code using the steps you wanted to do (using GSE46976):

    ### new R session: load library xps
    library(xps)

    ### load ROOT scheme file and ROOT data file
    scmdir <- "/Volumes/GigaDrive/CRAN/Workspaces/Schemes/na33"
    scheme.genome <- root.scheme(file.path(scmdir, "mogene20stv1.root"))
    datdir <- "/Volumes/GigaDrive/CRAN/Workspaces/ROOTData"
    data.genome <- root.data(scheme.genome, paste(datdir,"GSE46976_cel.root",sep="/"))

    outdir <- getwd()

    ## Background correct
    data_bg_all <- bgcorrect(data.genome, "tmp_Bgrd_All", filedir=outdir, tmpdir="",
                             method="sector", select="all", option="correctbg",
                             params=c(0.02, 4, 4, 0), exonlevel="all", verbose=TRUE)

    ## boxplot w/o need to attach data
    boxplot(data_bg_all, which="userinfo:fIntenQuant")

    ## get colnames of bgrd trees
    bgrdnames <- colnames(validBgrd(data_bg_all))
    bgrdnames

    ## images for bgrd
    image(data_bg_all, bg=TRUE, transfo=NULL, col=heat.colors(12),
          names=paste(namePart(bgrdnames[1]),"sbg",sep="."))
    image(data_bg_all, bg=TRUE, transfo=log2, col=heat.colors(12),
          names=paste(namePart(bgrdnames[1]),"sbg",sep="."))

    ## attach mask, bgrd and background-corrected intensities
    data_bg_all <- attachMask(data_bg_all)
    data_bg_all <- attachBgrd(data_bg_all)
    data_bg_all <- attachInten(data_bg_all)

    # density and boxplot
    hist(data_bg_all)
    boxplot(data_bg_all)

    ## to avoid memory consumption of R remove data:
    data_bg_all <- removeMask(data_bg_all)
    data_bg_all <- removeBgrd(data_bg_all)
    data_bg_all <- removeInten(data_bg_all)
    gc()

    ## Normalization
    data_norm <- normalize(data_bg_all, "tmp_Norm_All_mn", filedir=outdir, tmpdir="",
                           update = FALSE, select = "all", exonlevel="all", method="mean",
                           option = "transcript:all", logbase = "0", refindex = 0,
                           refmethod = "mean", params = list(0.02, 0), verbose=TRUE)

    ## attach mask and normalized intensities
    data_norm <- attachMask(data_norm)
    data_norm <- attachInten(data_norm)

    # boxplot and density plot
    boxplot(data_norm)
    hist(data_norm)

    ## remove
    data_norm <- removeMask(data_norm)
    data_norm <- removeInten(data_norm)

    # Summarization
    data_sum <- summarize(data_norm, "tmp_Sum_Norm_mn", filedir=outdir, tmpdir="",
                          update = FALSE, select="pmonly", method = "medianpolish",
                          option = "transcript", logbase="log2", params=c(10, 0.01, 1.0),
                          exonlevel="core+affx", verbose=TRUE)

    ## boxplot and density plot
    boxplot(data_sum)
    hist(data_sum)

    ## get expression data
    x <- validData(data_sum)
    head(x)

    ## scatter plot
    plot(log2(x[,2]), log2(x[,3]))

Best regards,
Christian
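For readers unfamiliar with what 'mean' normalization is meant to do, the log line "Scaling factor SF is <0.859736>" in the question suggests one scaling factor per array. The following base-R sketch illustrates the general idea of trimmed-mean scaling; it is not the xps implementation. It assumes that the first value in params = list(0.02, 0) is a trim fraction and that the reference level is the mean of the per-array trimmed means; both are assumptions, and the function name `scale_to_reference` and the simulated intensities are hypothetical.

```r
# Illustrative only: trimmed-mean scaling of arrays to a common reference.
# This is NOT the xps implementation; the reference choice is an assumption.
scale_to_reference <- function(inten, trim = 0.02) {
  inten <- as.matrix(inten)
  # Trimmed mean of each array (column)
  array_means <- apply(inten, 2, mean, trim = trim)
  # Reference level: mean of the per-array trimmed means (assumption)
  reference <- mean(array_means)
  # One scaling factor per array
  sf <- reference / array_means
  # Multiply every column by its own scaling factor
  scaled <- sweep(inten, 2, sf, `*`)
  list(sf = sf, scaled = scaled)
}

# Simulated intensities for three arrays with different overall levels
set.seed(1)
inten <- cbind(
  a = rexp(1000, rate = 1 / 100),
  b = rexp(1000, rate = 1 / 120),
  c = rexp(1000, rate = 1 / 90)
)

res <- scale_to_reference(inten)
print(res$sf)

# After scaling, all arrays share the same trimmed mean
scaled_means <- apply(res$scaled, 2, mean, trim = 0.02)
print(scaled_means)
```

The point relevant to the bug described above is that every array receives its own factor; a run in which only the first array is scaled would leave the remaining arrays at their original levels.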

written 11.5 years ago by cstrato ★ 3.9k
