problem with createBeadSummaryData

Groot, Philip de ▴ 630

Hello all,

I encountered a problem with the createBeadSummaryData function in the beadarray library.

I use the Illumina example files that are available at: http://www.compbio.group.cam.ac.uk/Resources/illumina/BeadLevelExample.zip

I use the readIllumina() command to read the files into Bioconductor:

    x.illumina <- readIllumina(arrayNames = Illumina_Files_Listed, textType=".csv", useImages=FALSE, backgroundMethod="none", singleChannel=TRUE)

When I try to summarize the BeadLevel object, this works:

    x.norm <- createBeadSummaryData(x.illumina, log=FALSE, what="G", method="illumina")

However, I obtain weird exprs values:

    > exprs(x.norm)[1:5,]
          1475542113_A_1 1475542113_A_2 1475542113_B_1 1475542113_B_2
    50008       821.4375            NaN       854.6316            NaN
    50014      4154.5500            NaN      3782.0000            NaN
    50017     21880.9574            NaN     21404.0909            NaN
    50020       110.8542            NaN       116.9821            NaN
    50022       129.4857            NaN       107.9600            NaN

Every second column is completely NaN, although I correctly state that only the green channel should be used. Is this a bug in the createBeadSummaryData function?

If I do the summarization array by array, it makes more sense:

    x.norm.2 <- createBeadSummaryData(x.illumina, log=FALSE, what="G", method="illumina", arrays=2)

    > exprs(x.norm.2)[1:5]
    [1] 869.3913 112.5909 104.4375 116.5116 125.0000
    > arrayNames(x.illumina)[2]
    [1] "1475542113_A_2"

So: what causes this particular behaviour? Do I misunderstand something? What is the best way to get the summarization done properly? My goal is to create an ExpressionSetIllumina object on which I can perform normalization (quantile, vsn, etc.).

    > sessionInfo()
    R version 2.8.0 (2008-10-20)
    x86_64-unknown-linux-gnu

    locale:
    LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

    attached base packages:
    [1] tools     stats     graphics  grDevices utils     datasets  methods
    [8] base

    other attached packages:
     [1] vsn_3.8.0           affy_1.20.0         beadarray_1.10.0
     [4] sma_0.5.15          hwriter_1.0         geneplotter_1.20.0
     [7] annotate_1.20.1     xtable_1.5-4        AnnotationDbi_1.4.2
    [10] lattice_0.17-20     Biobase_2.2.1       limma_2.16.3

    loaded via a namespace (and not attached):
    [1] affyio_1.10.1        DBI_0.2-4            grid_2.8.0
    [4] KernSmooth_2.22-22   marray_1.20.0        preprocessCore_1.4.0
    [7] RColorBrewer_1.0-2   RSQLite_0.7-1        tcltk_2.8.0

Regards,

Dr. Philip de Groot Ph.D.
Bioinformatics Researcher

Wageningen University / TIFN
Nutrigenomics Consortium
Nutrition, Metabolism & Genomics Group
Division of Human Nutrition
PO Box 8129, 6700 EV Wageningen
Visiting Address: Erfelijkheidsleer: De Valk, Building 304
Dreijenweg 2, 6703 HA Wageningen
Room: 0052a
T: +31-317-485786
F: +31-317-483342
E-mail: Philip.deGroot at wur.nl
Internet: http://www.nutrigenomicsconsortium.nl
          http://humannutrition.wur.nl
          https://madmax.bioinformatics.nl
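To make the "every second column is NaN" observation concrete, here is a small base-R sketch that counts NaN entries per column. It rebuilds only the five rows printed above as a plain matrix (the object name `e` is made up for illustration), so it says nothing about the rest of the real exprs matrix; on the real object the same check could be applied to the matrix returned by exprs().

```r
# Values copied from the five rows of exprs() output shown in the question.
e <- matrix(
  c(  821.4375, NaN,   854.6316, NaN,
     4154.5500, NaN,  3782.0000, NaN,
    21880.9574, NaN, 21404.0909, NaN,
      110.8542, NaN,   116.9821, NaN,
      129.4857, NaN,   107.9600, NaN),
  nrow = 5, byrow = TRUE,
  dimnames = list(
    c("50008", "50014", "50017", "50020", "50022"),
    c("1475542113_A_1", "1475542113_A_2", "1475542113_B_1", "1475542113_B_2")
  )
)

# Fraction of NaN entries per column; a value of 1 means the whole column is NaN.
nan_fraction <- colMeans(is.nan(e))
print(nan_fraction)

# Names of the columns that are entirely NaN in these rows.
all_nan_cols <- names(nan_fraction)[nan_fraction == 1]
print(all_nan_cols)
```

A per-column check like this shows quickly whether the missing values follow the strip naming (_1 versus _2) or are scattered across arrays.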

Normalization vsn • 988 views

updated 15.2 years ago by Mark Dunning • written 15.2 years ago by Groot, Philip de

Mark Dunning ★ 1.1k

Sheffield, UK

Hi Philip,

The key to this problem is to set the 'imagesPerArray' argument. This dataset is quite an old one ("Humanv1"), and back then Illumina put different probes on the two stripes (e.g. A_1 and A_2) for each sample. Running createBeadSummaryData with the default of imagesPerArray=1 confuses beadarray in this case: it expects to find the same probes on A_1 and A_2 and produces NA values when it does not.

The following command should be used for this dataset:

    x.norm <- createBeadSummaryData(x.illumina, log=FALSE, what="G", method="illumina", imagesPerArray=2)

More modern Illumina data (v2 and v3) have the same set of probes on both stripes, so the default arguments should work.

Regards,
Mark

(In reply to the message from Groot, Philip de, sent Thu, Mar 5, 2009 at 3:04 PM.)
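The effect described in this answer can be imitated with a toy example in base R. This is only a sketch of the idea, not beadarray's actual implementation: the data frame `beads`, the probe IDs p1 to p6, and the strip labels are invented for illustration, and plain means stand in for the Illumina summary method.

```r
set.seed(1)

# Hypothetical bead-level data for one sample split over two strips,
# with a different set of probes on each strip (as on the old Humanv1 chips).
beads <- data.frame(
  strip     = rep(c("A_1", "A_2"), each = 6),
  probe     = c(rep(c("p1", "p2", "p3"), each = 2),   # probes found only on strip A_1
                rep(c("p4", "p5", "p6"), each = 2)),  # probes found only on strip A_2
  intensity = round(runif(12, 100, 1000)),
  stringsAsFactors = FALSE
)

# One column per strip: every probe is missing from one of the two strips,
# so each column is NA for half of the probes.
per_strip <- tapply(beads$intensity, list(beads$probe, beads$strip), mean)
print(per_strip)

# One column per sample: pool both strips before summarizing,
# so every probe gets a value.
beads$sample <- sub("_[12]$", "", beads$strip)
per_sample <- tapply(beads$intensity, list(beads$probe, beads$sample), mean)
print(per_sample)
```

In the toy output, the per-strip table has gaps wherever a probe does not occur on that strip, while the pooled table is complete. That is the pattern one would expect from summarizing with one image per array versus two.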

written 15.2 years ago by Mark Dunning
