Discrepancy on results from gcrma function and justGCRMA
1
0
Entering edit mode
Jenny Drnevich ★ 2.0k
@jenny-drnevich-2812
Last seen 6 weeks ago
United States
Hi everyone, I just spent several hours tracking down this problem, as I noticed this same discrepancy between the results of justGCRMA() and gcrma() called on the same data. There was also another post about this on Nov 4, 2008, but I couldn't find where either of them had ever been answered. HOWEVER, it appears that the discrepancy has been fixed in R 2.10.1 (gcrma 2.18.1), but it was in R 2.10.0 (gcrma 2.18.0) (examples and sessionInfos below). While I'm glad it has been "fixed", where would this have been documented? I didn't think this type of change would have been included in a minor version upgrade. What was the explanation for the original discrepancy? Thanks, Jenny #The examples below are not reproducible because you need .CEL files - run them on any that you have. > library(gcrma) Loading required package: affy Loading required package: Biobase Welcome to Bioconductor Vignettes contain introductory material. To view, type 'openVignette()'. To cite Bioconductor, see 'citation("Biobase")' and for packages 'citation(pkgname)'. > setwd("K:/Bulla/CELfiles") > raw <- ReadAffy() > raw AffyBatch object size of arrays=834x834 features (10 kb) cdf=Rat230_2 (31099 affyids) number of samples=6 number of genes=31099 annotation=rat2302 notes= > > gcrma.1 <- gcrma(raw) Adjusting for optical effect......Done. Computing affinitiesLoading required package: AnnotationDbi .Done. Adjusting for non-specific binding......Done. Normalizing Calculating Expression > > gcrma.2 <- justGCRMA() Computing affinities.Done. Adjusting for optical effect.......Done. Adjusting for non-specific binding......Done. Normalizing Calculating Expression > > all.equal(exprs(gcrma.1),exprs(gcrma.2)) [1] "Mean relative difference: 0.03514035" > > sessionInfo() R version 2.10.0 (2009-10-26) i386-pc-mingw32 locale: [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] rat2302probe_2.5.0 AnnotationDbi_1.8.1 rat2302cdf_2.5.0 gcrma_2.18.0 [5] affy_1.24.2 Biobase_2.6.0 loaded via a namespace (and not attached): [1] affyio_1.14.0 Biostrings_2.14.3 DBI_0.2-4 IRanges_1.4.4 [5] preprocessCore_1.8.0 RSQLite_0.7-3 splines_2.10.0 tools_2.10.0 > #Now switch R versions but run same code.... > library(gcrma) Loading required package: affy Loading required package: Biobase Welcome to Bioconductor Vignettes contain introductory material. To view, type 'openVignette()'. To cite Bioconductor, see 'citation("Biobase")' and for packages 'citation(pkgname)'. > setwd("K:/Bulla/CELfiles") > raw <- ReadAffy() > raw AffyBatch object size of arrays=834x834 features (10 kb) cdf=Rat230_2 (31099 affyids) number of samples=6 number of genes=31099 annotation=rat2302 notes= > > gcrma.1 <- gcrma(raw) Adjusting for optical effect......Done. Computing affinitiesLoading required package: AnnotationDbi .Done. Adjusting for non-specific binding......Done. Normalizing Calculating Expression > > gcrma.2 <- justGCRMA() Computing affinities.Done. Adjusting for optical effect.......Done. Adjusting for non-specific binding......Done. Normalizing Calculating Expression > > all.equal(exprs(gcrma.1),exprs(gcrma.2)) [1] TRUE > > sessionInfo() R version 2.10.1 (2009-12-14) i386-pc-mingw32 locale: [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] rat2302probe_2.5.0 AnnotationDbi_1.8.1 rat2302cdf_2.5.0 gcrma_2.18.1 [5] affy_1.24.2 Biobase_2.6.1 loaded via a namespace (and not attached): [1] affyio_1.14.0 Biostrings_2.14.10 DBI_0.2-5 IRanges_1.4.9 [5] preprocessCore_1.8.0 RSQLite_0.8-0 splines_2.10.1 tools_2.10.1 > At 04:31 PM 10/28/2009, Jin wrote: >Jerry <norn2k at="" ...=""> writes: > > Hi, > ? > I'm >currently using gcrma package in R to summarize >probeset intensities from CEL files. >Surprisingly, > I? found that the results >generated from gcrma function and justGCRMA >function are quite different. In > general,? >expression values from gcrma function? are lower >and? the boxplots from gcrma are? quite > >asymmetric (no tails? on the bottom).? I >attached some plots below for your information. >This confused > me as I thought that the two >functions implemented similar algorithms.? > >? > Thank you so much for your help! > ? > >Best, > Jianjun > > PS: The package I used is >gcrma 2.12.1. I also observed similar results >on? 2.14 version. > > Hello, I'm running >into the same problem with gcrma and justGCRMA. >They are not giving the same numeric results. Is >this an intended feature? I was under the >impression that these two methods were identical >with the exception that justGCRMA was more >memory efficient because it didn't have to >create an AffyBatch object. Can anyone shed some >light on this subject? Thanks, Jin >_______________________________________________ >Bioconductor mailing list >Bioconductor at stat.math.ethz.ch >https://stat.ethz.ch/mailman/listinfo/bioconductor >Search the archives: >http://news.gmane.org/gmane.science.biology.informatics.conductor Jenny Drnevich, Ph.D. Functional Genomics Bioinformatics Specialist W.M. Keck Center for Comparative and Functional Genomics Roy J. Carver Biotechnology Center University of Illinois, Urbana-Champaign 330 ERML 1201 W. Gregory Dr. Urbana, IL 61801 USA ph: 217-244-7355 fax: 217-265-5066 e-mail: drnevich at illinois.edu
gcrma GLAD gcrma GLAD • 1.1k views
ADD COMMENT
0
Entering edit mode
@james-w-macdonald-5106
Last seen 3 days ago
United States
Hi Jenny, Jenny Drnevich wrote: > Hi everyone, > > I just spent several hours tracking down this problem, as I noticed this > same discrepancy between the results of justGCRMA() and gcrma() called > on the same data. There was also another post about this on Nov 4, 2008, > but I couldn't find where either of them had ever been answered. I think I missed that one, but the question was asked again and answered by me late last year. > HOWEVER, it appears that the discrepancy has been fixed in R 2.10.1 > (gcrma 2.18.1), but it was in R 2.10.0 (gcrma 2.18.0) (examples and > sessionInfos below). > > While I'm glad it has been "fixed", where would this have been > documented? I didn't think this type of change would have been included > in a minor version upgrade. What was the explanation for the original > discrepancy? The explanation is that justGCRMA() is a function that I wrote several years ago. In the intervening period the defaults for gcrma() were changed without making the corresponding changes to justGCRMA(). I made the (unfounded) assumption that the maintainer of the gcrma package (Jean Wu) was keeping justGCRMA() consistent with gcrma(), which as you have seen is not true. Since this was an oversight by both me and Jean, it wasn't documented anywhere. I have since made the two functions consistent when running under the default arguments. However, there are some arguments for gcrma() that are not available for justGCRMA(), so they are not consistent in all cases. Best, Jim > > Thanks, > Jenny > > #The examples below are not reproducible because you need .CEL files - > run them on any that you have. > > > library(gcrma) > Loading required package: affy > Loading required package: Biobase > > Welcome to Bioconductor > > Vignettes contain introductory material. To view, type > 'openVignette()'. To cite Bioconductor, see > 'citation("Biobase")' and for packages 'citation(pkgname)'. > > > setwd("K:/Bulla/CELfiles") > > raw <- ReadAffy() > > raw > AffyBatch object > size of arrays=834x834 features (10 kb) > cdf=Rat230_2 (31099 affyids) > number of samples=6 > number of genes=31099 > annotation=rat2302 > notes= > > > > gcrma.1 <- gcrma(raw) > Adjusting for optical effect......Done. > Computing affinitiesLoading required package: AnnotationDbi > .Done. > Adjusting for non-specific binding......Done. > Normalizing > Calculating Expression > > > > gcrma.2 <- justGCRMA() > Computing affinities.Done. > Adjusting for optical effect.......Done. > Adjusting for non-specific binding......Done. > Normalizing > Calculating Expression > > > > all.equal(exprs(gcrma.1),exprs(gcrma.2)) > [1] "Mean relative difference: 0.03514035" > > > > sessionInfo() > R version 2.10.0 (2009-10-26) > i386-pc-mingw32 > > locale: > [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United > States.1252 > [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C > [5] LC_TIME=English_United States.1252 > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > other attached packages: > [1] rat2302probe_2.5.0 AnnotationDbi_1.8.1 rat2302cdf_2.5.0 > gcrma_2.18.0 > [5] affy_1.24.2 Biobase_2.6.0 > > loaded via a namespace (and not attached): > [1] affyio_1.14.0 Biostrings_2.14.3 DBI_0.2-4 > IRanges_1.4.4 > [5] preprocessCore_1.8.0 RSQLite_0.7-3 splines_2.10.0 > tools_2.10.0 > > > > #Now switch R versions but run same code.... > > > library(gcrma) > Loading required package: affy > Loading required package: Biobase > > Welcome to Bioconductor > > Vignettes contain introductory material. To view, type > 'openVignette()'. To cite Bioconductor, see > 'citation("Biobase")' and for packages 'citation(pkgname)'. > > > setwd("K:/Bulla/CELfiles") > > raw <- ReadAffy() > > raw > AffyBatch object > size of arrays=834x834 features (10 kb) > cdf=Rat230_2 (31099 affyids) > number of samples=6 > number of genes=31099 > annotation=rat2302 > notes= > > > > gcrma.1 <- gcrma(raw) > Adjusting for optical effect......Done. > Computing affinitiesLoading required package: AnnotationDbi > .Done. > Adjusting for non-specific binding......Done. > Normalizing > Calculating Expression > > > > gcrma.2 <- justGCRMA() > Computing affinities.Done. > Adjusting for optical effect.......Done. > Adjusting for non-specific binding......Done. > Normalizing > Calculating Expression > > > > all.equal(exprs(gcrma.1),exprs(gcrma.2)) > [1] TRUE > > > > sessionInfo() > R version 2.10.1 (2009-12-14) > i386-pc-mingw32 > > locale: > [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United > States.1252 > [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C > [5] LC_TIME=English_United States.1252 > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > other attached packages: > [1] rat2302probe_2.5.0 AnnotationDbi_1.8.1 rat2302cdf_2.5.0 > gcrma_2.18.1 > [5] affy_1.24.2 Biobase_2.6.1 > > loaded via a namespace (and not attached): > [1] affyio_1.14.0 Biostrings_2.14.10 DBI_0.2-5 > IRanges_1.4.9 > [5] preprocessCore_1.8.0 RSQLite_0.8-0 splines_2.10.1 > tools_2.10.1 > > > > > > > At 04:31 PM 10/28/2009, Jin wrote: >> Jerry <norn2k at="" ...=""> writes: > > Hi, > ? > I'm currently using gcrma >> package in R to summarize probeset intensities from CEL files. >> Surprisingly, > I? found that the results generated from gcrma >> function and justGCRMA function are quite different. In > general,? >> expression values from gcrma function? are lower and? the boxplots >> from gcrma are? quite > asymmetric (no tails? on the bottom).? I >> attached some plots below for your information. This confused > me as >> I thought that the two functions implemented similar algorithms.? > >> ? > Thank you so much for your help! > ? > Best, > Jianjun > > PS: >> The package I used is gcrma 2.12.1. I also observed similar results >> on? 2.14 version. > > Hello, I'm running into the same problem >> with gcrma and justGCRMA. They are not giving the same numeric >> results. Is this an intended feature? I was under the impression that >> these two methods were identical with the exception that justGCRMA was >> more memory efficient because it didn't have to create an AffyBatch >> object. Can anyone shed some light on this subject? Thanks, Jin >> _______________________________________________ Bioconductor mailing >> list Bioconductor at stat.math.ethz.ch >> https://stat.ethz.ch/mailman/listinfo/bioconductor Search the >> archives: >> http://news.gmane.org/gmane.science.biology.informatics.conductor > > Jenny Drnevich, Ph.D. > > Functional Genomics Bioinformatics Specialist > W.M. Keck Center for Comparative and Functional Genomics > Roy J. Carver Biotechnology Center > University of Illinois, Urbana-Champaign > > 330 ERML > 1201 W. Gregory Dr. > Urbana, IL 61801 > USA > > ph: 217-244-7355 > fax: 217-265-5066 > e-mail: drnevich at illinois.edu -- James W. MacDonald, M.S. Biostatistician Douglas Lab University of Michigan Department of Human Genetics 5912 Buhl 1241 E. Catherine St. Ann Arbor MI 48109-5618 734-615-7826 ********************************************************** Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues
ADD COMMENT
0
Entering edit mode
Thanks Jim, Thanks for the info. I found the question that was posted last last year (and responded to it to start the thread again), but I still can't find your response on BioC. Perhaps it accidently didn't get sent back to the list? :) Jenny At 04:31 PM 2/3/2010, James W. MacDonald wrote: >Hi Jenny, > >Jenny Drnevich wrote: >>Hi everyone, >>I just spent several hours tracking down this >>problem, as I noticed this same discrepancy >>between the results of justGCRMA() and gcrma() >>called on the same data. There was also another >>post about this on Nov 4, 2008, but I couldn't >>find where either of them had ever been answered. > >I think I missed that one, but the question was >asked again and answered by me late last year. > >>HOWEVER, it appears that the discrepancy has >>been fixed in R 2.10.1 (gcrma 2.18.1), but it >>was in R 2.10.0 (gcrma 2.18.0) (examples and sessionInfos below). >>While I'm glad it has been "fixed", where would >>this have been documented? I didn't think this >>type of change would have been included in a >>minor version upgrade. What was the explanation for the original discrepancy? > >The explanation is that justGCRMA() is a >function that I wrote several years ago. In the >intervening period the defaults for gcrma() were >changed without making the corresponding changes to justGCRMA(). > >I made the (unfounded) assumption that the >maintainer of the gcrma package (Jean Wu) was >keeping justGCRMA() consistent with gcrma(), >which as you have seen is not true. Since this >was an oversight by both me and Jean, it wasn't documented anywhere. > >I have since made the two functions consistent >when running under the default arguments. >However, there are some arguments for gcrma() >that are not available for justGCRMA(), so they >are not consistent in all cases. > >Best, > >Jim > > >>Thanks, >>Jenny >>#The examples below are not reproducible >>because you need .CEL files - run them on any that you have. >> > library(gcrma) >>Loading required package: affy >>Loading required package: Biobase >>Welcome to Bioconductor >> Vignettes contain introductory material. To view, type >> 'openVignette()'. To cite Bioconductor, see >> 'citation("Biobase")' and for packages 'citation(pkgname)'. >> > setwd("K:/Bulla/CELfiles") >> > raw <- ReadAffy() >> > raw >>AffyBatch object >>size of arrays=834x834 features (10 kb) >>cdf=Rat230_2 (31099 affyids) >>number of samples=6 >>number of genes=31099 >>annotation=rat2302 >>notes= >> > >> > gcrma.1 <- gcrma(raw) >>Adjusting for optical effect......Done. >>Computing affinitiesLoading required package: AnnotationDbi >>.Done. >>Adjusting for non-specific binding......Done. >>Normalizing >>Calculating Expression >> > >> > gcrma.2 <- justGCRMA() >>Computing affinities.Done. >>Adjusting for optical effect.......Done. >>Adjusting for non-specific binding......Done. >>Normalizing >>Calculating Expression >> > >> > all.equal(exprs(gcrma.1),exprs(gcrma.2)) >>[1] "Mean relative difference: 0.03514035" >> > >> > sessionInfo() >>R version 2.10.0 (2009-10-26) >>i386-pc-mingw32 >>locale: >>[1] LC_COLLATE=English_United >>States.1252 LC_CTYPE=English_United States.1252 >>[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C >>[5] LC_TIME=English_United States.1252 >>attached base packages: >>[1] stats graphics grDevices utils datasets methods base >>other attached packages: >>[1] rat2302probe_2.5.0 AnnotationDbi_1.8.1 rat2302cdf_2.5.0 >>gcrma_2.18.0 >>[5] affy_1.24.2 Biobase_2.6.0 >>loaded via a namespace (and not attached): >>[1] affyio_1.14.0 Biostrings_2.14.3 DBI_0.2-4 >>IRanges_1.4.4 >>[5] preprocessCore_1.8.0 RSQLite_0.7-3 splines_2.10.0 >>tools_2.10.0 >> > >>#Now switch R versions but run same code.... >> > library(gcrma) >>Loading required package: affy >>Loading required package: Biobase >>Welcome to Bioconductor >> Vignettes contain introductory material. To view, type >> 'openVignette()'. To cite Bioconductor, see >> 'citation("Biobase")' and for packages 'citation(pkgname)'. >> > setwd("K:/Bulla/CELfiles") >> > raw <- ReadAffy() >> > raw >>AffyBatch object >>size of arrays=834x834 features (10 kb) >>cdf=Rat230_2 (31099 affyids) >>number of samples=6 >>number of genes=31099 >>annotation=rat2302 >>notes= >> > >> > gcrma.1 <- gcrma(raw) >>Adjusting for optical effect......Done. >>Computing affinitiesLoading required package: AnnotationDbi >>.Done. >>Adjusting for non-specific binding......Done. >>Normalizing >>Calculating Expression >> > >> > gcrma.2 <- justGCRMA() >>Computing affinities.Done. >>Adjusting for optical effect.......Done. >>Adjusting for non-specific binding......Done. >>Normalizing >>Calculating Expression >> > >> > all.equal(exprs(gcrma.1),exprs(gcrma.2)) >>[1] TRUE >> > >> > sessionInfo() >>R version 2.10.1 (2009-12-14) >>i386-pc-mingw32 >>locale: >>[1] LC_COLLATE=English_United >>States.1252 LC_CTYPE=English_United States.1252 >>[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C >>[5] LC_TIME=English_United States.1252 >>attached base packages: >>[1] stats graphics grDevices utils datasets methods base >>other attached packages: >>[1] rat2302probe_2.5.0 AnnotationDbi_1.8.1 rat2302cdf_2.5.0 >>gcrma_2.18.1 >>[5] affy_1.24.2 Biobase_2.6.1 >>loaded via a namespace (and not attached): >>[1] affyio_1.14.0 Biostrings_2.14.10 DBI_0.2-5 >>IRanges_1.4.9 >>[5] preprocessCore_1.8.0 RSQLite_0.8-0 splines_2.10.1 >>tools_2.10.1 >> > >> >> >>At 04:31 PM 10/28/2009, Jin wrote: >>>Jerry <norn2k at="" ...=""> writes: > > Hi, > ? > I'm >>>currently using gcrma package in R to >>>summarize probeset intensities from CEL files. >>>Surprisingly, > I? found that the results >>>generated from gcrma function and justGCRMA >>>function are quite different. In > general,? >>>expression values from gcrma function? are >>>lower and? the boxplots from gcrma are? >>>quite > asymmetric (no tails? on the bottom).? >>>I attached some plots below for your >>>information. This confused > me as I thought >>>that the two functions implemented similar >>>algorithms.? > ? > Thank you so much for >>>your help! > ? > Best, > Jianjun > > PS: The >>>package I used is gcrma 2.12.1. I also >>>observed similar results on? 2.14 >>>version. > > Hello, I'm running into the >>>same problem with gcrma and justGCRMA. They >>>are not giving the same numeric results. Is >>>this an intended feature? I was under the >>>impression that these two methods were >>>identical with the exception that justGCRMA >>>was more memory efficient because it didn't >>>have to create an AffyBatch object. Can anyone >>>shed some light on this subject? Thanks, Jin >>>_______________________________________________ >>> Bioconductor mailing list >>>Bioconductor at stat.math.ethz.ch >>>https://stat.ethz.ch/mailman/listinfo/bioconductor >>>Search the archives: >>>http://news.gmane.org/gmane.science.biology.informatics.conductor >>Jenny Drnevich, Ph.D. >>Functional Genomics Bioinformatics Specialist >>W.M. Keck Center for Comparative and Functional Genomics >>Roy J. Carver Biotechnology Center >>University of Illinois, Urbana-Champaign >>330 ERML >>1201 W. Gregory Dr. >>Urbana, IL 61801 >>USA >>ph: 217-244-7355 >>fax: 217-265-5066 >>e-mail: drnevich at illinois.edu > >-- >James W. MacDonald, M.S. >Biostatistician >Douglas Lab >University of Michigan >Department of Human Genetics >5912 Buhl >1241 E. Catherine St. >Ann Arbor MI 48109-5618 >734-615-7826 >********************************************************** >Electronic Mail is not secure, may not be read >every day, and should not be used for urgent or sensitive issues Jenny Drnevich, Ph.D. Functional Genomics Bioinformatics Specialist W.M. Keck Center for Comparative and Functional Genomics Roy J. Carver Biotechnology Center University of Illinois, Urbana-Champaign 330 ERML 1201 W. Gregory Dr. Urbana, IL 61801 USA ph: 217-244-7355 fax: 217-265-5066 e-mail: drnevich at illinois.edu
ADD REPLY
0
Entering edit mode
Hi Jenny, It wasn't posted to the list because Jin got tired of waiting for a response from the list and emailed me directly, so the correspondence was just between the two of us. It was my oversight to keep the conversation off-list. Below is the transcript of our conversation, for the archives (and apologies for keeping this 'secret'). Hi Jin, I just checked in a modification to justGCRMA that results in identical output as compared to gcrma. There have been quite a few modifications to gcrma since I originally wrote justGCRMA, and it is no longer possible for justGCRMA to do everything that one can do with gcrma. Specifically, in the past gcrma used the MM probes on a chip to estimate background, but this has been changed so the default is to use internal estimates that come with the package. The functionality to use MM probes (or a subset thereof) still exists, but it is too much work to modify justGCRMA to use the MM probes, not to mention the headache of trying to maintain the code. So justGCRMA will only use the internal NSB estimates that come with the package, and will give identical results to gcrma as long as you use the internal NSB estimates for both functions. The new version of gcrma is 2.18.1, and should be available for installation within the next couple of days. Best, Jim >>> >>> Jin <jinxing1986 at="" gmail.com=""> wrote: > > Hello Dr. MacDonald, > > > > I was wondering if you could help me with a problem. I've ran my > > dataset through GCRMA and justGCRMA but I've gotten different > > numerical results from both. I was under the impression that these two > > functions were identical except justGCRMA was more memory efficient > > because it did not have to create an AffyBatch object. I used the > > default parameters for both meaning I did not include any parameters > > at all when calling gcrma() and justgcrma(). > > > > Are they suppose to give identical numerical results? > > > > For example, below are some sample results. The first row is GCRMA and > > the second row is justGCRMA. The columns are my 8 arrays. JustGCRMA > > values are consistently higher than GCRMA values whereas I would have > > expected similar numbers matching the top and bottom rows. > > > > 3.58 3.80 3.83 4.52 3.91 5.17 3.78 6.45 > > 4.79 4.73 5.21 5.68 5.44 5.88 5.11 7.76 > > > > > > > > Thanks, > > Jin > > University of California, San Diego Jenny Drnevich wrote: > Thanks Jim, > > Thanks for the info. I found the question that was posted last last year > (and responded to it to start the thread again), but I still can't find > your response on BioC. Perhaps it accidently didn't get sent back to the > list? > > :) > Jenny > > At 04:31 PM 2/3/2010, James W. MacDonald wrote: >> Hi Jenny, >> >> Jenny Drnevich wrote: >>> Hi everyone, >>> I just spent several hours tracking down this problem, as I noticed >>> this same discrepancy between the results of justGCRMA() and gcrma() >>> called on the same data. There was also another post about this on >>> Nov 4, 2008, but I couldn't find where either of them had ever been >>> answered. >> >> I think I missed that one, but the question was asked again and >> answered by me late last year. >> >>> HOWEVER, it appears that the discrepancy has been fixed in R 2.10.1 >>> (gcrma 2.18.1), but it was in R 2.10.0 (gcrma 2.18.0) (examples and >>> sessionInfos below). >>> While I'm glad it has been "fixed", where would this have been >>> documented? I didn't think this type of change would have been >>> included in a minor version upgrade. What was the explanation for the >>> original discrepancy? >> >> The explanation is that justGCRMA() is a function that I wrote several >> years ago. In the intervening period the defaults for gcrma() were >> changed without making the corresponding changes to justGCRMA(). >> >> I made the (unfounded) assumption that the maintainer of the gcrma >> package (Jean Wu) was keeping justGCRMA() consistent with gcrma(), >> which as you have seen is not true. Since this was an oversight by >> both me and Jean, it wasn't documented anywhere. >> >> I have since made the two functions consistent when running under the >> default arguments. However, there are some arguments for gcrma() that >> are not available for justGCRMA(), so they are not consistent in all >> cases. >> >> Best, >> >> Jim >> >> >>> Thanks, >>> Jenny >>> #The examples below are not reproducible because you need .CEL files >>> - run them on any that you have. >>> > library(gcrma) >>> Loading required package: affy >>> Loading required package: Biobase >>> Welcome to Bioconductor >>> Vignettes contain introductory material. To view, type >>> 'openVignette()'. To cite Bioconductor, see >>> 'citation("Biobase")' and for packages 'citation(pkgname)'. >>> > setwd("K:/Bulla/CELfiles") >>> > raw <- ReadAffy() >>> > raw >>> AffyBatch object >>> size of arrays=834x834 features (10 kb) >>> cdf=Rat230_2 (31099 affyids) >>> number of samples=6 >>> number of genes=31099 >>> annotation=rat2302 >>> notes= >>> > >>> > gcrma.1 <- gcrma(raw) >>> Adjusting for optical effect......Done. >>> Computing affinitiesLoading required package: AnnotationDbi >>> .Done. >>> Adjusting for non-specific binding......Done. >>> Normalizing >>> Calculating Expression >>> > >>> > gcrma.2 <- justGCRMA() >>> Computing affinities.Done. >>> Adjusting for optical effect.......Done. >>> Adjusting for non-specific binding......Done. >>> Normalizing >>> Calculating Expression >>> > >>> > all.equal(exprs(gcrma.1),exprs(gcrma.2)) >>> [1] "Mean relative difference: 0.03514035" >>> > >>> > sessionInfo() >>> R version 2.10.0 (2009-10-26) >>> i386-pc-mingw32 >>> locale: >>> [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United >>> States.1252 >>> [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C >>> [5] LC_TIME=English_United States.1252 >>> attached base packages: >>> [1] stats graphics grDevices utils datasets methods base >>> other attached packages: >>> [1] rat2302probe_2.5.0 AnnotationDbi_1.8.1 rat2302cdf_2.5.0 >>> gcrma_2.18.0 >>> [5] affy_1.24.2 Biobase_2.6.0 >>> loaded via a namespace (and not attached): >>> [1] affyio_1.14.0 Biostrings_2.14.3 DBI_0.2-4 >>> IRanges_1.4.4 >>> [5] preprocessCore_1.8.0 RSQLite_0.7-3 splines_2.10.0 >>> tools_2.10.0 >>> > >>> #Now switch R versions but run same code.... >>> > library(gcrma) >>> Loading required package: affy >>> Loading required package: Biobase >>> Welcome to Bioconductor >>> Vignettes contain introductory material. To view, type >>> 'openVignette()'. To cite Bioconductor, see >>> 'citation("Biobase")' and for packages 'citation(pkgname)'. >>> > setwd("K:/Bulla/CELfiles") >>> > raw <- ReadAffy() >>> > raw >>> AffyBatch object >>> size of arrays=834x834 features (10 kb) >>> cdf=Rat230_2 (31099 affyids) >>> number of samples=6 >>> number of genes=31099 >>> annotation=rat2302 >>> notes= >>> > >>> > gcrma.1 <- gcrma(raw) >>> Adjusting for optical effect......Done. >>> Computing affinitiesLoading required package: AnnotationDbi >>> .Done. >>> Adjusting for non-specific binding......Done. >>> Normalizing >>> Calculating Expression >>> > >>> > gcrma.2 <- justGCRMA() >>> Computing affinities.Done. >>> Adjusting for optical effect.......Done. >>> Adjusting for non-specific binding......Done. >>> Normalizing >>> Calculating Expression >>> > >>> > all.equal(exprs(gcrma.1),exprs(gcrma.2)) >>> [1] TRUE >>> > >>> > sessionInfo() >>> R version 2.10.1 (2009-12-14) >>> i386-pc-mingw32 >>> locale: >>> [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United >>> States.1252 >>> [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C >>> [5] LC_TIME=English_United States.1252 >>> attached base packages: >>> [1] stats graphics grDevices utils datasets methods base >>> other attached packages: >>> [1] rat2302probe_2.5.0 AnnotationDbi_1.8.1 rat2302cdf_2.5.0 >>> gcrma_2.18.1 >>> [5] affy_1.24.2 Biobase_2.6.1 >>> loaded via a namespace (and not attached): >>> [1] affyio_1.14.0 Biostrings_2.14.10 DBI_0.2-5 >>> IRanges_1.4.9 >>> [5] preprocessCore_1.8.0 RSQLite_0.8-0 splines_2.10.1 >>> tools_2.10.1 >>> > >>> >>> >>> At 04:31 PM 10/28/2009, Jin wrote: >>>> Jerry <norn2k at="" ...=""> writes: > > Hi, > ? > I'm currently using gcrma >>>> package in R to summarize probeset intensities from CEL files. >>>> Surprisingly, > I? found that the results generated from gcrma >>>> function and justGCRMA function are quite different. In > general,? >>>> expression values from gcrma function? are lower and? the boxplots >>>> from gcrma are? quite > asymmetric (no tails? on the bottom).? I >>>> attached some plots below for your information. This confused > me >>>> as I thought that the two functions implemented similar >>>> algorithms.? > ? > Thank you so much for your help! > ? > Best, > >>>> Jianjun > > PS: The package I used is gcrma 2.12.1. I also observed >>>> similar results on? 2.14 version. > > Hello, I'm running into >>>> the same problem with gcrma and justGCRMA. They are not giving the >>>> same numeric results. Is this an intended feature? I was under the >>>> impression that these two methods were identical with the exception >>>> that justGCRMA was more memory efficient because it didn't have to >>>> create an AffyBatch object. Can anyone shed some light on this >>>> subject? Thanks, Jin _______________________________________________ >>>> Bioconductor mailing list Bioconductor at stat.math.ethz.ch >>>> https://stat.ethz.ch/mailman/listinfo/bioconductor Search the >>>> archives: >>>> http://news.gmane.org/gmane.science.biology.informatics.conductor >>> Jenny Drnevich, Ph.D. >>> Functional Genomics Bioinformatics Specialist >>> W.M. Keck Center for Comparative and Functional Genomics >>> Roy J. Carver Biotechnology Center >>> University of Illinois, Urbana-Champaign >>> 330 ERML >>> 1201 W. Gregory Dr. >>> Urbana, IL 61801 >>> USA >>> ph: 217-244-7355 >>> fax: 217-265-5066 >>> e-mail: drnevich at illinois.edu >> >> -- >> James W. MacDonald, M.S. >> Biostatistician >> Douglas Lab >> University of Michigan >> Department of Human Genetics >> 5912 Buhl >> 1241 E. Catherine St. >> Ann Arbor MI 48109-5618 >> 734-615-7826 >> ********************************************************** >> Electronic Mail is not secure, may not be read every day, and should >> not be used for urgent or sensitive issues > > Jenny Drnevich, Ph.D. > > Functional Genomics Bioinformatics Specialist > W.M. Keck Center for Comparative and Functional Genomics > Roy J. Carver Biotechnology Center > University of Illinois, Urbana-Champaign > > 330 ERML > 1201 W. Gregory Dr. > Urbana, IL 61801 > USA > > ph: 217-244-7355 > fax: 217-265-5066 > e-mail: drnevich at illinois.edu -- James W. MacDonald, M.S. Biostatistician Douglas Lab University of Michigan Department of Human Genetics 5912 Buhl 1241 E. Catherine St. Ann Arbor MI 48109-5618 734-615-7826 ********************************************************** Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues
ADD REPLY
0
Entering edit mode
FYI, the aroma.affymetrix package can reproduce the gcrma() estimates with great precision, cf. http://www.aroma-project.org/replication/gcRMA It does this by utilizing the gcrma functions while running in bounded memory (~1 GB RAM) regardless of the number of arrays. /Henrik On Thu, Feb 4, 2010 at 2:56 PM, James W. MacDonald <jmacdon at="" med.umich.edu=""> wrote: > Hi Jenny, > > It wasn't posted to the list because Jin got tired of waiting for a response > from the list and emailed me directly, so the correspondence was just > between the two of us. > > It was my oversight to keep the conversation off-list. Below is the > transcript of our conversation, for the archives (and apologies for keeping > this 'secret'). > > Hi Jin, > > I just checked in a modification to justGCRMA that results in identical > output as compared to gcrma. There have been quite a few modifications to > gcrma since I originally wrote justGCRMA, and it is no longer possible for > justGCRMA to do everything that one can do with gcrma. > > Specifically, in the past gcrma used the MM probes on a chip to estimate > background, but this has been changed so the default is to use internal > estimates that come with the package. The functionality to use MM probes (or > a subset thereof) still exists, but it is too much work to modify justGCRMA > to use the MM probes, not to mention the headache of trying to maintain the > code. So justGCRMA will only use the internal NSB estimates that come with > the package, and will give identical results to gcrma as long as you use the > internal NSB estimates for both functions. > > The new version of gcrma is 2.18.1, and should be available for installation > within the next couple of days. > > Best, > > Jim > > >>>> >>> Jin <jinxing1986 at="" gmail.com=""> wrote: >> > Hello Dr. MacDonald, >> > >> > I was wondering if you could help me with a problem. I've ran my >> > dataset through GCRMA and justGCRMA but I've gotten different >> > numerical results from both. I was under the impression that these two >> > functions were identical except justGCRMA was more memory efficient >> > because it did not have to create an AffyBatch object. I used the >> > default parameters for both meaning I did not include any parameters >> > at all when calling gcrma() and justgcrma(). >> > >> > Are they suppose to give identical numerical results? >> > >> > For example, below are some sample results. The first row is GCRMA and >> > the second row is justGCRMA. The columns are my 8 arrays. JustGCRMA >> > values are consistently higher than GCRMA values whereas I would have >> > expected similar numbers matching the top and bottom rows. >> > >> > 3.58 ? ? ? ?3.80 ? ?3.83 ? ?4.52 ? ?3.91 ? ?5.17 ? ?3.78 ? ?6.45 >> > 4.79 ? ? ? ?4.73 ? ?5.21 ? ?5.68 ? ?5.44 ? ?5.88 ? ?5.11 ? ?7.76 >> > >> > >> > >> > Thanks, >> > Jin >> > University of California, San Diego > > > Jenny Drnevich wrote: >> >> Thanks Jim, >> >> Thanks for the info. I found the question that was posted last last year >> (and responded to it to start the thread again), but I still can't find your >> response on BioC. Perhaps it accidently didn't get sent back to the list? >> >> :) >> Jenny >> >> At 04:31 PM 2/3/2010, James W. MacDonald wrote: >>> >>> Hi Jenny, >>> >>> Jenny Drnevich wrote: >>>> >>>> Hi everyone, >>>> I just spent several hours tracking down this problem, as I noticed this >>>> same discrepancy between the results of justGCRMA() and gcrma() called on >>>> the same data. There was also another post about this on Nov 4, 2008, but I >>>> couldn't find where either of them had ever been answered. >>> >>> I think I missed that one, but the question was asked again and answered >>> by me late last year. >>> >>>> HOWEVER, it appears that the discrepancy has been fixed in R 2.10.1 >>>> (gcrma 2.18.1), but it was in R 2.10.0 (gcrma 2.18.0) (examples and >>>> sessionInfos below). >>>> While I'm glad it has been "fixed", where would this have been >>>> documented? I didn't think this type of change would have been included in a >>>> minor version upgrade. What was the explanation for the original >>>> discrepancy? >>> >>> The explanation is that justGCRMA() is a function that I wrote several >>> years ago. In the intervening period the defaults for gcrma() were changed >>> without making the corresponding changes to justGCRMA(). >>> >>> I made the (unfounded) assumption that the maintainer of the gcrma >>> package (Jean Wu) was keeping justGCRMA() consistent with gcrma(), which as >>> you have seen is not true. Since this was an oversight by both me and Jean, >>> it wasn't documented anywhere. >>> >>> I have since made the two functions consistent when running under the >>> default arguments. However, there are some arguments for gcrma() that are >>> not available for justGCRMA(), so they are not consistent in all cases. >>> >>> Best, >>> >>> Jim >>> >>> >>>> Thanks, >>>> Jenny >>>> #The examples below are not reproducible because you need .CEL files - >>>> run them on any that you have. >>>> ?> library(gcrma) >>>> Loading required package: affy >>>> Loading required package: Biobase >>>> Welcome to Bioconductor >>>> ?Vignettes contain introductory material. To view, type >>>> ?'openVignette()'. To cite Bioconductor, see >>>> ?'citation("Biobase")' and for packages 'citation(pkgname)'. >>>> ?> setwd("K:/Bulla/CELfiles") >>>> ?> raw <- ReadAffy() >>>> ?> raw >>>> AffyBatch object >>>> size of arrays=834x834 features (10 kb) >>>> cdf=Rat230_2 (31099 affyids) >>>> number of samples=6 >>>> number of genes=31099 >>>> annotation=rat2302 >>>> notes= >>>> ?> >>>> ?> gcrma.1 <- gcrma(raw) >>>> Adjusting for optical effect......Done. >>>> Computing affinitiesLoading required package: AnnotationDbi >>>> .Done. >>>> Adjusting for non-specific binding......Done. >>>> Normalizing >>>> Calculating Expression >>>> ?> >>>> ?> gcrma.2 <- justGCRMA() >>>> Computing affinities.Done. >>>> Adjusting for optical effect.......Done. >>>> Adjusting for non-specific binding......Done. >>>> Normalizing >>>> Calculating Expression >>>> ?> >>>> ?> all.equal(exprs(gcrma.1),exprs(gcrma.2)) >>>> [1] "Mean relative difference: 0.03514035" >>>> ?> >>>> ?> sessionInfo() >>>> R version 2.10.0 (2009-10-26) >>>> i386-pc-mingw32 >>>> locale: >>>> [1] LC_COLLATE=English_United States.1252 ?LC_CTYPE=English_United >>>> States.1252 >>>> [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C >>>> [5] LC_TIME=English_United States.1252 >>>> attached base packages: >>>> [1] stats ? ? graphics ?grDevices utils ? ? datasets ?methods ? base >>>> other attached packages: >>>> [1] rat2302probe_2.5.0 ?AnnotationDbi_1.8.1 rat2302cdf_2.5.0 >>>> gcrma_2.18.0 >>>> [5] affy_1.24.2 ? ? ? ? Biobase_2.6.0 >>>> loaded via a namespace (and not attached): >>>> [1] affyio_1.14.0 ? ? ? ?Biostrings_2.14.3 DBI_0.2-4 >>>> IRanges_1.4.4 >>>> [5] preprocessCore_1.8.0 RSQLite_0.7-3 ? ? ? ?splines_2.10.0 >>>> tools_2.10.0 >>>> ?> >>>> #Now switch R versions but run same code.... >>>> ?> library(gcrma) >>>> Loading required package: affy >>>> Loading required package: Biobase >>>> Welcome to Bioconductor >>>> ?Vignettes contain introductory material. To view, type >>>> ?'openVignette()'. To cite Bioconductor, see >>>> ?'citation("Biobase")' and for packages 'citation(pkgname)'. >>>> ?> setwd("K:/Bulla/CELfiles") >>>> ?> raw <- ReadAffy() >>>> ?> raw >>>> AffyBatch object >>>> size of arrays=834x834 features (10 kb) >>>> cdf=Rat230_2 (31099 affyids) >>>> number of samples=6 >>>> number of genes=31099 >>>> annotation=rat2302 >>>> notes= >>>> ?> >>>> ?> gcrma.1 <- gcrma(raw) >>>> Adjusting for optical effect......Done. >>>> Computing affinitiesLoading required package: AnnotationDbi >>>> .Done. >>>> Adjusting for non-specific binding......Done. >>>> Normalizing >>>> Calculating Expression >>>> ?> >>>> ?> gcrma.2 <- justGCRMA() >>>> Computing affinities.Done. >>>> Adjusting for optical effect.......Done. >>>> Adjusting for non-specific binding......Done. >>>> Normalizing >>>> Calculating Expression >>>> ?> >>>> ?> all.equal(exprs(gcrma.1),exprs(gcrma.2)) >>>> [1] TRUE >>>> ?> >>>> ?> sessionInfo() >>>> R version 2.10.1 (2009-12-14) >>>> i386-pc-mingw32 >>>> locale: >>>> [1] LC_COLLATE=English_United States.1252 ?LC_CTYPE=English_United >>>> States.1252 >>>> [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C >>>> [5] LC_TIME=English_United States.1252 >>>> attached base packages: >>>> [1] stats ? ? graphics ?grDevices utils ? ? datasets ?methods ? base >>>> other attached packages: >>>> [1] rat2302probe_2.5.0 ?AnnotationDbi_1.8.1 rat2302cdf_2.5.0 >>>> gcrma_2.18.1 >>>> [5] affy_1.24.2 ? ? ? ? Biobase_2.6.1 >>>> loaded via a namespace (and not attached): >>>> [1] affyio_1.14.0 ? ? ? ?Biostrings_2.14.10 DBI_0.2-5 >>>> IRanges_1.4.9 >>>> [5] preprocessCore_1.8.0 RSQLite_0.8-0 ? ? ? ?splines_2.10.1 >>>> tools_2.10.1 >>>> ?> >>>> >>>> >>>> At 04:31 PM 10/28/2009, Jin wrote: >>>>> >>>>> Jerry <norn2k at="" ...=""> writes: > > Hi, > ? ?> I'm currently using gcrma >>>>> package in R to summarize probeset intensities from CEL files. Surprisingly, >>>>> > I? found that the results generated from gcrma function and justGCRMA >>>>> function are quite different. In > general,? expression values from gcrma >>>>> function? are lower and? the boxplots from gcrma are? quite > asymmetric (no >>>>> tails? on the bottom).? I attached some plots below for your information. >>>>> This confused > me as I thought that the two functions implemented similar >>>>> algorithms.? ?> ? ?> Thank you so much for your help! > ? ?> Best, > Jianjun >>>>> > > PS: The package I used is gcrma 2.12.1. I also observed similar results >>>>> on? 2.14 version. > > ? ? ? Hello, I'm running into the same problem with >>>>> gcrma and justGCRMA. They are not giving the same numeric results. Is this >>>>> an intended feature? I was under the impression that these two methods were >>>>> identical with the exception that justGCRMA was more memory efficient >>>>> because it didn't have to create an AffyBatch object. Can anyone shed some >>>>> light on this subject? Thanks, Jin >>>>> _______________________________________________ ?Bioconductor mailing list >>>>> Bioconductor at stat.math.ethz.ch >>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor Search the archives: >>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor >>>> >>>> Jenny Drnevich, Ph.D. >>>> Functional Genomics Bioinformatics Specialist >>>> W.M. Keck Center for Comparative and Functional Genomics >>>> Roy J. Carver Biotechnology Center >>>> University of Illinois, Urbana-Champaign >>>> 330 ERML >>>> 1201 W. Gregory Dr. >>>> Urbana, IL 61801 >>>> USA >>>> ph: 217-244-7355 >>>> fax: 217-265-5066 >>>> e-mail: drnevich at illinois.edu >>> >>> -- >>> James W. MacDonald, M.S. >>> Biostatistician >>> Douglas Lab >>> University of Michigan >>> Department of Human Genetics >>> 5912 Buhl >>> 1241 E. Catherine St. >>> Ann Arbor MI 48109-5618 >>> 734-615-7826 >>> ********************************************************** >>> Electronic Mail is not secure, may not be read every day, and should not >>> be used for urgent or sensitive issues >> >> Jenny Drnevich, Ph.D. >> >> Functional Genomics Bioinformatics Specialist >> W.M. Keck Center for Comparative and Functional Genomics >> Roy J. Carver Biotechnology Center >> University of Illinois, Urbana-Champaign >> >> 330 ERML >> 1201 W. Gregory Dr. >> Urbana, IL 61801 >> USA >> >> ph: 217-244-7355 >> fax: 217-265-5066 >> e-mail: drnevich at illinois.edu > > -- > James W. MacDonald, M.S. > Biostatistician > Douglas Lab > University of Michigan > Department of Human Genetics > 5912 Buhl > 1241 E. Catherine St. > Ann Arbor MI 48109-5618 > 734-615-7826 > ********************************************************** > Electronic Mail is not secure, may not be read every day, and should not be > used for urgent or sensitive issues > _______________________________________________ > Bioconductor mailing list > Bioconductor at stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor >
ADD REPLY

Login before adding your answer.

Traffic: 806 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6