Question: limma: cannot repoduce older analysis
0
10.8 years ago by
Philipp Pagel190
Philipp Pagel190 wrote:
Dear list, About 3 months ago I analyzed a simple two-color array experiment and got results that looked quite reasonable and biologically sound. For some reason I wanted to repeat the analysis and add a few plots that I had not included before. When I got VERY different results in my toptable, I assumed I must have changed something in my approach so I simply ran my original analysis script again and found I was unable to reproduce the original toptable. I have spent quite some time trying to debug the problem and have to say that I am stuck. I have the original data files and the original R-script. The normalization is 100% reproducible - i.e. the normalized MALists seem to be identical. Yet when searching for differential expression I get totally different results. The only difference between the two runs lies in updates to R and limma in the meantime. Unfortunately, I did not record which version of R, limma etc. I had used, originally. My current environment is this: > sessionInfo() R version 2.7.1 (2008-06-23) x86_64-pc-linux-gnu locale: LC_CTYPE=en_US.utf8;LC_NUMERIC=C;LC_TIME=en_US.utf8;LC_COLLATE =en_US.utf8;LC_MONETARY=C;LC_MESSAGES=en_US.utf8;LC_PAPER=en_US.utf8;L C_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.utf8;LC_IDEN TIFICATION=C attached base packages: [1] splines stats graphics utils datasets grDevices methods base other attached packages: [1] statmod_1.3.6 MASS_7.2-42 xtable_1.5-2 limma_2.14.2 lattice_0.17-10 [6] cairoDevice_2.8 loaded via a namespace (and not attached): [1] grid_2.7.1 tools_2.7.1 My search for differential expression seems pretty standard to me: MA$design <- modelMatrix(targets, ref="control") # flag out controls etc. MA$weights[MA$genes$Status != 'miRNA', ] = 0.0 # sort spots by ID to put replicates next to each other MA2 <- MA[order(MA$genes$ID), ] dupfit <- duplicateCorrelation(MA2, ndups=4) fit <- lmFit(MA2, ndups=4, correlation=dupfit$consensus) fit <- eBayes(fit) tt <- topTable(fit, number=100) I have siftet through the changelog of limma hoping to find a hint about some changed default or behaviour in lmFit or eBayes but saw nothing that seemed to expain my problem. Any hints apprechiated. cu Philipp -- Dr. Philipp Pagel Lehrstuhl f?r Genomorientierte Bioinformatik Technische Universit?t M?nchen Wissenschaftszentrum Weihenstephan 85350 Freising, Germany http://mips.gsf.de/staff/pagel normalization limma • 470 views ADD COMMENTlink modified 10.8 years ago by Wolfgang Huber13k • written 10.8 years ago by Philipp Pagel190 Answer: limma: cannot repoduce older analysis 0 10.8 years ago by EMBL European Molecular Biology Laboratory Wolfgang Huber13k wrote: Hi Philipp the quickest way to figure out what's going on may be to download and install an older version of R (as good as you can remember what it was), run biocLite("limma") and then your script and see exactly at what point your results start to differ. Is "targets" the same? (esp if you imported it from text file via read.table or the like, there can be surprises e.g. to do with locales/encoding, special characters...) Best wishes Wolfgang ------------------------------------------------------------------ Wolfgang Huber EBI/EMBL Cambridge UK http://www.ebi.ac.uk/huber 24/07/2008 11:23 Philipp Pagel scripsit > Dear list, > > About 3 months ago I analyzed a simple two-color array experiment and got > results that looked quite reasonable and biologically sound. For some reason I > wanted to repeat the analysis and add a few plots that I had not included > before. > > When I got VERY different results in my toptable, I assumed I must have > changed something in my approach so I simply ran my original analysis > script again and found I was unable to reproduce the original toptable. > I have spent quite some time trying to debug the problem and have to say > that I am stuck. I have the original data files and the original > R-script. The normalization is 100% reproducible - i.e. the normalized > MALists seem to be identical. Yet when searching for differential > expression I get totally different results. > > The only difference between the two runs lies in updates to R and limma in > the meantime. Unfortunately, I did not record which version of R, limma etc. I > had used, originally. My current environment is this: > > > sessionInfo() > R version 2.7.1 (2008-06-23) > x86_64-pc-linux-gnu > > locale: > LC_CTYPE=en_US.utf8;LC_NUMERIC=C;LC_TIME=en_US.utf8;LC_COLLATE =en_US.utf8;LC_MONETARY=C;LC_MESSAGES=en_US.utf8;LC_PAPER=en_US.utf8;L C_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.utf8;LC_IDEN TIFICATION=C > > attached base packages: > [1] splines stats graphics utils datasets grDevices methods base > > other attached packages: > [1] statmod_1.3.6 MASS_7.2-42 xtable_1.5-2 limma_2.14.2 lattice_0.17-10 > [6] cairoDevice_2.8 > > loaded via a namespace (and not attached): > [1] grid_2.7.1 tools_2.7.1 > > > My search for differential expression seems pretty standard to me: > > MA$design <- modelMatrix(targets, ref="control") > # flag out controls etc. > MA$weights[MA$genes$Status != 'miRNA', ] = 0.0 > # sort spots by ID to put replicates next to each other > MA2 <- MA[order(MA$genes$ID), ] > > dupfit <- duplicateCorrelation(MA2, ndups=4) > fit <- lmFit(MA2, ndups=4, correlation=dupfit$consensus) > fit <- eBayes(fit) > tt <- topTable(fit, number=100) > > I have siftet through the changelog of limma hoping to find a hint about > some changed default or behaviour in lmFit or eBayes but saw nothing > that seemed to expain my problem. > > Any hints apprechiated. > > cu > Philipp >
Additional to the points mentioned, you can check the log of changes to limma, e.g. changeLog(n=200) To determine what functions may have changed in the time period. When comparing your analysis from a reproducible research point of view if you have any old objects stored (saved R session or objects dumped to disk) you can using the functions "identical" and "all.equal" to examine where analysis might deviate when trying to reproduce it. Marcus On 24/7/08 10:41 PM, "Wolfgang Huber" <huber at="" ebi.ac.uk=""> wrote: > Hi Philipp > > the quickest way to figure out what's going on may be to download and > install an older version of R (as good as you can remember what it was), > run > biocLite("limma") > and then your script and see exactly at what point your results start to > differ. > > Is "targets" the same? (esp if you imported it from text file via > read.table or the like, there can be surprises e.g. to do with > locales/encoding, special characters...) > > Best wishes > Wolfgang > > ------------------------------------------------------------------ > Wolfgang Huber EBI/EMBL Cambridge UK http://www.ebi.ac.uk/huber > > > 24/07/2008 11:23 Philipp Pagel scripsit >> Dear list, >> >> About 3 months ago I analyzed a simple two-color array experiment and got >> results that looked quite reasonable and biologically sound. For some reason >> I >> wanted to repeat the analysis and add a few plots that I had not included >> before. >> >> When I got VERY different results in my toptable, I assumed I must have >> changed something in my approach so I simply ran my original analysis >> script again and found I was unable to reproduce the original toptable. >> I have spent quite some time trying to debug the problem and have to say >> that I am stuck. I have the original data files and the original >> R-script. The normalization is 100% reproducible - i.e. the normalized >> MALists seem to be identical. Yet when searching for differential >> expression I get totally different results. >> >> The only difference between the two runs lies in updates to R and limma in >> the meantime. Unfortunately, I did not record which version of R, limma etc. >> I >> had used, originally. My current environment is this: >> >>> sessionInfo() >> R version 2.7.1 (2008-06-23) >> x86_64-pc-linux-gnu >> >> locale: >> LC_CTYPE=en_US.utf8;LC_NUMERIC=C;LC_TIME=en_US.utf8;LC_COLLATE=en_U S.utf8;LC_ >> MONETARY=C;LC_MESSAGES=en_US.utf8;LC_PAPER=en_US.utf8;LC_NAME=C;LC_ ADDRESS=C; >> LC_TELEPHONE=C;LC_MEASUREMENT=en_US.utf8;LC_IDENTIFICATION=C >> >> attached base packages: >> [1] splines stats graphics utils datasets grDevices methods >> base >> >> other attached packages: >> [1] statmod_1.3.6 MASS_7.2-42 xtable_1.5-2 limma_2.14.2 >> lattice_0.17-10 >> [6] cairoDevice_2.8 >> >> loaded via a namespace (and not attached): >> [1] grid_2.7.1 tools_2.7.1 >> >> >> My search for differential expression seems pretty standard to me: >> >> MA$design <- modelMatrix(targets, ref="control") >> # flag out controls etc. >> MA$weights[MA$genes$Status != 'miRNA', ] = 0.0 >> # sort spots by ID to put replicates next to each other >> MA2 <- MA[order(MA$genes$ID), ] >> >> dupfit <- duplicateCorrelation(MA2, ndups=4) >> fit <- lmFit(MA2, ndups=4, correlation=dupfit\$consensus) >> fit <- eBayes(fit) >> tt <- topTable(fit, number=100) >> >> I have siftet through the changelog of limma hoping to find a hint about >> some changed default or behaviour in lmFit or eBayes but saw nothing >> that seemed to expain my problem. >> >> Any hints apprechiated. >> >> cu >> Philipp >> > > _______________________________________________ > Bioconductor mailing list > Bioconductor at stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor The contents of this e-mail are privileged and/or confid...{{dropped:5}}