limma: cannot repoduce older analysis
1
0
Entering edit mode
Philipp Pagel ▴ 190
@philipp-pagel-2810
Last seen 10.2 years ago
Dear list, About 3 months ago I analyzed a simple two-color array experiment and got results that looked quite reasonable and biologically sound. For some reason I wanted to repeat the analysis and add a few plots that I had not included before. When I got VERY different results in my toptable, I assumed I must have changed something in my approach so I simply ran my original analysis script again and found I was unable to reproduce the original toptable. I have spent quite some time trying to debug the problem and have to say that I am stuck. I have the original data files and the original R-script. The normalization is 100% reproducible - i.e. the normalized MALists seem to be identical. Yet when searching for differential expression I get totally different results. The only difference between the two runs lies in updates to R and limma in the meantime. Unfortunately, I did not record which version of R, limma etc. I had used, originally. My current environment is this: > sessionInfo() R version 2.7.1 (2008-06-23) x86_64-pc-linux-gnu locale: LC_CTYPE=en_US.utf8;LC_NUMERIC=C;LC_TIME=en_US.utf8;LC_COLLATE =en_US.utf8;LC_MONETARY=C;LC_MESSAGES=en_US.utf8;LC_PAPER=en_US.utf8;L C_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.utf8;LC_IDEN TIFICATION=C attached base packages: [1] splines stats graphics utils datasets grDevices methods base other attached packages: [1] statmod_1.3.6 MASS_7.2-42 xtable_1.5-2 limma_2.14.2 lattice_0.17-10 [6] cairoDevice_2.8 loaded via a namespace (and not attached): [1] grid_2.7.1 tools_2.7.1 My search for differential expression seems pretty standard to me: MA$design <- modelMatrix(targets, ref="control") # flag out controls etc. MA$weights[MA$genes$Status != 'miRNA', ] = 0.0 # sort spots by ID to put replicates next to each other MA2 <- MA[order(MA$genes$ID), ] dupfit <- duplicateCorrelation(MA2, ndups=4) fit <- lmFit(MA2, ndups=4, correlation=dupfit$consensus) fit <- eBayes(fit) tt <- topTable(fit, number=100) I have siftet through the changelog of limma hoping to find a hint about some changed default or behaviour in lmFit or eBayes but saw nothing that seemed to expain my problem. Any hints apprechiated. cu Philipp -- Dr. Philipp Pagel Lehrstuhl f?r Genomorientierte Bioinformatik Technische Universit?t M?nchen Wissenschaftszentrum Weihenstephan 85350 Freising, Germany http://mips.gsf.de/staff/pagel
Normalization limma Normalization limma • 1.4k views
ADD COMMENT
0
Entering edit mode
@wolfgang-huber-3550
Last seen 3 months ago
EMBL European Molecular Biology Laborat…
Hi Philipp the quickest way to figure out what's going on may be to download and install an older version of R (as good as you can remember what it was), run biocLite("limma") and then your script and see exactly at what point your results start to differ. Is "targets" the same? (esp if you imported it from text file via read.table or the like, there can be surprises e.g. to do with locales/encoding, special characters...) Best wishes Wolfgang ------------------------------------------------------------------ Wolfgang Huber EBI/EMBL Cambridge UK http://www.ebi.ac.uk/huber 24/07/2008 11:23 Philipp Pagel scripsit > Dear list, > > About 3 months ago I analyzed a simple two-color array experiment and got > results that looked quite reasonable and biologically sound. For some reason I > wanted to repeat the analysis and add a few plots that I had not included > before. > > When I got VERY different results in my toptable, I assumed I must have > changed something in my approach so I simply ran my original analysis > script again and found I was unable to reproduce the original toptable. > I have spent quite some time trying to debug the problem and have to say > that I am stuck. I have the original data files and the original > R-script. The normalization is 100% reproducible - i.e. the normalized > MALists seem to be identical. Yet when searching for differential > expression I get totally different results. > > The only difference between the two runs lies in updates to R and limma in > the meantime. Unfortunately, I did not record which version of R, limma etc. I > had used, originally. My current environment is this: > > > sessionInfo() > R version 2.7.1 (2008-06-23) > x86_64-pc-linux-gnu > > locale: > LC_CTYPE=en_US.utf8;LC_NUMERIC=C;LC_TIME=en_US.utf8;LC_COLLATE =en_US.utf8;LC_MONETARY=C;LC_MESSAGES=en_US.utf8;LC_PAPER=en_US.utf8;L C_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.utf8;LC_IDEN TIFICATION=C > > attached base packages: > [1] splines stats graphics utils datasets grDevices methods base > > other attached packages: > [1] statmod_1.3.6 MASS_7.2-42 xtable_1.5-2 limma_2.14.2 lattice_0.17-10 > [6] cairoDevice_2.8 > > loaded via a namespace (and not attached): > [1] grid_2.7.1 tools_2.7.1 > > > My search for differential expression seems pretty standard to me: > > MA$design <- modelMatrix(targets, ref="control") > # flag out controls etc. > MA$weights[MA$genes$Status != 'miRNA', ] = 0.0 > # sort spots by ID to put replicates next to each other > MA2 <- MA[order(MA$genes$ID), ] > > dupfit <- duplicateCorrelation(MA2, ndups=4) > fit <- lmFit(MA2, ndups=4, correlation=dupfit$consensus) > fit <- eBayes(fit) > tt <- topTable(fit, number=100) > > I have siftet through the changelog of limma hoping to find a hint about > some changed default or behaviour in lmFit or eBayes but saw nothing > that seemed to expain my problem. > > Any hints apprechiated. > > cu > Philipp >
ADD COMMENT
0
Entering edit mode
Hi Wolfgang, > the quickest way to figure out what's going on may be to download and > install an older version of R (as good as you can remember what it was), > run > biocLite("limma") > and then your script and see exactly at what point your results start to > differ. Good advice - thanks! I tried R 2.6.2 / limma_2.12.0 and was able to nail down the difference to lmFit() which looks like it has undergone a major re-write somewhere between 2.12.0 and 2.14.2. So now I'll spend some time studying the source code - maybe I can figure out what exactly causes the change. If anyone else has experienced a similar effect I'd be interested to hear about it... > Is "targets" the same? Yes, it is. cu Philipp -- Dr. Philipp Pagel Lehrstuhl f?r Genomorientierte Bioinformatik Technische Universit?t M?nchen Wissenschaftszentrum Weihenstephan 85350 Freising, Germany http://mips.gsf.de/staff/pagel
ADD REPLY
0
Entering edit mode
Additional to the points mentioned, you can check the log of changes to limma, e.g. changeLog(n=200) To determine what functions may have changed in the time period. When comparing your analysis from a reproducible research point of view if you have any old objects stored (saved R session or objects dumped to disk) you can using the functions "identical" and "all.equal" to examine where analysis might deviate when trying to reproduce it. Marcus On 24/7/08 10:41 PM, "Wolfgang Huber" <huber at="" ebi.ac.uk=""> wrote: > Hi Philipp > > the quickest way to figure out what's going on may be to download and > install an older version of R (as good as you can remember what it was), > run > biocLite("limma") > and then your script and see exactly at what point your results start to > differ. > > Is "targets" the same? (esp if you imported it from text file via > read.table or the like, there can be surprises e.g. to do with > locales/encoding, special characters...) > > Best wishes > Wolfgang > > ------------------------------------------------------------------ > Wolfgang Huber EBI/EMBL Cambridge UK http://www.ebi.ac.uk/huber > > > 24/07/2008 11:23 Philipp Pagel scripsit >> Dear list, >> >> About 3 months ago I analyzed a simple two-color array experiment and got >> results that looked quite reasonable and biologically sound. For some reason >> I >> wanted to repeat the analysis and add a few plots that I had not included >> before. >> >> When I got VERY different results in my toptable, I assumed I must have >> changed something in my approach so I simply ran my original analysis >> script again and found I was unable to reproduce the original toptable. >> I have spent quite some time trying to debug the problem and have to say >> that I am stuck. I have the original data files and the original >> R-script. The normalization is 100% reproducible - i.e. the normalized >> MALists seem to be identical. Yet when searching for differential >> expression I get totally different results. >> >> The only difference between the two runs lies in updates to R and limma in >> the meantime. Unfortunately, I did not record which version of R, limma etc. >> I >> had used, originally. My current environment is this: >> >>> sessionInfo() >> R version 2.7.1 (2008-06-23) >> x86_64-pc-linux-gnu >> >> locale: >> LC_CTYPE=en_US.utf8;LC_NUMERIC=C;LC_TIME=en_US.utf8;LC_COLLATE=en_U S.utf8;LC_ >> MONETARY=C;LC_MESSAGES=en_US.utf8;LC_PAPER=en_US.utf8;LC_NAME=C;LC_ ADDRESS=C; >> LC_TELEPHONE=C;LC_MEASUREMENT=en_US.utf8;LC_IDENTIFICATION=C >> >> attached base packages: >> [1] splines stats graphics utils datasets grDevices methods >> base >> >> other attached packages: >> [1] statmod_1.3.6 MASS_7.2-42 xtable_1.5-2 limma_2.14.2 >> lattice_0.17-10 >> [6] cairoDevice_2.8 >> >> loaded via a namespace (and not attached): >> [1] grid_2.7.1 tools_2.7.1 >> >> >> My search for differential expression seems pretty standard to me: >> >> MA$design <- modelMatrix(targets, ref="control") >> # flag out controls etc. >> MA$weights[MA$genes$Status != 'miRNA', ] = 0.0 >> # sort spots by ID to put replicates next to each other >> MA2 <- MA[order(MA$genes$ID), ] >> >> dupfit <- duplicateCorrelation(MA2, ndups=4) >> fit <- lmFit(MA2, ndups=4, correlation=dupfit$consensus) >> fit <- eBayes(fit) >> tt <- topTable(fit, number=100) >> >> I have siftet through the changelog of limma hoping to find a hint about >> some changed default or behaviour in lmFit or eBayes but saw nothing >> that seemed to expain my problem. >> >> Any hints apprechiated. >> >> cu >> Philipp >> > > _______________________________________________ > Bioconductor mailing list > Bioconductor at stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor The contents of this e-mail are privileged and/or confid...{{dropped:5}}
ADD REPLY

Login before adding your answer.

Traffic: 494 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6