dChip v li.wong() (Was: Re: warnings from li wong summary method in expresso)


marco zucchelli ▴ 320

@marco-zucchelli-1987

Last seen 9.7 years ago

An embedded and charset-unspecified text was scrubbed... Name: not available Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20070201/ 14730cc6/attachment.pl

• 663 views

ADD COMMENT • link updated 17.3 years ago by lgautier@altern.org ▴ 950 • written 17.3 years ago by marco zucchelli ▴ 320


lgautier@altern.org ▴ 950

@lgautieralternorg-747

Last seen 9.7 years ago

The implementation of the dChip-like pre-processing, split into the "invariantset" (normalization) and "liwong" (probe summary) methods, is indeed not-so-recent (more than 4 years old, if I remember correctly), and further developments in dChip were not ported back to the affy package (AFAIK).

One note I would like to add is that although the authors of dChip did not choose to release it as open source software, they have been very helpful in answering questions regarding their algorithms (my earliest memory of that is from before Bioconductor). They can probably answer better than us on what has changed in dChip.

Regards,

Laurent

> Dear Henrik,
>
> I am using
>
> eset <- expresso(myaffy, bg.correct = FALSE,
>                  normalize.method = "invariantset",
>                  pmcorrect.method = "pmonly",
>                  summary.method = "liwong")
>
> I have downloaded the latest dChip (I think it was built in Jan 2007).
>
> I haven't been looking at the source code, just at the affy vignette, where
> it is claimed that this choice of the expresso options should mimic the
> MBEI method.
>
> But as Ben has posted, it might be that since the MBEI algorithm was first
> coded in R, the dChip software has been updated several times, which
> could explain the difference in results.
>
> Regards
>
> Marco
>
> On 1/31/07, Henrik Bengtsson <hb at stat.berkeley.edu> wrote:
>> On 1/31/07, marco zucchelli <marco.bioc at gmail.com> wrote:
>> > Sorry, I put the wrong dChip table. The correct one is the following:
>> >
>> >       T2-T1 T3-T2 T4-T3 T5-T4 T6-T5 counts
>> > [1,]     0     0     0     0     0  47030
>> > [2,]     0     0     0     0    -1   2120
>> > [3,]     0     0     0     0     1   2096
>> > [4,]     0     0    -1     0     0   1114
>> > [5,]     0     0     1     0     0    721
>> > [6,]     0     0     1     0    -1    381
>> > [7,]     0     0    -1     0     1    301
>> > [8,]     0     0     0     1     0    244
>> > [9,]     0     0     0    -1     0    155
>> >
>> > There is a large difference in pattern 9 (double the counts in
>> > expresso) and in pattern 2 (pattern 3 in the expresso table) of
>> > about 20%.
>> >
>> > James,
>> >
>> > I understand your point and I personally prefer open source software...
>> >
>> > I am using my own arrays and arrays from public databases which have
>> > been analyzed with dChip. Since I use R and I found different results,
>> > I am trying to understand where the differences come from and if this
>> > can affect the biology.
>>
>> Your point is most important. It is actually not quite the case that
>> dChip is a black box. Under "Source code and command line version" on
>> http://biosun1.harvard.edu/complab/dchip/install.htm
>> it says "the latest source codes of dChip are available on request [by
>> sending an email]". Also, the source code for a version of dChip for
>> "Linux/MPI", which I assume has something in common with the Windows
>> version, is available for download (see link on the above page).
>>
>> BTW, what kind of preprocessing do you apply in your comparison? For
>> instance, both dChip and R/BioC provide quantile normalization, but they
>> use totally different algorithms (and model assumptions). To the best of
>> my understanding (from browsing the Linux dChip code), dChip uses
>> splines, whereas R/BioC uses sorting for quantile normalization. In
>> practice this means that dChip fits a smoother function, so the
>> empirical density functions of the normalized probe signals will not be
>> identical across arrays, whereas the R/BioC normalized ones will be.
>>
>> /Henrik
>>
>> > I thought someone else could be interested in sharing this
>> > information, since other people may have found themselves in the same
>> > situation.
>> >
>> > Best regards and thanks for your time!
>> >
>> > Marco
>> >
>> > On 1/31/07, marco zucchelli <marco.bioc at gmail.com> wrote:
>> > >
>> > > Laurent,
>> > >
>> > > according to the affy vignette and the MBEI paper of Li and Wong,
>> > > >10 arrays should be enough.
>> > > Anyway, I tried to process my arrays with dChip and there I get no
>> > > warnings at all.
>> > >
>> > > I also exported the expression values from dChip to a text file and
>> > > loaded them into R.
>> > >
>> > > Even if the expression values cannot be compared, I applied the same
>> > > LIMMA analysis to both the expression values from expresso and dChip,
>> > > and the results are pretty different. For example, I tried 12 arrays
>> > > in pairs of duplicates (i.e. 6 tissues) and from LIMMA I got the
>> > > following up-/down-regulated patterns (1=up, -1=down).
>> > >
>> > > dChip
>> > >
>> > >       T2-T1 T3-T2 T4-T3 T5-T4 T6-T5 counts
>> > > [1,]     0     0     0     0     0  51742
>> > > [2,]     0     0     0     0     1   1199
>> > > [3,]     0     0     0     0    -1    616
>> > > [4,]     0     0    -1     0     0    607
>> > > [5,]     0     0     1     0     0    201
>> > > [6,]     0     0     1     0    -1    101
>> > > [7,]     0     0     0     1    -1     71
>> > > [8,]     0     0    -1     0     1     27
>> > > [9,]     0     0     0     1     0     20
>> > > ....
>> > > ....
>> > >
>> > > expresso
>> > >
>> > >       T2-T1 T3-T2 T4-T3 T5-T4 T6-T5 counts
>> > > [1,]     0     0     0     0     0  46464
>> > > [2,]     0     0     0     0     1   2299
>> > > [3,]     0     0     0     0    -1   1715
>> > > [4,]     0     0    -1     0     0   1164
>> > > [5,]     0     0     1     0     0    691
>> > > [6,]     0     0     0     1     0    409
>> > > [7,]     0     0    -1     0     1    344
>> > > [8,]     0     0     1     0    -1    340
>> > > [9,]     0     0     0    -1     0    316
>> > > ....
>> > > ....
>> > > ....
>> > >
>> > > I would have expected smaller differences, or am I out fishing?
>> > >
>> > > I wonder whether you or someone else on the mailing list has any
>> > > experience of this ...
>> > >
>> > > Marco
>> > >
>> > > On 1/31/07, lgautier at altern.org <lgautier at altern.org> wrote:
>> > > >
>> > > > Marco,
>> > > >
>> > > > If I remember correctly, the dChip authors were talking more of
>> > > > having at least 25 chips. Obviously the number, whether 10, 16 or
>> > > > 25, is not a hard threshold, as convergence depends on the actual
>> > > > numerical values.
>> > > >
>> > > > Finding out which probes are failing is possible. A quick-but-dirty
>> > > > way is to edit only the function "generateExprVal.method.liwong".
>> > > >
>> > > > Try:
>> > > > generateExprVal.method.liwong2 <- edit(generateExprVal.method.liwong)
>> > > >
>> > > > and edit the code as:
>> > > >
>> > > > probes <- t(probes)
>> > > > if (ncol(probes) == 1) {
>> > > >   warning("method liwong unsuitable when only one probe pair")
>> > > >   list(exprs=as.vector(probes),se.exprs=rep(NA,length(probes)))
>> > > > }
>> > > > else {
>> > > >   tmp <- fit.li.wong(probes, ...)
>> > > >   if ( !tmp$convergence1 & !tmp$convergence2) {
>> > > >     id <- get("id", envir= parent.frame(3))
>> > > >     print(id)
>> > > >   }
>> > > >   list(exprs=tmp$theta,se.exprs=tmp$sigma.theta)
>> > > > }
>> > > >
>> > > > (the only change is near the end).
>> > > >
>> > > > Now you can use the summary method "liwong2" in place of "liwong".
>> > > >
>> > > > You can hack this to your specific need (for example, to store the
>> > > > 'id' in a variable in your global workspace).
>> > > >
>> > > > Hoping this helps,
>> > > >
>> > > > Laurent
>> > > >
>> > > > > James,
>> > > > >
>> > > > > I am actually using 16 hgu133plus2 arrays, so I find it a little
>> > > > > strange.
>> > > > >
>> > > > > Is there any way to know which probes failed (and how many in
>> > > > > total)?
>> > > > >
>> > > > > Does anybody know if dChip is still freely available? It seems
>> > > > > there is no link anymore on the home page... I would like to
>> > > > > cross-check whether the warnings come up there as well.
>> > > > >
>> > > > > Cheers
>> > > > >
>> > > > > Marco
>> > > > >
>> > > > > On 1/25/07, James W. MacDonald <jmacdon at med.umich.edu> wrote:
>> > > > >>
>> > > > >> Hi Marco,
>> > > > >>
>> > > > >> marco zucchelli wrote:
>> > > > >> > Hi,
>> > > > >> >
>> > > > >> > I am using the dChip method in R to normalize and summarize my
>> > > > >> > microarrays.
>> > > > >> > I tried several times and I always get warnings. What does
>> > > > >> > this mean? Are the expression levels returned reliable anyway?
>> > > > >>
>> > > > >> If you don't have enough samples, the LiWong method won't
>> > > > >> converge for some of your probesets. Lack of convergence is
>> > > > >> usually not a good thing. I think the recommendation for using
>> > > > >> LiWong is to have at least 10 or 15 samples.
>> > > > >>
>> > > > >> You might consider using a different method to summarize your
>> > > > >> data.
>> > > > >>
>> > > > >> Best,
>> > > > >>
>> > > > >> Jim
>> > > > >>
>> > > > >> >
>> > > > >> > I use R 2.4.1 on Linux Red Hat
>> > > > >> >
>> > > > >> > Marco
>> > > > >> >
>> > > > >> > eset <- expresso(hum.brain.embryo, bg.correct = FALSE,
>> > > > >> >                  normalize.method = "invariantset",
>> > > > >> >                  pmcorrect.method = "pmonly",
>> > > > >> >                  summary.method = "liwong")
>> > > > >> > normalization: invariantset
>> > > > >> > PM/MM correction : pmonly
>> > > > >> > expression values: liwong
>> > > > >> > normalizing...done.
>> > > > >> > 54675 ids to be processed
>> > > > >> > |                    |
>> > > > >> > |####################|
>> > > > >> > There were 50 or more warnings (use warnings() to see the
>> > > > >> > first 50)
>> > > > >> >
>> > > > >> > Warning messages:
>> > > > >> > 1: No convergence achieved in outlier loop
>> > > > >> >  in: fit.li.wong(probes, ...)
>> > > > >> > 2: No convergence achieved in outlier loop
>> > > > >> >  in: fit.li.wong(probes, ...)
>> > > > >> > 3: No convergence achieved in outlier loop
>> > > > >> >  in: fit.li.wong(probes, ...)
>> > > > >> > 4: No convergence achieved in outlier loop
>> > > > >> >  in: fit.li.wong(probes, ...)
>> > > > >> >
>> > > > >> > [[alternative HTML version deleted]]
>> > > > >> >
>> > > > >> > _______________________________________________
>> > > > >> > Bioconductor mailing list
>> > > > >> > Bioconductor at stat.math.ethz.ch
>> > > > >> > https://stat.ethz.ch/mailman/listinfo/bioconductor
>> > > > >> > Search the archives:
>> > > > >> > http://news.gmane.org/gmane.science.biology.informatics.conductor
>> > > > >>
>> > > > >> --
>> > > > >> James W. MacDonald, M.S.
>> > > > >> Biostatistician
>> > > > >> Affymetrix and cDNA Microarray Core
>> > > > >> University of Michigan Cancer Center
>> > > > >> 1500 E. Medical Center Drive
>> > > > >> 7410 CCGC
>> > > > >> Ann Arbor MI 48109
>> > > > >> 734-647-5623
>> > > > >>
>> > > > >> **********************************************************
>> > > > >> Electronic Mail is not secure, may not be read every day, and
>> > > > >> should not be used for urgent or sensitive issues.
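
Henrik's remark in the thread above, that sorting-based quantile normalization leaves every array with the same empirical distribution, can be illustrated with a minimal base-R sketch. This is not the affy or dChip implementation: the function name `quantile_normalize` and the toy matrix are made up for illustration, and ties are simply broken by position.

```r
# Minimal sketch of quantile normalization by sorting (base R only).
# Hypothetical helper, not the routine used by affy or dChip.
quantile_normalize <- function(x) {
  # x: numeric matrix, rows = probes, columns = arrays (no missing values)
  sorted <- apply(x, 2, sort)              # sort each array independently
  reference <- rowMeans(sorted)            # mean of the k-th smallest values
  ranks <- apply(x, 2, rank, ties.method = "first")
  normalized <- apply(ranks, 2, function(r) reference[r])
  dimnames(normalized) <- dimnames(x)
  normalized
}

# Toy data: three "arrays" with deliberately different distributions
set.seed(1)
toy <- cbind(a = rexp(8), b = rnorm(8, mean = 5), c = runif(8, 0, 10))
qn <- quantile_normalize(toy)

# After normalization every column holds the same set of values,
# so the empirical distributions are identical across arrays.
all_equal_sorted <- all(apply(qn, 2, function(col) {
  isTRUE(all.equal(sort(col), sort(qn[, 1])))
}))
print(all_equal_sorted)
```

A spline-based fit, as Henrik describes for dChip, would only bring the distributions close to a reference rather than making them identical, which is one plausible source of the differences discussed in this thread. For real data, a packaged routine such as `normalize.quantiles` from the preprocessCore package is the better choice.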

ADD COMMENT • link 17.3 years ago lgautier@altern.org ▴ 950
