OSX location of bioconductor functions


Gordon Smyth 50k

@gordon-smyth

Last seen 2 hours ago

WEHI, Melbourne, Australia

Dear Bobby,

The reason why the column headings output from topTable() have been changed is to make them more self-explanatory. I've answered the question "what do M and A mean in the top table?" so many times over the last few years that I wanted to try to head off misunderstandings on the part of new users.

I originally called the coefficient column output by topTable "M" by analogy with the M-values from two colour arrays. But this leads to the confusion that "M" in an MAList data object and "M" in the output from topTable are not the same thing. The term "M-value" correctly refers to the log-ratio of two channels, usually for a single two-colour array, whereas the coefficient column from topTable() is more general and is not necessarily a log-ratio. So I thought that "logFC" for log-fold-change would be clearer. Similarly I thought that "AveExpr" would be more informative than simply "A" to represent the average log-expression level of that probe. An alternative name would have been "Amean" because that's what the corresponding component of the fitted model object is called.

I've kept changes to the limma API to an absolute minimum over the last 4 years, but I thought a change with toptable would be manageable because most people extending limma would program from the fitted model object directly rather than using the output from topTable. It's not too late to change these names back for the next Bioconductor release in a couple of months but I would want to hear quite a few more opinions before doing so. So far, yours is the only complaint.

I agree that a change in the API in the current release of Bioconductor is undesirable. Actually I haven't changed the Bioconductor release version of limma, only the versions on BioC devel and on CRAN. These are where I test out changes for the next Bioconductor release. It would appear that getBioC() has installed for you the CRAN version of limma rather than the Bioconductor release version. I think I know why getBioC() does this, but I'm going to contact the Bioconductor maintainers to ask them to avoid this in the future.

On the other hand, it isn't realistic to expect that the API will remain the same forever. The user interface for both R and the packages change gradually as the software is developed and improved. There have been many changes to R over the last four years that limma users have been insulated from but the package author has had to deal with!

Kasper has already explained that you can't hack the installed version of an R package. Indeed that would be highly undesirable. The only way that you could change the "true" toptable would be to download the source code for limma, make your own version of the package, then build and install it. You would then be responsible for maintaining your own personal version of the package forever after. I think that would be a very bad design decision :)

Is it so important to you that the column names from toptable stay the same? Would it be difficult to use an editor to do a global change of $M to $logFC in your script? Alternatively it is very easy to use names() to reset or edit the column names of a data.frame. You could redefine toptable on the fly by

    toptable <- function(...) {
      tab <- limma::toptable(...)
      names(tab)[names(tab) == "logFC"] <- "M"
      tab
    }

Best wishes
Gordon

At 10:00 PM 9/03/2007, bioconductor-request at stat.math.ethz.ch wrote:
>Date: Thu, 8 Mar 2007 18:24:06 -0500
>From: Bobby Prill <rprill at jhu.edu>
>Subject: [BioC] OSX location of bioconductor functions
>To: bioconductor at stat.math.ethz.ch
>
>Can someone tell me where toptable() code resides on an OSX
>installation?
>
>I just upgraded my R and Bioconductor. Apparently, some rows were
>renamed in toptable(). I would like to change the row called "logFC"
>back to "M".
>
>I searched my system and can not find where any of the limma
>functions are defined. Limma seems to be installed in:
>
>/Library/Frameworks/R.framework/Versions/2.4/Resources/library/limma/
>
>However I do not see any actual functions defined there.
>
>I installed using the command: getBioC(). Perhaps that installed
>"binary" rather than "source" version of limma?
>
>I don't want to create a new function called my.toptable(). I just
>want all of my scripts to simply work again with a single edit of the
>true toptable().
>
>As an aside, I can't imagine why the column names in ANY function
>would be changed. These column names are essentially the external
>interface to the function. It seems like a very bad design decision
>to just one day change "M" to "logFC."
>
>I appreciate the help. This is a great user community.
>
>- Bobby
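Gordon's suggestion of resetting column names with names() can be tried out without limma installed. What follows is only a minimal sketch in base R: the data frame `tab` and its values are made up for illustration and merely imitate the shape of topTable() output, and `rename_cols` is a hypothetical helper, not a limma function.

```r
# Mock table imitating the newer topTable() column names.
# The values are invented for illustration only.
tab <- data.frame(
  ID      = c("probe1", "probe2", "probe3"),
  logFC   = c(1.2, -0.8, 0.1),
  AveExpr = c(7.5, 9.1, 6.3),
  stringsAsFactors = FALSE
)

# Hypothetical helper: rename columns according to a named vector
# of the form c(new_name = "old_name"); unmatched columns are left alone.
rename_cols <- function(df, map) {
  for (new in names(map)) {
    names(df)[names(df) == map[[new]]] <- new
  }
  df
}

# Restore the old-style headings so that scripts using $M and $A keep working.
old_style <- rename_cols(tab, c(M = "logFC", A = "AveExpr"))
print(names(old_style))
```

Keeping the renaming in one small function means the old names live in a single place in a script, which is easier to drop later than edits scattered through the code.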

limma limma • 873 views

ADD COMMENT • link 17.1 years ago Gordon Smyth 50k


Wish I'd read this exchange before making my own long-winded reply :)

BTW, fit$Amean is always just a vector so the test ncol(fit$Ameans) shouldn't be necessary.

Cheers
Gordon

>[BioC] OSX location of bioconductor functions
>James W. MacDonald jmacdon at med.umich.edu
>Fri Mar 9 18:58:02 CET 2007
>
>Hi Bobby,
>
>Yes there is a much better way to get what you want. Note that the
>MArrayLM object returned by eBayes() is just a list. The M values are
>held in the 'coefficients' list member (or whatever the technical term
>is), and the A values are in the 'Ameans' list member.
>
>So to get what you want, you would do this:
>
>M <- fit$coefficients[,1] ## we get coef = 1
>A <- if(ncol(fit$Ameans) > 1) rowMeans(fit$Ameans) else fit$Ameans
>
>Best,
>
>Jim
>
>Bobby Prill wrote:
> > Jim,
> >
> > I understand what you're saying. My problem is more with the "R way of
> > doing things," which takes some getting used to.
> >
> > Perhaps you can recommend a better way for me to do the following: I
> > want the M and A values for every probe on a chip. This is what I do,
> > which has always looked ugly to me:
> >
> > ## the number of probes
> > N = dim(exprs(eset))[1]
> >
> > ## ask topTable to return for every probe
> > T = topTable(fit, coef=1, adjust="fdr", n=N)
> >
> > ## put back in original probe order
> > ord = order(as.integer(rownames(T)))
> > T = T[ord,]
> >
> > T$A
> > T$M
> >
> > There must be a better way?
> >
> > - Bobby
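Jim's point that the fitted model object is just a list can be illustrated with a mock object. This is a sketch under stated assumptions: `fit` below is a plain list built by hand with invented values, not a real MArrayLM object, and the average-expression component is spelled `Amean`, following Gordon's description of the fitted model object earlier in the thread (the quoted code spells it `Ameans`).

```r
# Hand-built stand-in for the list returned by eBayes(); a real
# MArrayLM object has many more components. Values are invented.
fit <- list(
  coefficients = matrix(
    c(1.2, -0.8, 0.1,
      0.3,  0.0, 2.4),
    ncol = 2,
    dimnames = list(c("probe1", "probe2", "probe3"), c("coef1", "coef2"))
  ),
  Amean = c(probe1 = 7.5, probe2 = 9.1, probe3 = 6.3)
)

# Pull the values out directly; rows stay in their original probe order,
# so no sorting or re-ordering step is needed.
M <- fit$coefficients[, 1]  # first coefficient, as with coef = 1
A <- fit$Amean

ma <- data.frame(M = M, A = A)
print(ma)
```

Compared with asking topTable() for every probe and then re-sorting, reading the components directly avoids the round trip through a ranked table.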

ADD COMMENT • link 17.1 years ago Gordon Smyth 50k


Gordon Smyth wrote:
> Wish I'd read this exchange before making my own long-winded reply :)
>
> BTW, fit$Amean is always just a vector so the test ncol(fit$Ameans)
> shouldn't be necessary.

I cribbed that out of topTable() (I believe). I didn't think Ameans would ever be a matrix, but defensive programming and all that...

Best,

Jim

>
> Cheers
> Gordon

--
James W. MacDonald
University of Michigan
Affymetrix and cDNA Microarray Core
1500 E Medical Center Drive
Ann Arbor MI 48109
734-647-5623

**********************************************************
Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues.
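On the defensive test discussed here: in base R, ncol() returns NULL for a plain vector, so a condition of the form ncol(x) > 1 has length zero and if() stops with an error instead of falling through to the else branch. A sketch of a check that accepts either shape uses is.matrix(); `average_expr` is a hypothetical helper name and the inputs are made up for illustration.

```r
# Hypothetical helper: return per-probe averages whether the input
# is already a vector or is a matrix with one column per array.
average_expr <- function(x) {
  if (is.matrix(x)) rowMeans(x) else x
}

a_vec <- c(7.5, 9.1, 6.3)          # vector case
a_mat <- cbind(a_vec, a_vec + 1)   # matrix case, two columns

print(average_expr(a_vec))
print(average_expr(a_mat))

# ncol() on a vector is NULL, which is why a test on ncol(x) > 1 fails there.
print(is.null(ncol(a_vec)))
```

NCOL() would be another base R option, since it treats a vector as a one-column matrix.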

ADD REPLY • link 17.1 years ago James W. MacDonald 65k


Gordon and Jim,

Thanks for the education on Bioconductor and Limma. I'm very grateful that this software is actively developed, and works. I can see that most of my frustration stems from not knowing how to efficiently pull results out of the various R objects.

- Bobby

On Mar 10, 2007, at 4:24 PM, James W. MacDonald wrote:
> I cribbed that out of topTable() (I believe). I didn't think Ameans
> would ever be a matrix, but defensive programming and all that...
>
> Best,
>
> Jim
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

ADD REPLY • link 17.1 years ago Bobby Prill ▴ 60
