Unexpected results using limma with numerical factor - regression intercept?
@gordon-smyth
WEHI, Melbourne, Australia
At 11:11 PM 26/08/2004, Matthew Hannah wrote:

>Sorry for getting things confused. I'd like to ask a statistician but we
>don't really have one :( But hopefully I'm improving...
>
>My mistakes were -
>
>1. I mixed up lmFit (limma) with lm.fit (stats) - sorry, one character is
>not a lot when you are in a hurry ;)
>
>2. Being unsure of how lmFit handles a factor c(1,2,3,...) vs a numeric
>c(1,2,3,...), and thinking this example in the limma user's guide meant
>negative numbers could be confused with dye swaps:
>
>  design <- c(-1,1,-1,1)  # "The negative numbers in the design matrix
>                          # indicate the dye-swaps."
>  fit <- lmFit(MA, design)
>
>Sorry, no real excuse there, it just seemed logical when I typed it... doh.
>
>Anyway, after much searching I *think* I've found out how to use limma
>for (standard?) regression, and might even be able to solve the original
>poster's query...
>
>lm() and (I guess) lmFit have an implied intercept term, so the design
>in the original poster's fit
>
>  design <- model.matrix(~ -1 + TestScore1, pData(eset))
>
>removes the intercept and so forces the fit to go through zero?? For
>linear regression against a quantified variable you need to allow an
>intercept to be fitted (unless you have a reason to think the line goes
>through zero?)??
>
>Maybe I'm wrong, but here's my logic -
>
>  num <- c(1,2,3,4,5,6,7)  # measured growth, etc.
>
>  # Pearson correlation and its p-value
>  cor(exprs(esetgcrma)[1,], num)
>  cor.test(exprs(esetgcrma)[1,], num)
>
>  # lm - summary() reports the Pearson correlation squared (R^2) and the
>  # same p-value as above, and also gives the coefficients
>  lm(exprs(esetgcrma)[1,] ~ num)
>
>  # limma - produces the same coefficients as lm() above
>  design <- model.matrix(~ num)
>  fit <- lmFit(esetgcrma, design)
>
>So limma produces the 'same' results as the Pearson correlation and lm(),
>but you can carry them on for further analysis...

That's right. Remember that any function like lm() which accepts a formula uses the formula to convert the data into a response y and a design matrix X, and these quantities are then passed to lower-level functions.
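Matthew's point about the implied intercept can be seen directly in base R, with no limma objects needed (the covariate here is made up purely for illustration):

```r
num <- c(1, 2, 3, 4, 5, 6, 7)  # made-up numeric covariate

model.matrix(~ num)        # two columns: (Intercept) and num
model.matrix(~ -1 + num)   # one column: num only - no intercept, so the
                           # fitted line is forced through the origin
```

Dropping the intercept with `-1` only makes sense if you genuinely believe the response is zero when the covariate is zero.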
For example, lm() can accept a categorical factor, which is converted into a numeric design matrix. Design matrices are always numeric. For efficiency and flexibility reasons, lmFit() assumes that you have already formed your design matrix. For a numeric x, the standard design matrix is simply design <- cbind(1,x).

The other difference between lmFit() and lm() is in the storage of the results. lmFit() outputs a streamlined data object whose memory usage is directly proportional to the number of genes and the number of coefficients. The output from lm() requires memory proportional to the _square_ of the number of coefficients, which is prohibitive for a large model and a large number of genes.

Gordon

>So is this correct, or should I go back to wearing the dunce hat in the
>corner?
>
>Cheers,
>Matt
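As a quick check (base R only, with a made-up covariate), the hand-built design matrix is numerically identical to what model.matrix() produces for a numeric covariate:

```r
x <- c(1, 2, 3, 4, 5, 6, 7)   # made-up numeric covariate
design <- cbind(1, x)          # intercept column plus the covariate

# same numbers as the formula-built version, so fitting with either
# design gives identical coefficients
all(design == model.matrix(~ x))
```

A real limma call would then be `fit <- lmFit(eset, design)` followed by `fit <- eBayes(fit)`, as in the limma user's guide (here `eset` stands in for whatever expression object you have).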
GO Regression limma convert • 972 views