Unexpected results using limma with numerical factor - regression intercept?
@gordon-smyth
WEHI, Melbourne, Australia
At 11:11 PM 26/08/2004, Matthew Hannah wrote:

>Sorry for getting things confused. I'd like to ask a statistician but we
>don't really have one :( But hopefully I'm improving...
>
>My mistakes were -
>
>1. I mixed up lmFit (limma) with lm.fit (stats) - sorry, one character is
>not a lot when you are in a hurry ;)
>
>2. Being unsure of how lmFit handles a factor c(1,2,3,...) vs a numeric
>c(1,2,3,...), and thinking this example in the limma user's guide meant
>negative numbers could be confused with dye swaps:
>
>  design <- c(-1,1,-1,1)  # "The negative numbers in the design matrix
>                          # indicate the dye-swaps."
>  fit <- lmFit(MA, design)
>
>Sorry, no real excuse there, it just seemed logical when I typed it... doh.
>
>Anyway, after much searching I *think* I've found out how to use limma
>for (standard?) regression, and might even be able to solve the original
>poster's query...
>
>lm() and (I guess) lmFit have an implied intercept term, so the design
>in the original poster's fit
>
>  design <- model.matrix(~ -1 + TestScore1, pData(eset))
>
>removes the intercept and so forces the fit to go through zero?? For
>linear regression against a quantified variable you need to allow an
>intercept to be fitted (unless you have a reason to think the line goes
>through zero?)??
>
>Maybe I'm wrong, but here's my logic -
>
>  num <- c(1,2,3,4,5,6,7)  # measured growth, etc.
>
>  # Pearson correlation and its p-value
>  cor(exprs(esetgcrma)[1,], num)
>  cor.test(exprs(esetgcrma)[1,], num)
>
>  # lm - summary() reports the Pearson correlation squared (R^2) and the
>  # same p-value as above, and also gives the coefficients
>  lm(exprs(esetgcrma)[1,] ~ num)
>
>  # limma - produces the same coefficients as lm() above
>  design <- model.matrix(~ num)
>  fit <- lmFit(esetgcrma, design)
>
>So limma produces the 'same' results as the Pearson correlation and lm(),
>but you can carry them on for further analysis...

That's right. Remember that any function like lm() which accepts a formula uses the formula to convert the data into a response y and a design matrix X, and these quantities are then passed to lower-level functions.
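Matthew's point about the implied intercept can be seen directly in base R, with no limma objects needed (the covariate here is made up purely for illustration):

```r
num <- c(1, 2, 3, 4, 5, 6, 7)  # made-up numeric covariate

model.matrix(~ num)        # two columns: (Intercept) and num
model.matrix(~ -1 + num)   # one column: num only - no intercept, so the
                           # fitted line is forced through the origin
```

Dropping the intercept with `-1` only makes sense if you genuinely believe the response is zero when the covariate is zero.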
For example, lm() can accept a categorical factor, which is converted into a numeric design matrix. Design matrices are always numeric. For efficiency and flexibility reasons, lmFit() assumes that you have already formed your design matrix. For a numeric x, the standard design matrix is simply design <- cbind(1,x).

The other difference between lmFit() and lm() is in the storage of the results. lmFit() outputs a streamlined data object whose memory usage is directly proportional to the number of genes and the number of coefficients. The output from lm() requires memory proportional to the _square_ of the number of coefficients, which is prohibitive for a large model and a large number of genes.

Gordon

>So is this correct, or should I go back to wearing the dunce hat in the
>corner?
>
>Cheers,
>Matt
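As a quick check (base R only, with a made-up covariate), the hand-built design matrix is numerically identical to what model.matrix() produces for a numeric covariate:

```r
x <- c(1, 2, 3, 4, 5, 6, 7)   # made-up numeric covariate
design <- cbind(1, x)          # intercept column plus the covariate

# same numbers as the formula-built version, so fitting with either
# design gives identical coefficients
all(design == model.matrix(~ x))
```

A real limma call would then be `fit <- lmFit(eset, design)` followed by `fit <- eBayes(fit)`, as in the limma user's guide (here `eset` stands in for whatever expression object you have).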
GO Regression limma convert • 972 views