Question

limma's eBayes error: No residual degrees of freedom in linear model

0

Entering edit mode

Gordon Smyth 53k

@gordon-smyth

Last seen 13 hours ago

WEHI, Melbourne, Australia

>[BioC] limma's eBayes error: No residual degrees of freedom in linear model >Li,Qinghong,ST.LOUIS,Molecular Biology Qinghong.Li at rdmo.nestle.com >Tue Nov 15 22:09:13 CET 2005 > >Hi BioC, > >I was runing eBayes and got the above error. I searched the old archives >of BioC, and has found similar problem poseted by Ken Ninh: >http://files.protsuggest.org/biocond/html/4652.html > >I checked the summary(fit$df.residual), all zero's. But the >fit1<-lmFit(normData, design) and fit2<-contrasts.fit(fit1, cont.matrix) >ran properly. I checked normData with boxplots, and they looked fine and >well normalized. Here is my design matrix: > > design > C.M.1 C.F.2 C.M.3 C.F.3 R.M.1 R.F.2 R.M.3 R.F.3 (C/R: > control/treatment; F/M: male/female; 1,2,3 are sibling pairs) >1 1 0 0 0 0 0 0 0 >2 0 1 0 0 0 0 0 0 >3 0 0 1 0 0 0 0 0 >4 0 0 0 1 0 0 0 0 >5 0 0 0 0 1 0 0 0 >6 0 0 0 0 0 1 0 0 >7 0 0 0 0 0 0 1 0 >8 0 0 0 0 0 0 0 1 >attr(,"assign") >[1] 1 1 1 1 1 1 1 1 >attr(,"contrasts") >attr(,"contrasts")$TBS >[1] "contr.treatment" > >contrast matrix > > > cont.matrix > Diff >C.M.1 -1 >C.F.2 -1 >C.M.3 -1 >C.F.3 -1 >R.M.1 1 >R.F.2 1 >R.M.3 1 >R.F.3 1 > >What could be the possible reasons for the error and how to fix that? > >Thanks >Johnny Dear Johnny, I have to tell you that what you're doing, i.e., the design matrix you've created, is not very sensible statistically. Hence the non-useful results you are getting from limma. Here are some steps that you can take to do something about it: 1. Consult someone with statistical experience at your organization who can tell you about replication and degrees of freedom for error. 2. To get meaningful help from this list, you need to explain a little more about your experiment. In particular you need to explain what you are hoping to learn scientifically from your data and what comparisons are of interest to you. 3. Explain what documentation you have read and what examples you are attempting to follow here. That would help us understand what you need to know, and may also help us to improve the documentation. Best wishes Gordon

limma limma • 1.1k views

ADD COMMENT • link 20.1 years ago Gordon Smyth 53k

score 0 · Answer 1 · 2005-11-19

>[BioC] limma's eBayes error: No residual degrees of freedom in linear >model >Li,Qinghong,ST.LOUIS,Molecular Biology Qinghong.Li at rdmo.nestle.com >Thu Nov 17 21:40:38 CET 2005 > >Dear Gordon, > >I would like to thank you for pointing out the problem. This is the >first time I tried to use Limma. The main reference materials I used >is the Ch. 23 of Book Bioinformatics and Comp. Biol. solutions Using R >and BioC. and the lab notes from microarray short course @ IBC 2004. >In particular, the example I was following was the 23.10 in the book, >factorial designs where we have five chips, 2 for WT and 3 for >Mutants. In each genotype, there are unstimulated and stimulated. I >thought that resembled the experimental designs in my case (target >file): > >FileName Sib Sex Treatment >anim1 1 M C >anim2 2 F C >anim3 3 M C >anim4 3 F C >anim5 1 M R >anim6 2 F R >anim7 3 M R >anim8 3 F R > >Where Sib indicates sibling pairs: anim1 and anim5 are siblings and so >forth. My question is quite simple: I would like to know if there is >any difference between C and R in treatment for now. Although I might >be interested in the gender effect (and/or gender*treatment) in a >later time. > >First, I read in all CEL files and normalized the chips using Limma >package and it looked quite good in diagnostic plots. Say I called >that file "eset", which is an exprSet file. I used the following >scripts to create the design matrix: > >> TBS<-paste(target$Treatment, target$Sex, target$Sib, sep=".") >> TBS<-factor(TBS, levels=unique(TBS)) >> design<-model.matrix(~0+TBS) >> colnames(design)<-levels(TBS) > >>cont.matrix<-makeContrasts(diff= >(C.M.1+C.F.2+C.M.3+C.F.3)-(R.M.1+R.F.2+R.M.3+R.F.3)) > >model fitting: >>fit1<-lmFit(eset,design) >>fit2<-contrasts.fit(fit1, cont.matrix) >>fit3<-eBayes(fit2) (this is where I got the error message) > >Best wishes, >Johnny > >-----Original Message----- >From: Gordon Smyth [mailto:smyth at wehi.edu.au] >Sent: Wednesday, November 16, 2005 6:51 PM >To: Li at wehi.edu.au; Qinghong at wehi.edu.au; ST.LOUIS at >wehi.edu.au; >Li,Qinghong,ST.LOUIS,Molecular Biology >Cc: BioC Mailing List >Subject: [BioC] limma's eBayes error: No residual degrees of freedom >in >linear model > > > >>[BioC] limma's eBayes error: No residual degrees of freedom in linear >model >>Li,Qinghong,ST.LOUIS,Molecular Biology Qinghong.Li at rdmo.nestle.com >>Tue Nov 15 22:09:13 CET 2005 >> >>Hi BioC, >> >>I was runing eBayes and got the above error. I searched the old >archives >>of BioC, and has found similar problem poseted by Ken Ninh: >>http://files.protsuggest.org/biocond/html/4652.html >> >>I checked the summary(fit$df.residual), all zero's. But the >>fit1<-lmFit(normData, design) and fit2<-contrasts.fit(fit1, >cont.matrix) >>ran properly. I checked normData with boxplots, and they looked fine >and >>well normalized. Here is my design matrix: >> > design >> C.M.1 C.F.2 C.M.3 C.F.3 R.M.1 R.F.2 R.M.3 R.F.3 (C/R: >> control/treatment; F/M: male/female; 1,2,3 are sibling pairs) >>1 1 0 0 0 0 0 0 0 >>2 0 1 0 0 0 0 0 0 >>3 0 0 1 0 0 0 0 0 >>4 0 0 0 1 0 0 0 0 >>5 0 0 0 0 1 0 0 0 >>6 0 0 0 0 0 1 0 0 >>7 0 0 0 0 0 0 1 0 >>8 0 0 0 0 0 0 0 1 >>attr(,"assign") >>[1] 1 1 1 1 1 1 1 1 >>attr(,"contrasts") >>attr(,"contrasts")$TBS >>[1] "contr.treatment" >> >>contrast matrix >> >> > cont.matrix >> Diff >>C.M.1 -1 >>C.F.2 -1 >>C.M.3 -1 >>C.F.3 -1 >>R.M.1 1 >>R.F.2 1 >>R.M.3 1 >>R.F.3 1 >> >>What could be the possible reasons for the error and how to fix that? >> >>Thanks >>Johnny > >Dear Johnny, > >I have to tell you that what you're doing, i.e., the design matrix >you've created, is not very sensible statistically. Hence the >non-useful results you are getting from limma. Here are some steps >that you can take to do something about it: > >1. Consult someone with statistical experience at your organization >who can tell you about replication and degrees of freedom for error. > >2. To get meaningful help from this list, you need to explain a little >more about your experiment. In particular you need to explain what you >are hoping to learn scientifically from your data and what comparisons >are of interest to you. > >3. Explain what documentation you have read and what examples you are >attempting to follow here. That would help us understand what you need >to know, and may also help us to improve the documentation. > >Best wishes >Gordon Dear Johnny, Thanks for supplying the extra information about your experiment and about the documentation you are trying to follow. I'm sorry though that you're still not using statistical advice from your own group. I note that Naomi Altman has already given you sensible advice on your experiment: https://www.stat.math.ethz.ch/pipermail/bioconductor/2005-November/010 773.html Have you tried to follow her advice? Following Naomi's suggestion, you could use the block argument in lmFit() as in Section 8.2 of the Limma User's Guide: design <- model.matrix(~factor(Treatment), data=targets) corfit <- duplicateCorrelation(eset,design,block=targets$Sib) corfit$consensus fit <- lmFit(eset,design,block=targets$Sib,correlation=corfit$consensus) fit <- eBayes(fit) topTable(fit,coef=2) This tests for a treatment difference while treating the Sib pairs as random effects. A somewhat simpler approach would be compute a moderated paired t-test: design <- model.matrix(~factor(Sib)+factor(Treatment), data=targets) fit <- lmFit(set, design) fit <- eBayes(fit) topTable(fit,coef=4) Thanks for pointing out that you tried to follow the factorial example. This does resemble your example in a way, but there is a critical difference. In the factorial design, the interaction terms is of principle interest. In your example, the Sib pairs are a blocking variable, not a treatment factor, and you want to ignore the interaction. Also, you didn't actually follow the factorial example. The factorial example explains how to fit a model with four coefficients, because there are four unique treatment combinations. There are more than four arrays, so this leaves some degrees of freedom over to estimate the standard deviation. You however created a distinct treatment name for every single array and hence fitted a model with a coefficient for every array, leaving no residual degrees freedom. If you don't understand why this is wrong, you need to consult a statistician in your own group. Are you asking your statistician the right questions? Most statisticians would be happy to oblige if you asked them a well posed question like "I've fitted a linear model with no residual degrees of freedom--what does this mean" or "how do I use a linear model program to calculate a paired t-test". They don't need to know anything about microarrays to help you with things like this. Hope this helps Gordon

score 0 · Answer 2 · 2005-11-21

>[BioC] limma's eBayes error: No residual degrees of freedom in linear >model >Li,Qinghong,ST.LOUIS,Molecular Biology Qinghong.Li at rdmo.nestle.com >Thu Nov 17 21:40:38 CET 2005 > >Dear Gordon, > >I would like to thank you for pointing out the problem. This is the >first time I tried to use Limma. The main reference materials I used >is the Ch. 23 of Book Bioinformatics and Comp. Biol. solutions Using R >and BioC. and the lab notes from microarray short course @ IBC 2004. >In particular, the example I was following was the 23.10 in the book, >factorial designs where we have five chips, 2 for WT and 3 for >Mutants. In each genotype, there are unstimulated and stimulated. I >thought that resembled the experimental designs in my case (target >file): > >FileName Sib Sex Treatment >anim1 1 M C >anim2 2 F C >anim3 3 M C >anim4 3 F C >anim5 1 M R >anim6 2 F R >anim7 3 M R >anim8 3 F R > >Where Sib indicates sibling pairs: anim1 and anim5 are siblings and so >forth. My question is quite simple: I would like to know if there is >any difference between C and R in treatment for now. Although I might >be interested in the gender effect (and/or gender*treatment) in a >later time. > >First, I read in all CEL files and normalized the chips using Limma >package and it looked quite good in diagnostic plots. Say I called >that file "eset", which is an exprSet file. I used the following >scripts to create the design matrix: > >> TBS<-paste(target$Treatment, target$Sex, target$Sib, sep=".") >> TBS<-factor(TBS, levels=unique(TBS)) >> design<-model.matrix(~0+TBS) >> colnames(design)<-levels(TBS) > >>cont.matrix<-makeContrasts(diff= >(C.M.1+C.F.2+C.M.3+C.F.3)-(R.M.1+R.F.2+R.M.3+R.F.3)) > >model fitting: >>fit1<-lmFit(eset,design) >>fit2<-contrasts.fit(fit1, cont.matrix) >>fit3<-eBayes(fit2) (this is where I got the error message) > >Best wishes, >Johnny Thanks for your detailed explanation here which makes what you're doing far clearer. I've just realised what the reasoning was behind your analysis, so I'll add a few more comments to my previous email. I see that you're trying to extend the two-factor example in the book chapter to three factors. You can't do this because you don't have enough data. Your 3-way factorial model would generate 3 x 2 x 2 = 12 factor combinations, so you would need at least 13 arrays whereas you only have 8. There actually is no need to fit a 3-way factorial model. Sib pairs are not a factor but rather a blocking variable. You do not need to allow for interactions between Sib pairs and the other two factors. Best wishes Gordon