Question

Design and not estimable coefficients


Dave Canvhet ▴ 40

@dave-canvhet-5253

Last seen 9.6 years ago

Dear all,

I have the following targets:

    target
    filename     mutation  exp
    sample1.cel  wt        1
    sample2.cel  wt        1
    sample3.cel  mu        1
    sample4.cel  mu        1
    sample5.cel  wt        2
    sample6.cel  mu        2
    sample7.cel  mu        2
    sample8.cel  mu        2

I'm interested in mu vs wt, while taking into account the exp factor (which seems to have the most impact on the signal, as revealed by a PCA: PC1 seems to be associated with exp).

So I've set up the following design to model the signal as a combination of these two factors:

    design = matrix(0, ncol=4, nrow = 8)
    design[which(target$mutation == "wt"),1] = 1
    design[which(target$mutation == "mu"),2] = 1
    design[which(target$exp == 1),3] = 1
    design[which(target$exp == 2),4] = 1
    colnames(design) = c("wt","mu","exp1","exp2")

    design
         wt mu exp1 exp2
    [1,]  1  0    1    0
    [2,]  1  0    1    0
    [3,]  0  1    1    0
    [4,]  0  1    1    0
    [5,]  1  0    0    1
    [6,]  0  1    0    1
    [7,]  0  1    0    1
    [8,]  0  1    0    1

    fit = lmFit(x, design)

but it failed (even before I defined any contrast matrix), raising the following error:

    Coefficients not estimable: exp2

Can you please tell me what is wrong with such a design?

So I've used the approach described in the limma User's Guide:

    f <- paste(target$mutation,target$exp,sep="")
    f <- factor(f)
    [1] wt1 wt1 mu1 mu1 wt2 mu2 mu2 mu2
    Levels: mu1 mu2 wt1 wt2

    design <- model.matrix(~0+f)
    colnames(design) <- levels(f)
    design
      mu1 mu2 wt1 wt2
    1   0   0   1   0
    2   0   0   1   0
    3   1   0   0   0
    4   1   0   0   0
    5   0   0   0   1
    6   0   1   0   0
    7   0   1   0   0
    8   0   1   0   0
    attr(,"assign")
    [1] 1 1 1 1
    attr(,"contrasts")
    attr(,"contrasts")$f
    [1] "contr.treatment"

Using such a design matrix, can I correctly use the following contrast matrix to get the genes differentially expressed between mutant and wild type?

    cont.matrix <- makeContrasts(mu_vs_wt = (mu2+mu1) - (wt2+wt1),levels=design)

Many thanks for your answer.

Dave Canvhet

limma • 1.6k views

updated 12.0 years ago by Gordon Smyth 50k • written 12.0 years ago by Dave Canvhet ▴ 40

score 0 · Answer 1 · 2012-04-26


Gordon Smyth 50k

@gordon-smyth

Last seen 4 minutes ago

WEHI, Melbourne, Australia

Dear Dave,

This is a common type of design, with two groups and a batch effect. To find genes DE in mu vs wt adjusting for exp differences:

    mutation <- factor(target$mutation)
    mutation <- relevel(mutation, ref="wt")
    exp <- factor(target$exp)
    design <- model.matrix(~exp+mutation)
    fit <- lmFit(x,design)
    fit <- eBayes(fit)
    topTable(fit, coef=3)

Note that the design matrix has only three columns, not four as yours does. The batch effect requires only one extra column.

Your design matrix has a redundant column because you've defined the intercept term twice. The first and second columns (wt and mu) add up to the intercept column, and so do the third and fourth columns (exp1 and exp2).

You say that limma failed on your design matrix, but really it didn't. It correctly removed the redundant column from your design matrix, and gave you a warning to let you know.

Best wishes
Gordon
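The redundancy described above can be checked with base R alone. The following is a minimal sketch, not part of the original answer: it rebuilds the targets from the question, confirms that the hand-built four-column matrix has rank 3, and shows the three columns that `model.matrix(~exp+mutation)` produces. The object names `manual` and `manual_rank` are made up for this illustration.

```r
# Targets as listed in the question
target <- data.frame(
  filename = paste0("sample", 1:8, ".cel"),
  mutation = c("wt", "wt", "mu", "mu", "wt", "mu", "mu", "mu"),
  exp      = c(1, 1, 1, 1, 2, 2, 2, 2)
)

# Hand-built design from the question: one indicator per factor level
manual <- cbind(
  wt   = as.numeric(target$mutation == "wt"),
  mu   = as.numeric(target$mutation == "mu"),
  exp1 = as.numeric(target$exp == 1),
  exp2 = as.numeric(target$exp == 2)
)

# wt + mu and exp1 + exp2 are both a column of ones, so one column is redundant
all(manual[, "wt"] + manual[, "mu"] == manual[, "exp1"] + manual[, "exp2"])  # TRUE
manual_rank <- qr(manual)$rank   # 3, although the matrix has 4 columns

# Design from the answer: intercept, one batch column, one mutation column
mutation <- relevel(factor(target$mutation), ref = "wt")
exp <- factor(target$exp)
design <- model.matrix(~ exp + mutation)
colnames(design)   # "(Intercept)" "exp2" "mutationmu"
qr(design)$rank    # 3: every column is estimable
```

The third column, `mutationmu`, is the mu vs wt effect adjusted for exp, which is why the answer uses `coef=3` in `topTable`.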

12.0 years ago Gordon Smyth 50k


Dear List members,

My targets include 31 samples representing two groups (south and north) and four populations (FBK, NWL, SOU, CAR).

    > targets <- read.delim(file = "Targets.txt", stringsAsFactors = FALSE)
    > targets
           files group pop
    1  FBK01.txt     N FBK
    2  FBK04.txt     N FBK
    3  FBK05.txt     N FBK
    4  FBK08.txt     N FBK
    5  FBK09.txt     N FBK
    6  FBK10.txt     N FBK
    7  FBK13.txt     N FBK
    8  NWL02.txt     N NWL
    9  NWL04.txt     N NWL
    10 NWL06.txt     N NWL
    11 NWL08.txt     N NWL
    12 NWL09.txt     N NWL
    13 NWL10.txt     N NWL
    14 NWL11.txt     N NWL
    15 NWL14.txt     N NWL
    16 SOU01.txt     S SOU
    17 SOU02.txt     S SOU
    18 SOU03.txt     S SOU
    19 SOU04.txt     S SOU
    20 SOU07.txt     S SOU
    21 SOU09.txt     S SOU
    22 SOU11.txt     S SOU
    23 SOU15.txt     S SOU
    24 CAR01.txt     S CAR
    25 CAR02.txt     S CAR
    26 CAR03.txt     S CAR
    27 CAR04.txt     S CAR
    28 CAR05.txt     S CAR
    29 CAR11.txt     S CAR
    30 CAR12.txt     S CAR
    31 CAR14.txt     S CAR

We are interested in differential expression between the north and south groups, adjusting for differences between populations.

    > latitude <- factor(targets$group)
    > latitude <- relevel(latitude, ref="N")
    > pop <- factor(targets$pop)
    > design <- model.matrix(~pop+latitude)
    > rownames(design) <- rownames(d$samples)
    > design
       (Intercept) popFBK popNWL popSOU latitudeS
    1            1      1      0      0         0
    2            1      1      0      0         0
    3            1      1      0      0         0
    4            1      1      0      0         0
    5            1      1      0      0         0
    6            1      1      0      0         0
    7            1      1      0      0         0
    8            1      0      1      0         0
    9            1      0      1      0         0
    10           1      0      1      0         0
    11           1      0      1      0         0
    12           1      0      1      0         0
    13           1      0      1      0         0
    14           1      0      1      0         0
    15           1      0      1      0         0
    16           1      0      0      1         1
    17           1      0      0      1         1
    18           1      0      0      1         1
    19           1      0      0      1         1
    20           1      0      0      1         1
    21           1      0      0      1         1
    22           1      0      0      1         1
    23           1      0      0      1         1
    24           1      0      0      0         1
    25           1      0      0      0         1
    26           1      0      0      0         1
    27           1      0      0      0         1
    28           1      0      0      0         1
    29           1      0      0      0         1
    30           1      0      0      0         1
    31           1      0      0      0         1
    attr(,"assign")
    [1] 0 1 1 1 2
    attr(,"contrasts")
    attr(,"contrasts")$pop
    [1] "contr.treatment"
    attr(,"contrasts")$latitude
    [1] "contr.treatment"

I suspect there is something wrong with my design matrix, because when I tried to estimate the dispersions I got the following error:

    > d <- estimateGLMCommonDisp(d, design)
    Error in solve.default(R, t(beta)) :
      system is computationally singular: reciprocal condition number = 5.16368e-17

The session information follows. Any comments are much appreciated.

    > sessionInfo()
    R version 2.15.0 (2012-03-30)
    Platform: x86_64-unknown-linux-gnu (64-bit)

    locale:
     [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
     [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
     [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
     [7] LC_PAPER=C                 LC_NAME=C
     [9] LC_ADDRESS=C               LC_TELEPHONE=C
    [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    other attached packages:
    [1] qvalue_1.30.0 edgeR_2.6.0 limma_3.12.0

    loaded via a namespace (and not attached):
    [1] tcltk_2.15.0 tools_2.15.0

Thanks in advance!

Best wishes
Li
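One possible reading of the singularity error, offered as a sketch and not as an answer given in this thread: each population belongs to exactly one latitude group, so `latitude` is nested within `pop` and the `latitudeS` column is a linear combination of the other columns. The base R check below rebuilds the factors from the sample counts in the targets table (7 FBK, 8 NWL, 8 SOU, 8 CAR). The one-way design and the `south_vs_north` contrast at the end are assumptions introduced for illustration; whether averaging the two populations per group answers the biological question is a judgment call.

```r
# Rebuild the two factors from the targets table in the post
pop <- factor(rep(c("FBK", "NWL", "SOU", "CAR"), times = c(7, 8, 8, 8)))
latitude <- factor(ifelse(pop %in% c("FBK", "NWL"), "N", "S"),
                   levels = c("N", "S"))

# Every population falls in a single latitude group (nested factors)
table(pop, latitude)

# The additive design has 5 columns but only rank 4
design <- model.matrix(~ pop + latitude)
full_rank <- qr(design)$rank

# latitudeS is exactly (Intercept) - popFBK - popNWL
dependency <- design[, "latitudeS"] -
  (design[, "(Intercept)"] - design[, "popFBK"] - design[, "popNWL"])
all(dependency == 0)

# Hypothetical alternative: one column per population, then compare the
# average of the southern populations with the average of the northern ones
design_pop <- model.matrix(~ 0 + pop)
colnames(design_pop) <- levels(pop)   # CAR FBK NWL SOU
south_vs_north <- c(CAR = 0.5, FBK = -0.5, NWL = -0.5, SOU = 0.5)
pop_rank <- qr(design_pop)$rank       # full rank: 4 columns, rank 4
```

With nested factors like these, a latitude effect cannot be separated from population differences by adding both terms to the model; any north vs south comparison has to be expressed through the population means.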

12.0 years ago Wang, Li ▴ 180


Dear Gordon,

Many thanks for your answer! Clearly, I need to learn more about the intercept term, which is probably a classical parameter in statistics and linear modelling (and one I'm not familiar with).

Also, could you please give me your opinion on the second way of building the design matrix, summarized below?

    f <- paste(target$mutation,target$exp,sep="")
    f <- factor(f)
    design <- model.matrix(~0+f)
    colnames(design) <- levels(f)

Using such a design matrix, can I correctly use the following contrast matrix to get the genes differentially expressed between mutant and wild type?

    cont.matrix <- makeContrasts(mu_vs_wt = (mu2+mu1) - (wt2+wt1),levels=design)

Again, many thanks for your time.

Dave
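The thread does not record a reply to this follow-up, so the following is only a sketch of what the proposed contrast computes, using base R and a noise-free toy response whose values (baseline 5, exp 2 adds 3, mutant adds 2) are invented for illustration. In the group-means design each coefficient is a cell mean, so the contrast `(mu2+mu1) - (wt2+wt1)` is a valid mu vs wt comparison, but it estimates twice the mutant effect. Dividing by 2 puts the estimate on the usual log-fold-change scale; rescaling a contrast changes the estimate and its standard error by the same factor, so the t-statistics and p-values are unaffected.

```r
# Factors from the question
mutation <- c("wt", "wt", "mu", "mu", "wt", "mu", "mu", "mu")
expt     <- c(1, 1, 1, 1, 2, 2, 2, 2)

# Group-means design: one column per mutation/experiment combination
f <- factor(paste(mutation, expt, sep = ""))
design <- model.matrix(~ 0 + f)
colnames(design) <- levels(f)   # mu1 mu2 wt1 wt2

# Hypothetical noise-free response for a single gene
y <- 5 + 3 * (expt == 2) + 2 * (mutation == "mu")

# Least-squares fit: the coefficients are the four cell means
cell_means <- lm.fit(design, y)$coefficients

# Contrast as written in the post, and the same contrast averaged
contrast_sum <- c(mu1 = 1, mu2 = 1, wt1 = -1, wt2 = -1)
contrast_avg <- contrast_sum / 2

est_sum <- sum(contrast_sum[colnames(design)] * cell_means)  # 4: twice the mutant effect
est_avg <- sum(contrast_avg[colnames(design)] * cell_means)  # 2: the mutant effect itself
```

Note one difference from the additive model `~exp+mutation`: the group-means design allows the mutant effect to differ between experiments, and this contrast weights the two experiments equally, whereas the additive model assumes a single mutant effect shared by both.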

12.0 years ago Dave Canvhet ▴ 40