LIMMA experimental design affecting modelmatrix

0

Entering edit mode

Aubin-Horth Nadia ▴ 60

@aubin-horth-nadia-3844

Last seen 9.6 years ago

We are analysing a cDNA microarray dataset in LIMMA with the following design and we run into "Coefficients not estimable" comments. R = 2.9.0 limma=2.18 We have two groups, "A" and "D" with 6 biological replicates each (indicated in the targets file). We are interested in significant gene expression differences between A and D on average. A given biological replicate of "A" was compared to a biological replicate of "D", with a dye-swap. A1=D1 A2=D2 A3=D3 A4=D4 A5=D5 A6=D6 Therefore, we have six mini-experiments that are not connected one to the other. After normalisation, we use teh following design with the goal of doing a contrast as shown below design <- modelMatrix(targets, ref="D1") design <- cbind(Dye=1, design) fit<-lmFit(MAptip.nba.scale,design) here we get: Coefficients not estimable: D5 A4 D2 D6 D3 And we checked with > is.fullrank(design) and we get: [1] FALSE Our question is, is our experimental design (non loop, non reference design) with samples not being directly compared on a microarray causing these non estimable coefficients? If so, is there a way that we can keep the information on biological replicates and still use LIMMA? This is the contrast we were planning to use (which of course does not work) cont.matrix <- makeContrasts(AvsD = (A1 + A2 + A3 + A4 + A5 + A6 - D2 - D3 - D4 - D5 - D6)/6, levels=design) fit2 <- contrasts.fit(fit, cont.matrix) Error in contrasts.fit(fit, cont.matrix) : trying to take contrast of non-estimable coefficient In addition: Warning message: In any(contrasts[-est, ]) : coercing argument of type 'double' to logical fit3 <- eBayes(fit2) Thank you Nadia Aubin-Horth Assistant professor Biology Department Institut de Biologie Int?grative et des Syst?mes Universit? Laval

Microarray limma a4 Microarray limma a4 • 1.3k views

ADD COMMENT • link updated 14.4 years ago by James W. MacDonald 65k • written 14.4 years ago by Aubin-Horth Nadia ▴ 60

0

Entering edit mode

James W. MacDonald 65k

@james-w-macdonald-5106

Last seen 1 hour ago

United States

Hi Nadia, Aubin-Horth Nadia wrote: > We are analysing a cDNA microarray dataset in LIMMA with the following > design and we run into "Coefficients not estimable" comments. > > R = 2.9.0 > limma=2.18 > > We have two groups, "A" and "D" with 6 biological replicates each > (indicated in the targets file). We are interested in significant gene > expression differences between A and D on average. A given biological > replicate of "A" was compared to a biological replicate of "D", with a > dye-swap. > A1=D1 > A2=D2 > A3=D3 > A4=D4 > A5=D5 > A6=D6 I don't see a dye-swap here. > > Therefore, we have six mini-experiments that are not connected one to > the other. > > After normalisation, we use teh following design with the goal of doing > a contrast as shown below > > design <- modelMatrix(targets, ref="D1") > design <- cbind(Dye=1, design) Without a dye-swap (and you don't indicate one above), you cannot estimate dye bias. All you can do is compare the two sample types. Since the comparison is intrinsic to the data, all you want to do is compute the mean of the ratios (or log ratios, actually) and then test to see if the mean is different from zero. In this very simple case, you don't have to create a design matrix, as limma will produce one for you. So you just do: fit <- lmFit(MAptip.nba.scale) fit2 <- eBayes(fit) topTable(fit2) FYI, what you are doing is to treat each sample as if it isn't replicated, which is why you are running into problems. Best, Jim > > fit<-lmFit(MAptip.nba.scale,design) > > here we get: > Coefficients not estimable: D5 A4 D2 D6 D3 > > And we checked with > > is.fullrank(design) > and we get: > [1] FALSE > > Our question is, is our experimental design (non loop, non reference > design) with samples not being directly compared on a microarray causing > these non estimable coefficients? If so, is there a way that we can keep > the information on biological replicates and still use LIMMA? > > This is the contrast we were planning to use (which of course does not > work) > cont.matrix <- makeContrasts(AvsD = (A1 + A2 + A3 + A4 + A5 + A6 > - D2 - D3 - D4 - D5 - D6)/6, levels=design) > fit2 <- contrasts.fit(fit, cont.matrix) > Error in contrasts.fit(fit, cont.matrix) : > trying to take contrast of non-estimable coefficient > In addition: Warning message: > In any(contrasts[-est, ]) : coercing argument of type 'double' to logical > fit3 <- eBayes(fit2) > > > Thank you > > Nadia Aubin-Horth > Assistant professor > Biology Department > Institut de Biologie Int?grative et des Syst?mes > Universit? Laval > > _______________________________________________ > Bioconductor mailing list > Bioconductor at stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor -- James W. MacDonald, M.S. Biostatistician Douglas Lab University of Michigan Department of Human Genetics 5912 Buhl 1241 E. Catherine St. Ann Arbor MI 48109-5618 734-615-7826 ********************************************************** Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues

ADD COMMENT • link 14.4 years ago James W. MacDonald 65k

0

Entering edit mode

Hi James Thank you for your response. We do have a dye-swap, which I represented with the "=" symbol between my biological replicates. Sorry for the confusion. Detailed experimental design (first sample is cy3, second sample is cy5): slide 1: A1-D1 slide 2: D1-A1 slide 3: A2-D2 slide 4: D2-A2 slide 5: A3-D3 slide 6: D3-A3 slide 7: A4-D4 slide 8: D4-A4 slide 9: A5-D5 slide 10: D5-A5 slide 11: A6-D6 slide 12: D6-A6 The comparison of interest is the average difference in expression between group A and group D, taking into account technical and biological replicates. A1 to A6 are different individuals of the group A, and D1 to D6 are different individuals of the group D, and these individuals (fish actually) are wild caught and may vary in gene expression even within the group. In your message you say: > All you can do is compare the two sample types. Since > the comparison is intrinsic to the data, all you want to do is compute > the mean of the ratios (or log ratios, actually) and then test to > see if > the mean is different from zero. > > In this very simple case, you don't have to create a design matrix, as > limma will produce one for you. So you just do: > > fit <- lmFit(MAptip.nba.scale) > fit2 <- eBayes(fit) > topTable(fit2) Reading the LIMMA User guide (section on technical replication, p.36), I was under the impression that since I wanted to keep information for each of the 6 biological replicates within a group in the test, as well as include technical replicates which are not independent, I needed to fit a linear model and then use a contrast. This is why we wanted to use one of the individuals as a reference in the design matrix and carry on with fitting the linear model and creating a contrast matrix that represents the average difference between A and D. When I test is.fullrank(design), I get "FALSE" as an answer. I suspect that our experimental design is the cause, but I may be wrong. Thank you Nadia Aubin-Horth Laval University On Dec 10, 2009, at 2:05 PM, James W. MacDonald wrote: > Hi Nadia, > > Aubin-Horth Nadia wrote: >> We are analysing a cDNA microarray dataset in LIMMA with the >> following >> design and we run into "Coefficients not estimable" comments. >> >> R = 2.9.0 >> limma=2.18 >> >> We have two groups, "A" and "D" with 6 biological replicates each >> (indicated in the targets file). We are interested in significant >> gene >> expression differences between A and D on average. A given biological >> replicate of "A" was compared to a biological replicate of "D", >> with a >> dye-swap. >> A1=D1 >> A2=D2 >> A3=D3 >> A4=D4 >> A5=D5 >> A6=D6 > > I don't see a dye-swap here. > >> >> Therefore, we have six mini-experiments that are not connected one to >> the other. >> >> After normalisation, we use teh following design with the goal of >> doing >> a contrast as shown below >> >> design <- modelMatrix(targets, ref="D1") >> design <- cbind(Dye=1, design) > > Without a dye-swap (and you don't indicate one above), you cannot > estimate dye bias. All you can do is compare the two sample types. > Since > the comparison is intrinsic to the data, all you want to do is compute > the mean of the ratios (or log ratios, actually) and then test to > see if > the mean is different from zero. > > In this very simple case, you don't have to create a design matrix, as > limma will produce one for you. So you just do: > > fit <- lmFit(MAptip.nba.scale) > fit2 <- eBayes(fit) > topTable(fit2) > > FYI, what you are doing is to treat each sample as if it isn't > replicated, which is why you are running into problems. > > Best, > > Jim > > >> >> fit<-lmFit(MAptip.nba.scale,design) >> >> here we get: >> Coefficients not estimable: D5 A4 D2 D6 D3 >> >> And we checked with >>> is.fullrank(design) >> and we get: >> [1] FALSE >> >> Our question is, is our experimental design (non loop, non reference >> design) with samples not being directly compared on a microarray >> causing >> these non estimable coefficients? If so, is there a way that we can >> keep >> the information on biological replicates and still use LIMMA? >> >> This is the contrast we were planning to use (which of course does >> not >> work) >> cont.matrix <- makeContrasts(AvsD = (A1 + A2 + A3 + A4 + A5 + A6 >> - D2 - D3 - D4 - D5 - D6)/6, levels=design) >> fit2 <- contrasts.fit(fit, cont.matrix) >> Error in contrasts.fit(fit, cont.matrix) : >> trying to take contrast of non-estimable coefficient >> In addition: Warning message: >> In any(contrasts[-est, ]) : coercing argument of type 'double' to >> logical >> fit3 <- eBayes(fit2) >> >> >> Thank you >> >> Nadia Aubin-Horth >> Assistant professor >> Biology Department >> Institut de Biologie Int?grative et des Syst?mes >> Universit? Laval >> >> _______________________________________________ >> Bioconductor mailing list >> Bioconductor at stat.math.ethz.ch >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> Search the archives: >> http://news.gmane.org/gmane.science.biology.informatics.conductor > > -- > James W. MacDonald, M.S. > Biostatistician > Douglas Lab > University of Michigan > Department of Human Genetics > 5912 Buhl > 1241 E. Catherine St. > Ann Arbor MI 48109-5618 > 734-615-7826 > ********************************************************** > Electronic Mail is not secure, may not be read every day, and should > not be used for urgent or sensitive issues >

ADD REPLY • link 14.4 years ago Aubin-Horth Nadia ▴ 60

0

Entering edit mode

Hi Nadia, Aubin-Horth Nadia wrote: > Hi James > > Thank you for your response. > > We do have a dye-swap, which I represented with the "=" symbol between > my biological replicates. Sorry for the confusion. > > Detailed experimental design (first sample is cy3, second sample is cy5): > slide 1: A1-D1 > slide 2: D1-A1 > slide 3: A2-D2 > slide 4: D2-A2 > slide 5: A3-D3 > slide 6: D3-A3 > slide 7: A4-D4 > slide 8: D4-A4 > slide 9: A5-D5 > slide 10: D5-A5 > slide 11: A6-D6 > slide 12: D6-A6 > > The comparison of interest is the average difference in expression > between group A and group D, taking into account technical and > biological replicates. A1 to A6 are different individuals of the group > A, and D1 to D6 are different individuals of the group D, and these > individuals (fish actually) are wild caught and may vary in gene > expression even within the group. > > In your message you say: > >> All you can do is compare the two sample types. Since >> the comparison is intrinsic to the data, all you want to do is compute >> the mean of the ratios (or log ratios, actually) and then test to see if >> the mean is different from zero. >> >> In this very simple case, you don't have to create a design matrix, as >> limma will produce one for you. So you just do: >> >> fit <- lmFit(MAptip.nba.scale) >> fit2 <- eBayes(fit) >> topTable(fit2) > > Reading the LIMMA User guide (section on technical replication, p.36), I > was under the impression that since I wanted to keep information for > each of the 6 biological replicates within a group in the test, as well > as include technical replicates which are not independent, I needed to > fit a linear model and then use a contrast. This is why we wanted to use > one of the individuals as a reference in the design matrix and carry on > with fitting the linear model and creating a contrast matrix that > represents the average difference between A and D. When I test > is.fullrank(design), I get "FALSE" as an answer. I suspect that our > experimental design is the cause, but I may be wrong. Yes, it is your design matrix. You don't show exactly what it looks like, but if it isn't of full rank it won't work regardless. I think you want to emulate the analysis on p37 (at least that's the page in my version of the Limma User's Guide). You have dye swaps and unlike the analysis I think you are following (which doesn't have equal technical replication), you have equal technical replication so can use duplicateCorrelation. The relevant section of the user's guide is this: "If the technical replicates were in dye-swap pairs as FileName Cy3 Cy5 File1 wt1 mu1 File2 mu1 wt1 File3 wt2 mu2 File4 mu2 wt2 then one might use > design <- c(1, -1, 1, -1) > corfit <- duplicateCorrelation(MA, design, ndups = 1, block = biolrep) > fit <- lmFit(MA, design, block = biolrep, cor = corfit$consensus) > fit <- eBayes(fit) > topTable(fit, adjust = "BH")" Where biolrep in this case is c(1,1,2,2). If you are concerned about dye-bias, you could augment the design matrix by adding a vector of all 1s as the first column. Best, Jim > > Thank you > > Nadia Aubin-Horth > Laval University > > On Dec 10, 2009, at 2:05 PM, James W. MacDonald wrote: > >> Hi Nadia, >> >> Aubin-Horth Nadia wrote: >>> We are analysing a cDNA microarray dataset in LIMMA with the following >>> design and we run into "Coefficients not estimable" comments. >>> >>> R = 2.9.0 >>> limma=2.18 >>> >>> We have two groups, "A" and "D" with 6 biological replicates each >>> (indicated in the targets file). We are interested in significant gene >>> expression differences between A and D on average. A given biological >>> replicate of "A" was compared to a biological replicate of "D", with a >>> dye-swap. >>> A1=D1 >>> A2=D2 >>> A3=D3 >>> A4=D4 >>> A5=D5 >>> A6=D6 >> >> I don't see a dye-swap here. >> >>> >>> Therefore, we have six mini-experiments that are not connected one to >>> the other. >>> >>> After normalisation, we use teh following design with the goal of doing >>> a contrast as shown below >>> >>> design <- modelMatrix(targets, ref="D1") >>> design <- cbind(Dye=1, design) >> >> Without a dye-swap (and you don't indicate one above), you cannot >> estimate dye bias. All you can do is compare the two sample types. Since >> the comparison is intrinsic to the data, all you want to do is compute >> the mean of the ratios (or log ratios, actually) and then test to see if >> the mean is different from zero. >> >> In this very simple case, you don't have to create a design matrix, as >> limma will produce one for you. So you just do: >> >> fit <- lmFit(MAptip.nba.scale) >> fit2 <- eBayes(fit) >> topTable(fit2) >> >> FYI, what you are doing is to treat each sample as if it isn't >> replicated, which is why you are running into problems. >> >> Best, >> >> Jim >> >> >>> >>> fit<-lmFit(MAptip.nba.scale,design) >>> >>> here we get: >>> Coefficients not estimable: D5 A4 D2 D6 D3 >>> >>> And we checked with >>>> is.fullrank(design) >>> and we get: >>> [1] FALSE >>> >>> Our question is, is our experimental design (non loop, non reference >>> design) with samples not being directly compared on a microarray causing >>> these non estimable coefficients? If so, is there a way that we can keep >>> the information on biological replicates and still use LIMMA? >>> >>> This is the contrast we were planning to use (which of course does not >>> work) >>> cont.matrix <- makeContrasts(AvsD = (A1 + A2 + A3 + A4 + A5 + A6 >>> - D2 - D3 - D4 - D5 - D6)/6, levels=design) >>> fit2 <- contrasts.fit(fit, cont.matrix) >>> Error in contrasts.fit(fit, cont.matrix) : >>> trying to take contrast of non-estimable coefficient >>> In addition: Warning message: >>> In any(contrasts[-est, ]) : coercing argument of type 'double' to >>> logical >>> fit3 <- eBayes(fit2) >>> >>> >>> Thank you >>> >>> Nadia Aubin-Horth >>> Assistant professor >>> Biology Department >>> Institut de Biologie Int?grative et des Syst?mes >>> Universit? Laval >>> >>> _______________________________________________ >>> Bioconductor mailing list >>> Bioconductor at stat.math.ethz.ch >>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>> Search the archives: >>> http://news.gmane.org/gmane.science.biology.informatics.conductor >> >> -- >> James W. MacDonald, M.S. >> Biostatistician >> Douglas Lab >> University of Michigan >> Department of Human Genetics >> 5912 Buhl >> 1241 E. Catherine St. >> Ann Arbor MI 48109-5618 >> 734-615-7826 >> ********************************************************** >> Electronic Mail is not secure, may not be read every day, and should >> not be used for urgent or sensitive issues >> > -- James W. MacDonald, M.S. Biostatistician Douglas Lab University of Michigan Department of Human Genetics 5912 Buhl 1241 E. Catherine St. Ann Arbor MI 48109-5618 734-615-7826 ********************************************************** Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues

ADD REPLY • link 14.4 years ago James W. MacDonald 65k

Login before adding your answer.