Search
Question: limma warning: Coefficients not estimable
0
8.4 years ago by
k. brand420
k. brand420 wrote:
Dear BioC, Using limma, when fitting the model: model.matrix(~Tissue * Pperiod + Time + Animal) I get this warning: > fit <- lmFit(rma.pp, design) Coefficients not estimable: Animal32 Animal33 Animal34 Animal35 Animal36 Animal37 Animal38 Animal39 Animal40 Animal41 Animal42 Animal43 Animal44 Animal45 Animal46 Animal47 Animal48 Warning message: Partial NA coefficients for 45101 probe(s) In addition, the reuslting number or DE genes for my contrasts of interest (which are different than the 'not estimable' ones listed in teh warning above) are mcuh lower than expected; & furthermore, the contrast-coefficents (log2FCs) and simply wrong. When fitting a similar model, merely lacking the 'pairing' factor ("Animal"): model.matrix(~Tissue * Pperiod + Time) I don't get this error. My question: Is it me? Or am i attempting the impossible, ie., by including a factor for pairing (Animal) trying to fit more factors than my measurements can support and this is limma's way of telling me? Raw script and targets file below. I really hope an experienced limma user can enlighten me on this, or point me to a resource suitable for a biologists level of understanding. Thanks in advance, Karl > targets <- read.delim("RNA_Targets.txt") > Tissue <- factor(targets$Tissue, levels = c("R", "C")) > Pperiod <- factor(targets$Pperiod, levels = c("E", "L", "S")) > Time <- factor(targets$Time, levels = c("1", "2", "3", "4", + "5", "6", "7", "8", + .... [TRUNCATED] > Animal <- factor(targets$Animal, levels = c("1", "2", "3", "4", + "5", "6", "7", "8", + .... [TRUNCATED] > design <- model.matrix(~Tissue * Pperiod + Time + Animal) > colnames(design) [1] "(Intercept)" "TissueC" "PperiodL" "PperiodS" "Time2" "Time3" "Time4" "Time5" "Time6" "Time7" [11] "Time8" "Time9" "Time10" "Time11" "Time12" "Time13" "Time14" "Time15" "Time16" "Animal2" [21] "Animal3" "Animal4" "Animal5" "Animal6" "Animal7" "Animal8" "Animal9" "Animal10" "Animal11" "Animal12" [31] "Animal13" "Animal14" "Animal15" "Animal16" "Animal17" "Animal18" "Animal19" "Animal20" "Animal21" "Animal22" [41] "Animal23" "Animal24" "Animal25" "Animal26" "Animal27" "Animal28" "Animal29" "Animal30" "Animal31" "Animal32" [51] "Animal33" "Animal34" "Animal35" "Animal36" "Animal37" "Animal38" "Animal39" "Animal40" "Animal41" "Animal42" [61] "Animal43" "Animal44" "Animal45" "Animal46" "Animal47" "Animal48" "TissueC:PperiodL" "TissueC:PperiodS" > source(.trPaths[5], echo=TRUE, max.deparse.length=150) > fit <- lmFit(rma.pp, design) Coefficients not estimable: Animal32 Animal33 Animal34 Animal35 Animal36 Animal37 Animal38 Animal39 Animal40 Animal41 Animal42 Animal43 Animal44 Animal45 Animal46 Animal47 Animal48 Warning message: Partial NA coefficients for 45101 probe(s) > FileName Tissue Pperiod Time Animal 01-PPL3-sample02.CEL R S 1 1 02-PPL3-sample03.CEL C S 1 1 03-PPL5-sample02.CEL R S 2 2 04-PPL5-sample03.CEL C S 2 2 05-PPL3-sample04.CEL R S 3 3 06-PPL3-sample05.CEL C S 3 3 07-PPL5-sample04.CEL R S 4 4 08-PPL5-sample05.CEL C S 4 4 09-PPL3-sample06.CEL R S 5 5 10-PPL3-sample07.CEL C S 5 5 11-PPL5-sample06.CEL R S 6 6 12-PPL5-sample07.CEL C S 6 6 13-PPL3-sample08.CEL R S 7 7 14-PPL3-sample09.CEL C S 7 7 15-PPL5-sample08.CEL R S 8 8 16-PPL5-sample09.CEL C S 8 8 17-PPL3-sample10.CEL R S 9 9 18-PPL3-sample11.CEL C S 9 9 19-PPL5-sample10.CEL R S 10 10 20-PPL5-sample11.CEL C S 10 10 21-PPL3-sample12.CEL R S 11 11 22-PPL3-sample13.CEL C S 11 11 23-PPL5-sample12.CEL R S 12 12 24-PPL5-sample13.CEL C S 12 12 25-PPL3-sample14.CEL R S 13 13 26-PPL3-sample15.CEL C S 13 13 27-PPL5-sample14.CEL R S 14 14 28-PPL5-sample15.CEL C S 14 14 29-PPL3-sample16.CEL R S 15 15 30-PPL3-sample17.CEL C S 15 15 31-PPL5-sample16.CEL R S 16 16 32-PPL5-sample17.CEL C S 16 16 33-PPL1-sample02.CEL R E 1 17 34-PPL1-sample03.CEL C E 1 17 35-PPL6-sample02.CEL R E 2 18 36-PPL6-sample03.CEL C E 2 18 37-PPL1-sample04.CEL R E 3 19 38-PPL1-sample05.CEL C E 3 19 39-PPL6-sample04.CEL R E 4 20 40-PPL6-sample05.CEL C E 4 20 41-PPL1-sample06.CEL R E 5 21 42-PPL1-sample07.CEL C E 5 21 43-PPL6-sample06.CEL R E 6 22 44-PPL6-sample07.CEL C E 6 22 45-PPL1-sample08.CEL R E 7 23 46-PPL1-sample09.CEL C E 7 23 47-PPL6-sample08.CEL R E 8 24 48-PPL6-sample09.CEL C E 8 24 49-PPL1-sample10.CEL R E 9 25 50-PPL1-sample11.CEL C E 9 25 51-PPL6-sample10.CEL R E 10 26 52-PPL6-sample11.CEL C E 10 26 53-PPL1-sample12.CEL R E 11 27 54-PPL1-sample13.CEL C E 11 27 55-PPL6-sample12.CEL R E 12 28 56-PPL6-sample13.CEL C E 12 28 57-PPL1-sample14.CEL R E 13 29 58-PPL1-sample15.CEL C E 13 29 59-PPL6-sample14.CEL R E 14 30 60-PPL6-sample15.CEL C E 14 30 61-PPL1-sample16.CEL R E 15 31 62-PPL1-sample17.CEL C E 15 31 63-PPL6-sample16.CEL R E 16 32 64-PPL6-sample17.CEL C E 16 32 65-PPL2-sample02.CEL R L 1 33 66-PPL2-sample03.CEL C L 1 33 67-PPL4-sample02.CEL R L 2 34 68-PPL4-sample03.CEL C L 2 34 69-PPL2-sample04.CEL R L 3 35 70-PPL2-sample05.CEL C L 3 35 71-PPL4-sample04.CEL R L 4 36 72-PPL4-sample05.CEL C L 4 36 73-PPL2-sample06.CEL R L 5 37 74-PPL2-sample07.CEL C L 5 37 75-PPL4-sample06.CEL R L 6 38 76-PPL4-sample07.CEL C L 6 38 77-PPL2-sample08.CEL R L 7 39 78-PPL2-sample09.CEL C L 7 39 79-PPL4-sample08.CEL R L 8 40 80-PPL4-sample09.CEL C L 8 40 81-PPL2-sample10.CEL R L 9 41 82-PPL2-sample11.CEL C L 9 41 83-PPL4-sample10.CEL R L 10 42 84-PPL4-sample11.CEL C L 10 42 85-PPL2-sample12.CEL R L 11 43 86-PPL2-sample13.CEL C L 11 43 87-PPL4-sample12.CEL R L 12 44 88-PPL4-sample13.CEL C L 12 44 89-PPL2-sample14.CEL R L 13 45 90-PPL2-sample15.CEL C L 13 45 91-PPL4-sample14.CEL R L 14 46 92-PPL4-sample15.CEL C L 14 46 93-PPL2-sample16.CEL R L 15 47 94-PPL2-sample17.CEL C L 15 47 95-PPL4-sample16.CEL R L 16 48 96-PPL4-sample17.CEL C L 16 48 -- Karl Brand <k.brand at="" erasmusmc.nl=""> Department of Genetics Erasmus MC Dr Molewaterplein 50 3015 GE Rotterdam lab +31 (0)10 704 3409 fax +31 (0)10 704 4743 mob +31 (0)642 777 268
modified 8.4 years ago by Gordon Smyth33k • written 8.4 years ago by k. brand420
0
8.4 years ago by
United States
James W. MacDonald46k wrote:
Hi Karl, Karl Brand wrote: > Dear BioC, > > Using limma, when fitting the model: > model.matrix(~Tissue * Pperiod + Time + Animal) > > I get this warning: > > fit <- lmFit(rma.pp, design) > Coefficients not estimable: Animal32 Animal33 Animal34 Animal35 Animal36 > Animal37 Animal38 Animal39 Animal40 Animal41 Animal42 Animal43 Animal44 > Animal45 Animal46 Animal47 Animal48 > Warning message: > Partial NA coefficients for 45101 probe(s) > > In addition, the reuslting number or DE genes for my contrasts of > interest (which are different than the 'not estimable' ones listed in > teh warning above) are mcuh lower than expected; & furthermore, the > contrast-coefficents (log2FCs) and simply wrong. > > When fitting a similar model, merely lacking the 'pairing' factor > ("Animal"): > model.matrix(~Tissue * Pperiod + Time) > > I don't get this error. My question: > > Is it me? Or am i attempting the impossible, ie., by including a factor > for pairing (Animal) trying to fit more factors than my measurements can > support and this is limma's way of telling me? Raw script and targets > file below. You may be attempting the impossible, or you may just be doing something incorrectly. You are certainly trying to estimate more parameters than you have data with which to do so. It looks like you have a fairly complex experimental design, so I would recommend finding a local statistician who can help you with the analysis. > > I really hope an experienced limma user can enlighten me on this, or > point me to a resource suitable for a biologists level of understanding. Pretty much any basic linear modeling textbook would be helpful. However, it looks like you might have a timecourse experiment with perhaps repeated measures, which may require a non-trivial analysis method. As a Biologist, you might have jumped into the deep end of the pool, so finding somebody local to help is not a bad idea. Best, Jim > > Thanks in advance, > > Karl > > > > targets <- read.delim("RNA_Targets.txt") > > > Tissue <- factor(targets$Tissue, levels = c("R", "C")) > > > Pperiod <- factor(targets$Pperiod, levels = c("E", "L", "S")) > > > Time <- factor(targets$Time, levels = c("1", "2", "3", "4", > + "5", "6", "7", "8", > + .... [TRUNCATED] > > > Animal <- factor(targets$Animal, levels = c("1", "2", "3", "4", > + "5", "6", "7", "8", > + .... [TRUNCATED] > > > design <- model.matrix(~Tissue * Pperiod + Time + Animal) > > > colnames(design) > [1] "(Intercept)" "TissueC" "PperiodL" "PperiodS" > "Time2" "Time3" "Time4" "Time5" > "Time6" "Time7" > [11] "Time8" "Time9" "Time10" "Time11" > "Time12" "Time13" "Time14" > "Time15" "Time16" "Animal2" > [21] "Animal3" "Animal4" "Animal5" "Animal6" > "Animal7" "Animal8" "Animal9" > "Animal10" "Animal11" "Animal12" > [31] "Animal13" "Animal14" "Animal15" "Animal16" > "Animal17" "Animal18" "Animal19" > "Animal20" "Animal21" "Animal22" > [41] "Animal23" "Animal24" "Animal25" "Animal26" > "Animal27" "Animal28" "Animal29" > "Animal30" "Animal31" "Animal32" > [51] "Animal33" "Animal34" "Animal35" "Animal36" > "Animal37" "Animal38" "Animal39" > "Animal40" "Animal41" "Animal42" > [61] "Animal43" "Animal44" "Animal45" "Animal46" > "Animal47" "Animal48" "TissueC:PperiodL" > "TissueC:PperiodS" > > source(.trPaths[5], echo=TRUE, max.deparse.length=150) > > > fit <- lmFit(rma.pp, design) > Coefficients not estimable: Animal32 Animal33 Animal34 Animal35 Animal36 > Animal37 Animal38 Animal39 Animal40 Animal41 Animal42 Animal43 Animal44 > Animal45 Animal46 Animal47 Animal48 > Warning message: > Partial NA coefficients for 45101 probe(s) > > > > > FileName Tissue Pperiod Time Animal > 01-PPL3-sample02.CEL R S 1 1 > 02-PPL3-sample03.CEL C S 1 1 > 03-PPL5-sample02.CEL R S 2 2 > 04-PPL5-sample03.CEL C S 2 2 > 05-PPL3-sample04.CEL R S 3 3 > 06-PPL3-sample05.CEL C S 3 3 > 07-PPL5-sample04.CEL R S 4 4 > 08-PPL5-sample05.CEL C S 4 4 > 09-PPL3-sample06.CEL R S 5 5 > 10-PPL3-sample07.CEL C S 5 5 > 11-PPL5-sample06.CEL R S 6 6 > 12-PPL5-sample07.CEL C S 6 6 > 13-PPL3-sample08.CEL R S 7 7 > 14-PPL3-sample09.CEL C S 7 7 > 15-PPL5-sample08.CEL R S 8 8 > 16-PPL5-sample09.CEL C S 8 8 > 17-PPL3-sample10.CEL R S 9 9 > 18-PPL3-sample11.CEL C S 9 9 > 19-PPL5-sample10.CEL R S 10 10 > 20-PPL5-sample11.CEL C S 10 10 > 21-PPL3-sample12.CEL R S 11 11 > 22-PPL3-sample13.CEL C S 11 11 > 23-PPL5-sample12.CEL R S 12 12 > 24-PPL5-sample13.CEL C S 12 12 > 25-PPL3-sample14.CEL R S 13 13 > 26-PPL3-sample15.CEL C S 13 13 > 27-PPL5-sample14.CEL R S 14 14 > 28-PPL5-sample15.CEL C S 14 14 > 29-PPL3-sample16.CEL R S 15 15 > 30-PPL3-sample17.CEL C S 15 15 > 31-PPL5-sample16.CEL R S 16 16 > 32-PPL5-sample17.CEL C S 16 16 > 33-PPL1-sample02.CEL R E 1 17 > 34-PPL1-sample03.CEL C E 1 17 > 35-PPL6-sample02.CEL R E 2 18 > 36-PPL6-sample03.CEL C E 2 18 > 37-PPL1-sample04.CEL R E 3 19 > 38-PPL1-sample05.CEL C E 3 19 > 39-PPL6-sample04.CEL R E 4 20 > 40-PPL6-sample05.CEL C E 4 20 > 41-PPL1-sample06.CEL R E 5 21 > 42-PPL1-sample07.CEL C E 5 21 > 43-PPL6-sample06.CEL R E 6 22 > 44-PPL6-sample07.CEL C E 6 22 > 45-PPL1-sample08.CEL R E 7 23 > 46-PPL1-sample09.CEL C E 7 23 > 47-PPL6-sample08.CEL R E 8 24 > 48-PPL6-sample09.CEL C E 8 24 > 49-PPL1-sample10.CEL R E 9 25 > 50-PPL1-sample11.CEL C E 9 25 > 51-PPL6-sample10.CEL R E 10 26 > 52-PPL6-sample11.CEL C E 10 26 > 53-PPL1-sample12.CEL R E 11 27 > 54-PPL1-sample13.CEL C E 11 27 > 55-PPL6-sample12.CEL R E 12 28 > 56-PPL6-sample13.CEL C E 12 28 > 57-PPL1-sample14.CEL R E 13 29 > 58-PPL1-sample15.CEL C E 13 29 > 59-PPL6-sample14.CEL R E 14 30 > 60-PPL6-sample15.CEL C E 14 30 > 61-PPL1-sample16.CEL R E 15 31 > 62-PPL1-sample17.CEL C E 15 31 > 63-PPL6-sample16.CEL R E 16 32 > 64-PPL6-sample17.CEL C E 16 32 > 65-PPL2-sample02.CEL R L 1 33 > 66-PPL2-sample03.CEL C L 1 33 > 67-PPL4-sample02.CEL R L 2 34 > 68-PPL4-sample03.CEL C L 2 34 > 69-PPL2-sample04.CEL R L 3 35 > 70-PPL2-sample05.CEL C L 3 35 > 71-PPL4-sample04.CEL R L 4 36 > 72-PPL4-sample05.CEL C L 4 36 > 73-PPL2-sample06.CEL R L 5 37 > 74-PPL2-sample07.CEL C L 5 37 > 75-PPL4-sample06.CEL R L 6 38 > 76-PPL4-sample07.CEL C L 6 38 > 77-PPL2-sample08.CEL R L 7 39 > 78-PPL2-sample09.CEL C L 7 39 > 79-PPL4-sample08.CEL R L 8 40 > 80-PPL4-sample09.CEL C L 8 40 > 81-PPL2-sample10.CEL R L 9 41 > 82-PPL2-sample11.CEL C L 9 41 > 83-PPL4-sample10.CEL R L 10 42 > 84-PPL4-sample11.CEL C L 10 42 > 85-PPL2-sample12.CEL R L 11 43 > 86-PPL2-sample13.CEL C L 11 43 > 87-PPL4-sample12.CEL R L 12 44 > 88-PPL4-sample13.CEL C L 12 44 > 89-PPL2-sample14.CEL R L 13 45 > 90-PPL2-sample15.CEL C L 13 45 > 91-PPL4-sample14.CEL R L 14 46 > 92-PPL4-sample15.CEL C L 14 46 > 93-PPL2-sample16.CEL R L 15 47 > 94-PPL2-sample17.CEL C L 15 47 > 95-PPL4-sample16.CEL R L 16 48 > 96-PPL4-sample17.CEL C L 16 48 > ********************************************************** Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues
Cheers Jim, On 2/10/2010 5:57 PM, James W. MacDonald wrote: > Hi Karl, > > Karl Brand wrote: >> Dear BioC, >> >> Using limma, when fitting the model: >> model.matrix(~Tissue * Pperiod + Time + Animal) >> >> I get this warning: >> > fit <- lmFit(rma.pp, design) >> Coefficients not estimable: Animal32 Animal33 Animal34 Animal35 >> Animal36 Animal37 Animal38 Animal39 Animal40 Animal41 Animal42 >> Animal43 Animal44 Animal45 Animal46 Animal47 Animal48 >> Warning message: >> Partial NA coefficients for 45101 probe(s) >> >> In addition, the reuslting number or DE genes for my contrasts of >> interest (which are different than the 'not estimable' ones listed in >> teh warning above) are mcuh lower than expected; & furthermore, the >> contrast-coefficents (log2FCs) and simply wrong. >> >> When fitting a similar model, merely lacking the 'pairing' factor >> ("Animal"): >> model.matrix(~Tissue * Pperiod + Time) >> >> I don't get this error. My question: >> >> Is it me? Or am i attempting the impossible, ie., by including a >> factor for pairing (Animal) trying to fit more factors than my >> measurements can support and this is limma's way of telling me? Raw >> script and targets file below. > > You may be attempting the impossible, or you may just be doing something > incorrectly. You are certainly trying to estimate more parameters than > you have data with which to do so. Right. This helps me alot, some confirmation of what i can and cant achieve with my data. > > It looks like you have a fairly complex experimental design, so I would > recommend finding a local statistician who can help you with the analysis. > >> >> I really hope an experienced limma user can enlighten me on this, or >> point me to a resource suitable for a biologists level of understanding. > > Pretty much any basic linear modeling textbook would be helpful. > However, it looks like you might have a timecourse experiment with > perhaps repeated measures, which may require a non-trivial analysis > method. As a Biologist, you might have jumped into the deep end of the > pool, so finding somebody local to help is not a bad idea. Unfortunately learning to swim has been faster....along with your (and several other non-local statisticians) 'flotation aids'. sincere thanks for your thoughts, Karl > > Best, > > Jim > > >> >> Thanks in advance, >> >> Karl >> >> >> > targets <- read.delim("RNA_Targets.txt") >> >> > Tissue <- factor(targets$Tissue, levels = c("R", "C")) >> >> > Pperiod <- factor(targets$Pperiod, levels = c("E", "L", "S")) >> >> > Time <- factor(targets$Time, levels = c("1", "2", "3", "4", >> + "5", "6", "7", "8", >> + .... [TRUNCATED] >> >> > Animal <- factor(targets$Animal, levels = c("1", "2", "3", "4", >> + "5", "6", "7", "8", >> + .... [TRUNCATED] >> >> > design <- model.matrix(~Tissue * Pperiod + Time + Animal) >> >> > colnames(design) >> [1] "(Intercept)" "TissueC" "PperiodL" "PperiodS" "Time2" "Time3" >> "Time4" "Time5" "Time6" "Time7" >> [11] "Time8" "Time9" "Time10" "Time11" "Time12" "Time13" "Time14" >> "Time15" "Time16" "Animal2" >> [21] "Animal3" "Animal4" "Animal5" "Animal6" "Animal7" "Animal8" >> "Animal9" "Animal10" "Animal11" "Animal12" >> [31] "Animal13" "Animal14" "Animal15" "Animal16" "Animal17" "Animal18" >> "Animal19" "Animal20" "Animal21" "Animal22" >> [41] "Animal23" "Animal24" "Animal25" "Animal26" "Animal27" "Animal28" >> "Animal29" "Animal30" "Animal31" "Animal32" >> [51] "Animal33" "Animal34" "Animal35" "Animal36" "Animal37" "Animal38" >> "Animal39" "Animal40" "Animal41" "Animal42" >> [61] "Animal43" "Animal44" "Animal45" "Animal46" "Animal47" "Animal48" >> "TissueC:PperiodL" "TissueC:PperiodS" >> > source(.trPaths[5], echo=TRUE, max.deparse.length=150) >> >> > fit <- lmFit(rma.pp, design) >> Coefficients not estimable: Animal32 Animal33 Animal34 Animal35 >> Animal36 Animal37 Animal38 Animal39 Animal40 Animal41 Animal42 >> Animal43 Animal44 Animal45 Animal46 Animal47 Animal48 >> Warning message: >> Partial NA coefficients for 45101 probe(s) >> > >> >> >> FileName Tissue Pperiod Time Animal >> 01-PPL3-sample02.CEL R S 1 1 >> 02-PPL3-sample03.CEL C S 1 1 >> 03-PPL5-sample02.CEL R S 2 2 >> 04-PPL5-sample03.CEL C S 2 2 >> 05-PPL3-sample04.CEL R S 3 3 >> 06-PPL3-sample05.CEL C S 3 3 >> 07-PPL5-sample04.CEL R S 4 4 >> 08-PPL5-sample05.CEL C S 4 4 >> 09-PPL3-sample06.CEL R S 5 5 >> 10-PPL3-sample07.CEL C S 5 5 >> 11-PPL5-sample06.CEL R S 6 6 >> 12-PPL5-sample07.CEL C S 6 6 >> 13-PPL3-sample08.CEL R S 7 7 >> 14-PPL3-sample09.CEL C S 7 7 >> 15-PPL5-sample08.CEL R S 8 8 >> 16-PPL5-sample09.CEL C S 8 8 >> 17-PPL3-sample10.CEL R S 9 9 >> 18-PPL3-sample11.CEL C S 9 9 >> 19-PPL5-sample10.CEL R S 10 10 >> 20-PPL5-sample11.CEL C S 10 10 >> 21-PPL3-sample12.CEL R S 11 11 >> 22-PPL3-sample13.CEL C S 11 11 >> 23-PPL5-sample12.CEL R S 12 12 >> 24-PPL5-sample13.CEL C S 12 12 >> 25-PPL3-sample14.CEL R S 13 13 >> 26-PPL3-sample15.CEL C S 13 13 >> 27-PPL5-sample14.CEL R S 14 14 >> 28-PPL5-sample15.CEL C S 14 14 >> 29-PPL3-sample16.CEL R S 15 15 >> 30-PPL3-sample17.CEL C S 15 15 >> 31-PPL5-sample16.CEL R S 16 16 >> 32-PPL5-sample17.CEL C S 16 16 >> 33-PPL1-sample02.CEL R E 1 17 >> 34-PPL1-sample03.CEL C E 1 17 >> 35-PPL6-sample02.CEL R E 2 18 >> 36-PPL6-sample03.CEL C E 2 18 >> 37-PPL1-sample04.CEL R E 3 19 >> 38-PPL1-sample05.CEL C E 3 19 >> 39-PPL6-sample04.CEL R E 4 20 >> 40-PPL6-sample05.CEL C E 4 20 >> 41-PPL1-sample06.CEL R E 5 21 >> 42-PPL1-sample07.CEL C E 5 21 >> 43-PPL6-sample06.CEL R E 6 22 >> 44-PPL6-sample07.CEL C E 6 22 >> 45-PPL1-sample08.CEL R E 7 23 >> 46-PPL1-sample09.CEL C E 7 23 >> 47-PPL6-sample08.CEL R E 8 24 >> 48-PPL6-sample09.CEL C E 8 24 >> 49-PPL1-sample10.CEL R E 9 25 >> 50-PPL1-sample11.CEL C E 9 25 >> 51-PPL6-sample10.CEL R E 10 26 >> 52-PPL6-sample11.CEL C E 10 26 >> 53-PPL1-sample12.CEL R E 11 27 >> 54-PPL1-sample13.CEL C E 11 27 >> 55-PPL6-sample12.CEL R E 12 28 >> 56-PPL6-sample13.CEL C E 12 28 >> 57-PPL1-sample14.CEL R E 13 29 >> 58-PPL1-sample15.CEL C E 13 29 >> 59-PPL6-sample14.CEL R E 14 30 >> 60-PPL6-sample15.CEL C E 14 30 >> 61-PPL1-sample16.CEL R E 15 31 >> 62-PPL1-sample17.CEL C E 15 31 >> 63-PPL6-sample16.CEL R E 16 32 >> 64-PPL6-sample17.CEL C E 16 32 >> 65-PPL2-sample02.CEL R L 1 33 >> 66-PPL2-sample03.CEL C L 1 33 >> 67-PPL4-sample02.CEL R L 2 34 >> 68-PPL4-sample03.CEL C L 2 34 >> 69-PPL2-sample04.CEL R L 3 35 >> 70-PPL2-sample05.CEL C L 3 35 >> 71-PPL4-sample04.CEL R L 4 36 >> 72-PPL4-sample05.CEL C L 4 36 >> 73-PPL2-sample06.CEL R L 5 37 >> 74-PPL2-sample07.CEL C L 5 37 >> 75-PPL4-sample06.CEL R L 6 38 >> 76-PPL4-sample07.CEL C L 6 38 >> 77-PPL2-sample08.CEL R L 7 39 >> 78-PPL2-sample09.CEL C L 7 39 >> 79-PPL4-sample08.CEL R L 8 40 >> 80-PPL4-sample09.CEL C L 8 40 >> 81-PPL2-sample10.CEL R L 9 41 >> 82-PPL2-sample11.CEL C L 9 41 >> 83-PPL4-sample10.CEL R L 10 42 >> 84-PPL4-sample11.CEL C L 10 42 >> 85-PPL2-sample12.CEL R L 11 43 >> 86-PPL2-sample13.CEL C L 11 43 >> 87-PPL4-sample12.CEL R L 12 44 >> 88-PPL4-sample13.CEL C L 12 44 >> 89-PPL2-sample14.CEL R L 13 45 >> 90-PPL2-sample15.CEL C L 13 45 >> 91-PPL4-sample14.CEL R L 14 46 >> 92-PPL4-sample15.CEL C L 14 46 >> 93-PPL2-sample16.CEL R L 15 47 >> 94-PPL2-sample17.CEL C L 15 47 >> 95-PPL4-sample16.CEL R L 16 48 >> 96-PPL4-sample17.CEL C L 16 48 >> > ********************************************************** > Electronic Mail is not secure, may not be read every day, and should not > be used for urgent or sensitive issues -- Karl Brand <k.brand at="" erasmusmc.nl=""> Department of Genetics Erasmus MC Dr Molewaterplein 50 3015 GE Rotterdam lab +31 (0)10 704 3409 fax +31 (0)10 704 4743 mob +31 (0)642 777 268
0
8.4 years ago by
Gordon Smyth33k
Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
Gordon Smyth33k wrote:

Dear Karl,

You are trying to do the impossible.

The essential thing to appreciate is that your experiment has two levels of variability, within and between animals.

You have two samples from each animal, tissues R and C. Therefore tissue is your within-animal factor.  Your other two factors, Rperiod and Time are between-animal factors.

If you include Animal as a term in your design matrix formula, then you are comparing treatments within animal, and you can only test within-animal hypotheses.  Therefore you cannot include Pperiod or Time in

You have two choices.  One is to do all your analysis within animal. Therefore you fit the model (~Tissue + Animal) and test only the tissue effect. This fully adjusts for any Period or Time effects, but does not
allow you to test for them.

If you wish to recover information about Period and Time from the between-animal error strata, then you need to treat Animal as a random effect. This you do by:

design <- model.matrix(~Tissue * Pperiod + Time)
dupfit <- duplicateCorrelation(rma.pp, design, ndups=1, block=Animal)
fit <- lmFit(rma.pp, design, correlation=dupfit$consensus, block=Animal) remembering to check that dupfit$consensus is positive.

James has given you good advice about consulting a good biostatistician. (There are plenty of questions that could be asked. Do you really need to adjust for a Time variable with 16 levels etc.)

Best wishes
Gordon

Gordon,

My limited understanding of duplicateCorrelation lead me to believe that it's purpose was to specify in a design, measurements/samples expected to be similar, if not highly similar to each other, as is the case with technical and biological replicates, and attribute the differences between such specified replicates being due to technical/biological 'noise' via a modeling system different to the linear ones used for DE gene identification. Do i understand correctly?

In our experiment, although Tissue "R" & "C" are expected to be more similar than different on the whole genome level;

BTW-
> dupfit\$consensus
[1] 0.237422

our interest is in the *differences* as is probably obvious from the model.

So im just trying to get my biologists understanding c.clear that applying duplicateCorrelation to the Animal blocking as you suggested, is at least a *worthwhile thing to try*. I guess you wouldn't provide the
example if you didn't think so, but right now is a good time for emphatic clarity.

Of course i did try, and the results are logical to me and thus worth using. In short, the no. of DE genes for the contrasts between factor=Pperiod changed little (<5%), whilst the contrasts between factor=Tissue for "R" & "C" changed alot- DE genes increased by ~20% by inclusion of duplicateCorrelation.

Again, sincere cheers,

Karl

ADD REPLYlink modified 6 weeks ago by Gordon Smyth33k • written 8.4 years ago by k. brand420

Dear Karl,

The block argument of duplicateCorrelation() gives you a way to allow for correlation between repeat arrays on the same animal. It is a direct extension of the usual linear model. You found the correlation to be about 0.24, which is typical for data in which each block is an independent organism.

Yes, I wouldn't have suggested it if I didn't think it was worth trying. Indeed, if you want to examine Tissue and Pperiod effects in the same analysis, you must do it like this.

Best wishes
Gordon

Content
Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.