LIMMA : design (1, 2, 3, 3 ) , I got EXCITING results, what could be the logic, since i have 2 replicates for 3rd group only ?
3
0
Entering edit mode
SAURIN ★ 1.1k
@saurin-799
Last seen 9.6 years ago
Hi BioC, I have 3 groups but I have only 2 replicates for last group. so, group 1 and 2 has only one Affy CEL file. I Did..LIMMA as below and I got some Exciting results: #---------------------------------- design <- model.matrix(~ -1+factor(c(1,2,3,3))); colnames(design) <- c("g1","g2","g3"); fit <- lmFit(myRMA,design); contrast.matrix <- makeContrasts(g1-g2,g1-g3,g2-g3,levels = design); fit2 <- contrasts.fit(fit,contrast.matrix); fit2 <- eBayes(fit2); results <- decideTests(fit2,adjust="fdr",p.value=0.01); myGenes <- geneNames(myRMA); i <- apply(results,c(1,2),all); a <- i[,1]; b <- i[,2]; c <- i[,3]; tempgenes1 <- myGenes[a]; tempgenes2 <- myGenes[b]; tempgenes3 <- myGenes[c]; tempall <- c(tempgenes1,tempgenes2,tempgenes3); myDEGenes <- tempall; esetSub2X <- MatrixRMA[myDEGenes,]; esetSub2 <- new("exprSet",exprs = esetSub2X); pData(esetSub2) <- pData(myRMA); heatmap(esetSub2X); #---------------------------------- I got EXCITING results, what could be the logic,since i have 2 replicates for 3rd group only ? Could anyone point me out ? I highly appreciate your help , Thank you in advance. Thank you, Saurin
affy affy • 1.2k views
ADD COMMENT
0
Entering edit mode
@adaikalavan-ramasamy-675
Last seen 9.6 years ago
PLEASE correct me if I am wrong. You have a total of 4 samples that could be classified into one of 3 groups ? How do you plan on distinguishing biological from technical variation ? Shouldn't limma come with some sort of warning or error if there are only one sample per group ? Regards, Adai On Tue, 2005-04-26 at 10:01 -0700, Saurin Jani wrote: > Hi BioC, > > I have 3 groups but I have only 2 replicates for last > group. so, group 1 and 2 has only one Affy CEL file. I > Did..LIMMA as below and I got some Exciting results: > > #---------------------------------- > design <- model.matrix(~ -1+factor(c(1,2,3,3))); > colnames(design) <- c("g1","g2","g3"); > fit <- lmFit(myRMA,design); > > contrast.matrix <- > makeContrasts(g1-g2,g1-g3,g2-g3,levels = design); > > fit2 <- contrasts.fit(fit,contrast.matrix); > fit2 <- eBayes(fit2); > > results <- > decideTests(fit2,adjust="fdr",p.value=0.01); > > myGenes <- geneNames(myRMA); > i <- apply(results,c(1,2),all); > > a <- i[,1]; > b <- i[,2]; > c <- i[,3]; > tempgenes1 <- myGenes[a]; > tempgenes2 <- myGenes[b]; > tempgenes3 <- myGenes[c]; > > tempall <- c(tempgenes1,tempgenes2,tempgenes3); > myDEGenes <- tempall; > > esetSub2X <- MatrixRMA[myDEGenes,]; > esetSub2 <- new("exprSet",exprs = esetSub2X); > pData(esetSub2) <- pData(myRMA); > heatmap(esetSub2X); > #---------------------------------- > > I got EXCITING results, what could be the logic,since > i have 2 replicates for 3rd group only ? > > Could anyone point me out ? > > I highly appreciate your help , Thank you in advance. > > Thank you, > Saurin > > _______________________________________________ > Bioconductor mailing list > Bioconductor@stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/bioconductor >
ADD COMMENT
0
Entering edit mode
SAURIN ★ 1.1k
@saurin-799
Last seen 9.6 years ago
Hi Adai, Yes, you are right. I have 4 samples : Group1 = Growth Effect for Day 1 : 1 Affy GeneChip. Group2 = Growth Effect for Day 2 : 1 Affy GeneChip. Group3 = Growth Effect for Day 3 : 2 Affy GeneChips. so, my design matrix is: design <- model.matrix(~ -1+factor(c(1,2,3,3))); LIMMA did not give any error or waring even it has 1 sample per group...! ( I thought similar thing, since it needs technical replicates per group to make a decision). The results are very interesting. I have many genes for 0.01 FDR, which is very good. Somehow,I don't understand the logic. Do you think is this a valid design? Or You think I should go by Fold Change Logic. Please, let me know. Thank you very much, Saurin --- Adaikalavan Ramasamy <ramasamy@cancer.org.uk> wrote: > PLEASE correct me if I am wrong. > > You have a total of 4 samples that could be > classified into one of 3 > groups ? How do you plan on distinguishing > biological from technical > variation ? Shouldn't limma come with some sort of > warning or error if > there are only one sample per group ? > > Regards, Adai > > > > On Tue, 2005-04-26 at 10:01 -0700, Saurin Jani > wrote: > > Hi BioC, > > > > I have 3 groups but I have only 2 replicates for > last > > group. so, group 1 and 2 has only one Affy CEL > file. I > > Did..LIMMA as below and I got some Exciting > results: > > > > #---------------------------------- > > design <- model.matrix(~ -1+factor(c(1,2,3,3))); > > colnames(design) <- c("g1","g2","g3"); > > fit <- lmFit(myRMA,design); > > > > contrast.matrix <- > > makeContrasts(g1-g2,g1-g3,g2-g3,levels = design); > > > > fit2 <- contrasts.fit(fit,contrast.matrix); > > fit2 <- eBayes(fit2); > > > > results <- > > decideTests(fit2,adjust="fdr",p.value=0.01); > > > > myGenes <- geneNames(myRMA); > > i <- apply(results,c(1,2),all); > > > > a <- i[,1]; > > b <- i[,2]; > > c <- i[,3]; > > tempgenes1 <- myGenes[a]; > > tempgenes2 <- myGenes[b]; > > tempgenes3 <- myGenes[c]; > > > > tempall <- c(tempgenes1,tempgenes2,tempgenes3); > > myDEGenes <- tempall; > > > > esetSub2X <- MatrixRMA[myDEGenes,]; > > esetSub2 <- new("exprSet",exprs = esetSub2X); > > pData(esetSub2) <- pData(myRMA); > > heatmap(esetSub2X); > > #---------------------------------- > > > > I got EXCITING results, what could be the > logic,since > > i have 2 replicates for 3rd group only ? > > > > Could anyone point me out ? > > > > I highly appreciate your help , Thank you in > advance. > > > > Thank you, > > Saurin > > > > _______________________________________________ > > Bioconductor mailing list > > Bioconductor@stat.math.ethz.ch > > https://stat.ethz.ch/mailman/listinfo/bioconductor > > > >
ADD COMMENT
0
Entering edit mode
Significance should be based on biological replication. If the 2 chips for group 3 are technical replicates, then the variance estimate for the test is probably too small. In theory, statistical tests need only 2 replicate in a single condition, as the null distribution accounts for the number of replicates. However, for this theory to hold, the normality of the samples must be pretty good. When the data are exactly normally distributed (and the assumptions for limma for the distribution of variance hold) then the FDR values should be pretty good, but the FNR will be poor (as you have no power). However, I don't think anyone believes that microarray data are normally distributed. So, I would not really trust these results, even if you have a biological replicate. Of course the 2-fold rule is even worse, so really you should do more biological replication. --Naomi At 09:51 PM 4/26/2005, Saurin Jani wrote: >Hi Adai, > >Yes, you are right. I have 4 samples : > >Group1 = Growth Effect for Day 1 : 1 Affy GeneChip. >Group2 = Growth Effect for Day 2 : 1 Affy GeneChip. >Group3 = Growth Effect for Day 3 : 2 Affy GeneChips. > >so, my design matrix is: >design <- model.matrix(~ -1+factor(c(1,2,3,3))); > >LIMMA did not give any error or waring even it has 1 >sample per group...! ( I thought similar thing, since >it needs technical replicates per group to make a >decision). The results are very interesting. I have >many genes for 0.01 FDR, which is very good. > >Somehow,I don't understand the logic. Do you think is >this a valid design? Or You think I should go by Fold >Change Logic. Please, let me know. > >Thank you very much, >Saurin > > > > > >--- Adaikalavan Ramasamy <ramasamy@cancer.org.uk> >wrote: > > PLEASE correct me if I am wrong. > > > > You have a total of 4 samples that could be > > classified into one of 3 > > groups ? How do you plan on distinguishing > > biological from technical > > variation ? Shouldn't limma come with some sort of > > warning or error if > > there are only one sample per group ? > > > > Regards, Adai > > > > > > > > On Tue, 2005-04-26 at 10:01 -0700, Saurin Jani > > wrote: > > > Hi BioC, > > > > > > I have 3 groups but I have only 2 replicates for > > last > > > group. so, group 1 and 2 has only one Affy CEL > > file. I > > > Did..LIMMA as below and I got some Exciting > > results: > > > > > > #---------------------------------- > > > design <- model.matrix(~ -1+factor(c(1,2,3,3))); > > > colnames(design) <- c("g1","g2","g3"); > > > fit <- lmFit(myRMA,design); > > > > > > contrast.matrix <- > > > makeContrasts(g1-g2,g1-g3,g2-g3,levels = design); > > > > > > fit2 <- contrasts.fit(fit,contrast.matrix); > > > fit2 <- eBayes(fit2); > > > > > > results <- > > > decideTests(fit2,adjust="fdr",p.value=0.01); > > > > > > myGenes <- geneNames(myRMA); > > > i <- apply(results,c(1,2),all); > > > > > > a <- i[,1]; > > > b <- i[,2]; > > > c <- i[,3]; > > > tempgenes1 <- myGenes[a]; > > > tempgenes2 <- myGenes[b]; > > > tempgenes3 <- myGenes[c]; > > > > > > tempall <- c(tempgenes1,tempgenes2,tempgenes3); > > > myDEGenes <- tempall; > > > > > > esetSub2X <- MatrixRMA[myDEGenes,]; > > > esetSub2 <- new("exprSet",exprs = esetSub2X); > > > pData(esetSub2) <- pData(myRMA); > > > heatmap(esetSub2X); > > > #---------------------------------- > > > > > > I got EXCITING results, what could be the > > logic,since > > > i have 2 replicates for 3rd group only ? > > > > > > Could anyone point me out ? > > > > > > I highly appreciate your help , Thank you in > > advance. > > > > > > Thank you, > > > Saurin > > > > > > _______________________________________________ > > > Bioconductor mailing list > > > Bioconductor@stat.math.ethz.ch > > > https://stat.ethz.ch/mailman/listinfo/bioconductor > > > > > > > > >_______________________________________________ >Bioconductor mailing list >Bioconductor@stat.math.ethz.ch >https://stat.ethz.ch/mailman/listinfo/bioconductor Naomi S. Altman 814-865-3791 (voice) Associate Professor Bioinformatics Consulting Center Dept. of Statistics 814-863-7114 (fax) Penn State University 814-865-1348 (Statistics) University Park, PA 16802-2111
ADD REPLY
0
Entering edit mode
Saurin, I agree with Naomi. Increase your sample size. You cannot rely on the results from your current design as the ONE sample per group may not be representative of the population you want to study. Moreover, others would not trust your result. Should a warning be added to LIMMA when users attempt to use single array per group ? Regards, Adai On Tue, 2005-04-26 at 22:55 -0400, Naomi Altman wrote: > Significance should be based on biological replication. If the 2 chips for > group 3 are technical replicates, then the variance estimate for the test > is probably too small. > In theory, statistical tests need only 2 replicate in a single condition, > as the null distribution accounts for the number of replicates. However, > for this theory to hold, the normality of the samples must be pretty > good. When the data are exactly normally distributed (and the assumptions > for limma for the distribution of variance hold) then the FDR values should > be pretty good, but the FNR will be poor (as you have no power). > > However, I don't think anyone believes that microarray data are normally > distributed. So, I would not really trust these results, even if you have > a biological replicate. Of course the 2-fold rule is even worse, so really > you should do more biological replication. > > --Naomi > > At 09:51 PM 4/26/2005, Saurin Jani wrote: > >Hi Adai, > > > >Yes, you are right. I have 4 samples : > > > >Group1 = Growth Effect for Day 1 : 1 Affy GeneChip. > >Group2 = Growth Effect for Day 2 : 1 Affy GeneChip. > >Group3 = Growth Effect for Day 3 : 2 Affy GeneChips. > > > >so, my design matrix is: > >design <- model.matrix(~ -1+factor(c(1,2,3,3))); > > > >LIMMA did not give any error or waring even it has 1 > >sample per group...! ( I thought similar thing, since > >it needs technical replicates per group to make a > >decision). The results are very interesting. I have > >many genes for 0.01 FDR, which is very good. > > > >Somehow,I don't understand the logic. Do you think is > >this a valid design? Or You think I should go by Fold > >Change Logic. Please, let me know. > > > >Thank you very much, > >Saurin > > > > > > > > > > > >--- Adaikalavan Ramasamy <ramasamy@cancer.org.uk> > >wrote: > > > PLEASE correct me if I am wrong. > > > > > > You have a total of 4 samples that could be > > > classified into one of 3 > > > groups ? How do you plan on distinguishing > > > biological from technical > > > variation ? Shouldn't limma come with some sort of > > > warning or error if > > > there are only one sample per group ? > > > > > > Regards, Adai > > > > > > > > > > > > On Tue, 2005-04-26 at 10:01 -0700, Saurin Jani > > > wrote: > > > > Hi BioC, > > > > > > > > I have 3 groups but I have only 2 replicates for > > > last > > > > group. so, group 1 and 2 has only one Affy CEL > > > file. I > > > > Did..LIMMA as below and I got some Exciting > > > results: > > > > > > > > #---------------------------------- > > > > design <- model.matrix(~ -1+factor(c(1,2,3,3))); > > > > colnames(design) <- c("g1","g2","g3"); > > > > fit <- lmFit(myRMA,design); > > > > > > > > contrast.matrix <- > > > > makeContrasts(g1-g2,g1-g3,g2-g3,levels = design); > > > > > > > > fit2 <- contrasts.fit(fit,contrast.matrix); > > > > fit2 <- eBayes(fit2); > > > > > > > > results <- > > > > decideTests(fit2,adjust="fdr",p.value=0.01); > > > > > > > > myGenes <- geneNames(myRMA); > > > > i <- apply(results,c(1,2),all); > > > > > > > > a <- i[,1]; > > > > b <- i[,2]; > > > > c <- i[,3]; > > > > tempgenes1 <- myGenes[a]; > > > > tempgenes2 <- myGenes[b]; > > > > tempgenes3 <- myGenes[c]; > > > > > > > > tempall <- c(tempgenes1,tempgenes2,tempgenes3); > > > > myDEGenes <- tempall; > > > > > > > > esetSub2X <- MatrixRMA[myDEGenes,]; > > > > esetSub2 <- new("exprSet",exprs = esetSub2X); > > > > pData(esetSub2) <- pData(myRMA); > > > > heatmap(esetSub2X); > > > > #---------------------------------- > > > > > > > > I got EXCITING results, what could be the > > > logic,since > > > > i have 2 replicates for 3rd group only ? > > > > > > > > Could anyone point me out ? > > > > > > > > I highly appreciate your help , Thank you in > > > advance. > > > > > > > > Thank you, > > > > Saurin > > > > > > > > _______________________________________________ > > > > Bioconductor mailing list > > > > Bioconductor@stat.math.ethz.ch > > > > https://stat.ethz.ch/mailman/listinfo/bioconductor > > > > > > > > > > > > > >_______________________________________________ > >Bioconductor mailing list > >Bioconductor@stat.math.ethz.ch > >https://stat.ethz.ch/mailman/listinfo/bioconductor > > Naomi S. Altman 814-865-3791 (voice) > Associate Professor > Bioinformatics Consulting Center > Dept. of Statistics 814-863-7114 (fax) > Penn State University 814-865-1348 (Statistics) > University Park, PA 16802-2111 > >
ADD REPLY
0
Entering edit mode
On Apr 26, 2005, at 9:51 PM, Saurin Jani wrote: > Hi Adai, > > Yes, you are right. I have 4 samples : > > Group1 = Growth Effect for Day 1 : 1 Affy GeneChip. > Group2 = Growth Effect for Day 2 : 1 Affy GeneChip. > Group3 = Growth Effect for Day 3 : 2 Affy GeneChips. > > so, my design matrix is: > design <- model.matrix(~ -1+factor(c(1,2,3,3))); > > LIMMA did not give any error or waring even it has 1 > sample per group...! ( I thought similar thing, since > it needs technical replicates per group to make a > decision). The results are very interesting. I have > many genes for 0.01 FDR, which is very good. > > Somehow,I don't understand the logic. Do you think is > this a valid design? Or You think I should go by Fold > Change Logic. Please, let me know. Limma can and does use a "pooled" variance estimate, so the estimate of variance used here, though not "within-gene", is probably not too far off (i.e., you can have an estimate of the variance with only one array in a group). Without replicates, that estimate is certainly subject to more error than with replicates. However, fold-change is probably even one more step away from any statistical footing, as it includes NO estimate of variance and probably offers little or no advantage. That said, high fold-changes and low p-values are probably more likely than low fold-changes and higher p-values to be real, but I don't think that either measure is likely to be very robust. Arguments can proceed down the statistical road, but obviously the best (only) option that will produce statistically meaningful results would be to do more arrays. Short of that, I think the fold-change and limma are likely in practice to produce similar ordered lists that could, at best (and with the knowledge that the order is to be taken with a large grain of salt), be used to guide biologic validation or further experiments. Sean
ADD REPLY
0
Entering edit mode
@gordon-smyth
Last seen 28 minutes ago
WEHI, Melbourne, Australia

Just to elaborate slightly on Sean's response, the idea of the empirical Bayes "pooling" strategy used in limma is that you don't need to choose between using a fold change strategy or using t-tests. Rather the software moves you on sliding scale between these two strategies depending on how much information there is about the variances in the data and how different the variances seem to be. In your situation, with only 1 df for error, the limma rankings will usually be much closer to fold-change ranking than to a t-test ranking. Even here the moderated t-statistic approach is usually still preferable over ranking on fold change because genes for which the available replicates disagree will get down-weighted.

Gordon

ADD COMMENT

Login before adding your answer.

Traffic: 842 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6