I'm another in agreement that you probably need to increase your
biological replication of microarray slides. Any microarray facility
has a certain experimental sensitivity. If the amount of information
you have is low, then the distribution of the observed proportion of
false discoveries under simulation can become highly variable as
pi_0, the proportion of truly equivalently expressed genes, tends to
1 (depending on your cutoff alpha level). Remember that FDR control
is a statement about the expectation over repeated experiments, not a
guarantee for any single realization of an experiment. So your list
of differentially expressed genes could contain many (or few) false
discoveries, but in the long run should maintain control at
~pi_0*alpha under nonadaptive FDR control.
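As a rough illustration of that variability, here is a minimal base R
sketch that simulates independent z-tests and applies Benjamini-Hochberg
(nonadaptive) adjustment via p.adjust(). The number of genes, pi_0, the
effect size and the number of simulations are all made-up values chosen
for illustration, not estimates from Saurin's data.

```r
# Sketch: spread of the realized false discovery proportion (FDP) under
# Benjamini-Hochberg (nonadaptive) FDR control. All settings are assumptions.
set.seed(1)
m      <- 2000   # number of genes (hypothetical)
pi0    <- 0.95   # proportion of truly non-DE genes (hypothetical)
alpha  <- 0.01   # FDR cutoff, as used in this thread
effect <- 3      # mean shift of the z-statistic for DE genes (hypothetical)
n_sim  <- 500    # number of simulated experiments

m0      <- round(pi0 * m)
is_null <- c(rep(TRUE, m0), rep(FALSE, m - m0))

fdp <- replicate(n_sim, {
  z   <- rnorm(m, mean = ifelse(is_null, 0, effect))
  p   <- 2 * pnorm(-abs(z))                    # two-sided p-values
  hit <- p.adjust(p, method = "BH") <= alpha   # genes called DE
  # realized proportion of false discoveries (0 if nothing is called)
  if (any(hit)) sum(hit & is_null) / sum(hit) else 0
})

mean(fdp)                          # long-run average, to compare with pi0 * alpha
quantile(fdp, c(0.05, 0.5, 0.95))  # spread across single realizations
```

The average should sit near pi0 * alpha, while the quantiles show how far
a single experiment can land from that average.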
Saurin, you mention that you have a large list of differentially
expressed genes using a nonadaptive FDR cutoff of alpha=0.01. As an
alternative approach you could analyze some of the chip comparisons of
interest using EBarrays (after checking distribution assumptions),
which models the entire exprSet using hierarchical Gamma Gamma
Bernoulli or Log Normal Normal Bernoulli models, modelling the null
and alternative distributions as a mixture with unknown mixing
proportion p. The advantage of EBarrays is that it can analyse a
single cDNA two channel array or two Affymetrix single channel
arrays, providing reasonable estimates of pi_0. The same problem of
low sensitivity without biological replication still applies, though
(but it's better than fold change at least). You could see *if*
EBarrays predicts a value of pi_0 which is moderately less than 1 for
comparisons of interest.
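To make the "mixture with unknown mixing proportion p" idea concrete,
here is a toy base R sketch. It is NOT the EBarrays implementation and
does not use the package: it simulates log-ratios from two zero-mean
normal components whose standard deviations are assumed known, and
estimates only the mixing proportion by EM. EBarrays fits richer
hierarchical models with more parameters. All names and numbers below
are hypothetical.

```r
# Toy two-component mixture (not EBarrays): estimate the mixing proportion
# p = P(gene is DE) by EM, with both component densities assumed known.
set.seed(2)
m      <- 5000   # number of genes (hypothetical)
p_true <- 0.10   # true proportion of DE genes (hypothetical)
sd0    <- 0.25   # sd of log-ratios for non-DE genes (assumed known)
sd1    <- 1.00   # sd of log-ratios for DE genes (assumed known)

is_de <- runif(m) < p_true
x     <- rnorm(m, mean = 0, sd = ifelse(is_de, sd1, sd0))

f0 <- dnorm(x, 0, sd0)   # null density at each observed log-ratio
f1 <- dnorm(x, 0, sd1)   # alternative density at each observed log-ratio

p_hat <- 0.5             # starting value for the mixing proportion
for (iter in 1:200) {
  post  <- p_hat * f1 / (p_hat * f1 + (1 - p_hat) * f0)  # E-step: P(DE | x)
  p_new <- mean(post)                                    # M-step
  delta <- abs(p_new - p_hat)
  p_hat <- p_new
  if (delta < 1e-8) break
}

p_hat                                 # estimated proportion of DE genes
1 - p_hat                             # corresponding estimate of pi_0
head(order(post, decreasing = TRUE))  # genes with highest posterior P(DE)
```

An estimate of 1 - p_hat very close to 1 would suggest there is little
evidence of differential expression in that comparison.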
Marcus
>>> Naomi Altman <naomi@stat.psu.edu> 27/04/2005 4:55:11 a.m. >>>
Significance should be based on biological replication. If the 2 chips
for group 3 are technical replicates, then the variance estimate for
the test is probably too small.
In theory, statistical tests need only 2 replicates in a single
condition, as the null distribution accounts for the number of
replicates. However, for this theory to hold, the normality of the
samples must be pretty good. When the data are exactly normally
distributed (and the assumptions for limma for the distribution of
variance hold) then the FDR values should be pretty good, but the FNR
will be poor (as you have no power).
However, I don't think anyone believes that microarray data are
normally distributed. So, I would not really trust these results, even
if you have a biological replicate. Of course the 2-fold rule is even
worse, so really you should do more biological replication.
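To put a number on "no power", the base R sketch below counts the
residual degrees of freedom for the 1, 2, 3, 3 design from this thread
and compares critical values of an ordinary t-statistic at a few small
degrees of freedom. Variable names are illustrative, and the comparison
deliberately ignores limma's empirical Bayes moderation, which adds
prior degrees of freedom estimated across genes.

```r
# Residual degrees of freedom for the design discussed in this thread:
# 4 chips assigned to groups 1, 2, 3, 3.
group  <- factor(c(1, 2, 3, 3))
design <- model.matrix(~ -1 + group)

# chips minus fitted group means; only the group 3 pair informs the variance
df_resid <- nrow(design) - qr(design)$rank
df_resid          # 1

# Two-sided 5% critical values of an ordinary (unmoderated) t-statistic
t_crit <- qt(0.975, df = c(df_resid, 2, 4, 10))
round(t_crit, 2)  # 12.71 4.30 2.78 2.23
```

With a single residual degree of freedom the per-gene t-statistic has
to be very large before it counts as evidence, which is why the result
leans so heavily on the distributional assumptions.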
--Naomi
At 09:51 PM 4/26/2005, Saurin Jani wrote:
>Hi Adai,
>
>Yes, you are right. I have 4 samples :
>
>Group1 = Growth Effect for Day 1 : 1 Affy GeneChip.
>Group2 = Growth Effect for Day 2 : 1 Affy GeneChip.
>Group3 = Growth Effect for Day 3 : 2 Affy GeneChips.
>
>so, my design matrix is:
>design <- model.matrix(~ -1+factor(c(1,2,3,3)));
>
>LIMMA did not give any error or warning even though it has 1
>sample per group...! (I thought something similar, since
>it needs technical replicates per group to make a
>decision.) The results are very interesting. I have
>many genes at 0.01 FDR, which is very good.
>
>Somehow, I don't understand the logic. Do you think
>this is a valid design? Or do you think I should go by
>fold-change logic? Please let me know.
>
>Thank you very much,
>Saurin
>
>
>
>
>
>--- Adaikalavan Ramasamy <ramasamy@cancer.org.uk>
>wrote:
> > PLEASE correct me if I am wrong.
> >
> > You have a total of 4 samples that could be classified into one
> > of 3 groups? How do you plan on distinguishing biological from
> > technical variation? Shouldn't limma come with some sort of
> > warning or error if there is only one sample per group?
> >
> > Regards, Adai
> >
> >
> >
> > On Tue, 2005-04-26 at 10:01 -0700, Saurin Jani wrote:
> > > Hi BioC,
> > >
> > > I have 3 groups, but only the last group has 2 replicates;
> > > groups 1 and 2 have only one Affy CEL file each. I ran LIMMA
> > > as below and got some exciting results:
> > >
> > > #----------------------------------
> > > design <- model.matrix(~ -1+factor(c(1,2,3,3)));
> > > colnames(design) <- c("g1","g2","g3");
> > > fit <- lmFit(myRMA,design);
> > >
> > > contrast.matrix <-
> > > makeContrasts(g1-g2,g1-g3,g2-g3,levels = design);
> > >
> > > fit2 <- contrasts.fit(fit,contrast.matrix);
> > > fit2 <- eBayes(fit2);
> > >
> > > results <-
> > > decideTests(fit2,adjust="fdr",p.value=0.01);
> > >
> > > myGenes <- geneNames(myRMA);
> > > # decideTests codes each gene as -1, 0 or 1; nonzero = significant
> > > i <- results != 0;
> > >
> > > tempgenes1 <- myGenes[i[,1]];
> > > tempgenes2 <- myGenes[i[,2]];
> > > tempgenes3 <- myGenes[i[,3]];
> > >
> > > # unique() so a gene significant in several contrasts is kept once
> > > myDEGenes <- unique(c(tempgenes1,tempgenes2,tempgenes3));
> > >
> > > esetSub2X <- MatrixRMA[myDEGenes,];
> > > esetSub2 <- new("exprSet",exprs = esetSub2X);
> > > pData(esetSub2) <- pData(myRMA);
> > > heatmap(esetSub2X);
> > > #----------------------------------
> > >
> > > I got EXCITING results; what could be the logic, since
> > > I have 2 replicates for the 3rd group only?
> > >
> > > Could anyone point it out to me?
> > >
> > > I highly appreciate your help. Thank you in advance.
> > >
> > > Thank you,
> > > Saurin
> > >
