Hello everyone,
the experiment that I have to analyse is very simple:
I want to find significant genes between 2 conditions, A and B, but I
have only a few experiments, so I have to combine reference-versus-condition
arrays (ref versus A or B) with a dye-swap pair (A versus B and B versus A).
So targets is
SlideNumber  Cy3  Cy5
array1       ref  A
array2       ref  B
array3       ref  B
array4       ref  B
array5       A    B
array6       B    A
Of course, array5 and array6 are the dye-swap pair.
To set up the analysis, I followed the LIMMA user guide (by Gordon
Smyth), Chapter 14.5, Weaver Mutant Data.
So:
> design <- modelMatrix(targets, ref = "ref")
Found unique target names:
 B A ref
> design
        A  B
array1  0  1
array2  1  0
array3  1  0
array4  1  0
array5 -1  1
array6  1 -1
> fit <- lmFit(MA, design)
> cont.matrix <- makeContrasts(A.B=A-B, levels=design, weight=MA$weights)
> fit2 <- contrasts.fit(fit, cont.matrix)
> fit2 <- eBayes(fit2)
> topTable(fit2, adjust.method="fdr")
....omissis...
              M        A         t   P.Value         B
209    3.801460 6.538782  8.315672 1.0000000 -4.209468
2328   1.184194 7.343676  6.717978 1.0000000 -4.228492
7877   1.904360 6.504330  6.114349 1.0000000 -4.239110
27187 -4.075949 3.771499 -5.783558 1.0000000 -4.246099
3709   3.434542 3.467492  5.639159 1.0000000 -4.249459
7561   2.002753 5.159913  5.616194 1.0000000 -4.250013
7130   2.580527 3.863867  5.600047 1.0000000 -4.250405
19983 -2.117624 6.836539 -5.567882 1.0000000 -4.251194
So all genes have P.Value equal to 1!
In previous posts I read that this happens when a multivariate test has
to be considered, which I don't know how to handle. Anyway:
1) Am I doing something wrong in the design?
2) Am I doing something wrong in the subsequent evaluation steps?
Any ideas?
Thank you all,
Silvano
Dr.Silvano Piazza
LNCIB,
Area Science Park,
Padriciano 99
Trieste, ITALY
Tel. +39040398992
Fax +39040398990
Dear Silvano,
As I have indicated elsewhere on this list, the "p-values" reported by
topTable are actually "q-values". Hence, if you have fewer "significant"
genes than expected by chance under the null hypothesis, the reported
p-value is 1.0.
e.g. Suppose you have 1000 genes. Then if the number of genes significant
at level alpha is less than 1000*alpha for each alpha, your topTable
p-value will be 1.0 (i.e. all of the significant genes are estimated to be
false positives).
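The effect described above can be reproduced in base R. This is only a sketch under stated assumptions: the 1000 p-values are made up so that they sit exactly where the null hypothesis would put them (p = i/1000), and `p.adjust()` with `method = "BH"` stands in for the "fdr" adjustment requested in the topTable call ("fdr" is an alias for "BH" in `p.adjust()`).

```r
# Hypothetical p-values for 1000 genes, spaced exactly as expected under the null
n <- 1000
p <- (1:n) / n

# Benjamini-Hochberg adjustment ("fdr" is an alias for "BH" in p.adjust)
adj <- p.adjust(p, method = "BH")

# Smallest raw p-value compared with the range of the adjusted values
min(p)
range(adj)

# Genes with raw p <= 0.05 compared with the number expected by chance
sum(p <= 0.05)
n * 0.05
```

With these made-up values the count of small raw p-values never exceeds what chance alone predicts, so every adjusted value comes out at 1 (up to floating-point rounding), even though the smallest raw p-value is 0.001.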
Your experiment design is needlessly complex and also wasteful. If you
have only 2 conditions, you should do one of the following:
1) hybridize both conditions to every array (in dye-swap pairs) with no
   technical replicates (This is most efficient)
2) use a reference design with the reference sample always in the same
   channel. (This is simplest, but has 1/2 the efficiency.)
Mixing these 2 designs, especially with a mix of biological and technical
replicates needlessly complicates your analysis. It also requires a mixed
model ANOVA to take into account the different levels of replication.
--Naomi
At 10:28 AM 3/25/2005, Silvano Piazza wrote:
>[...]
Naomi S. Altman                          814-865-3791 (voice)
Associate Professor
Bioinformatics Consulting Center
Dept. of Statistics                      814-863-7114 (fax)
Penn State University                    814-865-1348 (Statistics)
University Park, PA 16802-2111
Dear Naomi
First of all, thank you very much for your answer.
>
> As I have indicated elsewhere on this list, the "p-values" reported by
> TopTable are actually "q-values". Hence, if you have fewer
> "significant" genes than expected by chance under the null hypothesis,
> the reported p-value is 1.0.
>
> e.g. Suppose you have 1000 genes. Then if the number of genes
> significant at alpha% is less than 1000*alpha for each alpha, your
> TopTable p-value will be 1.0 (i.e. all of the significant genes are
> estimated to be false positives).
>
That's very clear now, thanks again.
> Your experiment design is needlessly complex and also wasteful. If
> you have only 2 conditions, you should do one of the following:
>
> hybridize both conditions to every array (in dye-swap pairs) with no
> technical replicates (This is most efficient)
> use a reference design with the reference sample always in the same
> channel. (This is simplest, but has 1/2 the efficiency.)
>
> Mixing these 2 designs, especially with a mix of biological and
> technical replicates needlessly complicates your analysis. It also
> requires a mixed model ANOVA to take into account the different levels
> of replication.
>
Yes, I know, I know...
Unfortunately, in this case I could not decide how the experiments were
done. My situation is that these are the experiments available at the
moment, and I have to find DE genes from them. That is the only reason I
was wondering whether there is a correct method for working with a "mixed"
(exp vs ref plus dye-swap) design, that is, for extracting as much
information as possible.
Thank you
Silvano
Dr.Silvano Piazza
LNCIB,
Area Science Park,
Padriciano 99
Trieste, ITALY
Tel. +39040398992
Fax +39040398990
> Date: Tue, 29 Mar 2005 11:22:22 +0200
> From: Silvano Piazza <piazza@lncib.it>
> Subject: Re: [BioC] design in mixed ref and dye-swap experiment
> To: Naomi Altman <naomi@stat.psu.edu>
> Cc: bioconductor@stat.math.ethz.ch
> Message-ID: <107fbaba4c9ce4b7c1ecc6051aac92ff@lncib.it>
> Content-Type: text/plain; charset=US-ASCII; format=flowed
>
> Dear Naomi
>
> First of all, thank you very much for your answer.
>
> Yes, I know, I know...
> Unfortunately, in this case I could not decide how the experiments were
> done. My situation is that these are the experiments available at the
> moment, and I have to find DE genes from them. That is the only reason I
> was wondering whether there is a correct method for working with a
> "mixed" (exp vs ref plus dye-swap) design, that is, for extracting as
> much information as possible.
Your analysis is already correct, given the arrays that you have.
If you are expecting to see differential expression here but aren't,
you might revisit the pre-processing and QC steps for this data. Good
pre-processing can make a spectacular difference to differential
expression results.
Gordon
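To see how a linear model pools the reference arrays and the dye-swap pair, here is a rough base-R sketch. It is not limma's own computation. The design rows are written out by hand from the targets table as Cy5 minus Cy3 with "ref" as the baseline (an assumption about the log-ratio orientation), errors are assumed independent with equal variance, and dye effects and the biological/technical replicate distinction are ignored. The names `X` and `var_factor` are made up for this illustration.

```r
# Design rows built by hand from the targets table as Cy5 minus Cy3,
# with "ref" as the baseline (assumed orientation, not modelMatrix output)
X <- rbind(
  array1 = c(A =  1, B =  0),  # ref (Cy3) vs A (Cy5)
  array2 = c(A =  0, B =  1),  # ref (Cy3) vs B (Cy5)
  array3 = c(A =  0, B =  1),
  array4 = c(A =  0, B =  1),
  array5 = c(A = -1, B =  1),  # A (Cy3) vs B (Cy5)
  array6 = c(A =  1, B = -1)   # B (Cy3) vs A (Cy5)
)
contrast <- c(A = 1, B = -1)   # the A - B comparison

# Variance factor of the estimated contrast, c' (X'X)^{-1} c,
# in units of the per-array error variance sigma^2
var_factor <- function(X, contrast) {
  drop(t(contrast) %*% solve(crossprod(X)) %*% contrast)
}

v_all <- var_factor(X, contrast)         # all six arrays together
v_ref <- var_factor(X[1:4, ], contrast)  # reference arrays only

# The dye-swap pair alone measures A - B directly (once as -1, once as +1),
# so it is treated as a one-parameter model with two observations
v_swap <- 1 / sum(c(-1, 1)^2)

c(all = v_all, reference_only = v_ref, dye_swap_only = v_swap)
```

Under these assumptions the six arrays together give a smaller variance factor for A - B than either the reference arrays or the dye-swap pair alone, which is the sense in which fitting one model to the mixed design extracts more information than either subset.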
>>> > design <- modelMatrix(targets, ref = "ref")
>>> Found unique target names:
>>>  B A ref
>>> > design
>>>         A  B
>>> array1  0  1
>>> array2  1  0
>>> array3  1  0
>>> array4  1  0
>>> array5 -1  1
>>> array6  1 -1
>>> > fit <- lmFit(MA, design)
>>> > cont.matrix <- makeContrasts(A.B=A-B, levels=design, weight=MA$weights)
Why are you using 'weight='? That is not an argument for makeContrasts().
Gordon
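For reference, a minimal sketch of the same pipeline with the contrast defined without a weights argument. Everything about the data is an assumption made so that the example runs on its own: `M` is simulated noise standing in for the real log-ratios, `w` is a matrix of unit spot weights, and the design is typed in by hand from the targets table (Cy5 minus Cy3, "ref" as baseline) instead of being taken from modelMatrix(). Spot weights, if wanted, are an argument of lmFit(); for an MAList, lmFit() should also pick up the `weights` component on its own.

```r
# Simulated stand-in data: 200 genes x 6 arrays of log-ratios (pure noise)
set.seed(1)
M <- matrix(rnorm(200 * 6), nrow = 200,
            dimnames = list(paste0("gene", 1:200), paste0("array", 1:6)))
w <- matrix(1, nrow = 200, ncol = 6)  # hypothetical spot weights, all equal to 1

# Design typed in by hand from the targets table (Cy5 minus Cy3, ref as baseline)
design <- cbind(A = c(1, 0, 0, 0, -1,  1),
                B = c(0, 1, 1, 1,  1, -1))
rownames(design) <- colnames(M)

if (requireNamespace("limma", quietly = TRUE)) {
  library(limma)
  # Spot weights belong in lmFit(), not in makeContrasts()
  fit <- lmFit(M, design, weights = w)
  # makeContrasts() only needs the contrast expression and the levels
  cont.matrix <- makeContrasts(A.B = A - B, levels = design)
  fit2 <- contrasts.fit(fit, cont.matrix)
  fit2 <- eBayes(fit2)
  print(topTable(fit2, adjust.method = "BH"))
}
```

Because the simulated data contain no real differences between A and B, the table is only useful for checking that the calls run; it says nothing about the real arrays.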