Entering edit mode
Hi everybody,
I have a question about the behavior of siggenes in R (2.12) with
different
parameters.
Maybe I need to understand better how siggenes calculated the
differentially
regulated genes, but I would like to ask you for your help in
understanding
it.
Maybe you can also direct me to where I can find the answers for this
problem(s)
I am running a sam analysis of wild type vs. mutant in drosophila
miRNA
microarrays.
I first read the files into an affybatch and than normalized them
(RMA).
After normalizing the arrays I extracted all non-drosophila miRNA from
my
expressionSet and ran the sam analysis only with the 186 dme-miRNA
probes.
cl <- c(0,0,0,1,1,1) # "wt1", "wt2", "wt3", "mt1", "mt2", "mt3"
sam.out <- sam(dme_rma,cl, var.equal=FALSE, B=100, include.zero=FALSE,
gene.names=Probe_names, na.replace=FALSE, rand=123)
I run the sam analysis with different parameters and got *very*
different
results.
the first run was with the default parameters. Here I got these
results:
SAM Analysis for the Two-Class Unpaired Case Assuming Unequal
Variances
s0 = 0.3176 (The 55 % quantile of the s values.)
Delta p0 False Called FDR cutlow cutup j2 j1
1 0.10 0.882 74.85 92 0.717 -0.130 2.135 91 186
2 0.11 0.882 74.85 92 0.717 -0.130 2.135 91 186
3 0.12 0.882 28.25 39 0.639 -0.575 2.135 38 186
4 0.13 0.882 24.35 38 0.565 -0.633 2.135 37 186
...
10 0.19 0.882 21.65 37 0.516 -0.672 2.135 36 186
11 0.20 0.882 12.4 24 0.456 -0.860 2.135 23 186
12 0.21 0.882 12.4 24 0.456 -0.860 2.135 23 186
13 0.22 0.882 12.4 24 0.456 -0.860 2.135 23 186
14 0.23 0.882 10.9 22 0.437 -0.905 2.135 21 186
15 0.24 0.882 0.4 1 0.353 -Inf 2.135 0 186
...
20 0.29 0.882 0.4 1 0.353 -Inf 2.135 0 186
21 0.30 0.882 0 0 0 -Inf Inf 0 187
22 0.31 0.882 0 0 0 -Inf Inf 0 187
I get here a very high FDR value for very few miRNAs.
For my second run I added the parameter R.fold = 1.2, to exclude all
miRNAs
under this value from the analysis.
Here I get much better results with lower FDR values:
Number of variables having a fold change >= 1.2 or <= 0.8333 : 90
s0 = 0.0477 (The 0 % quantile of the s values.)
Delta p0 False Called FDR cutlow cutup j2 j1
1 0.1 0.378 55.5 82 0.2557 -0.708 0.425 45 54
2 0.2 0.378 52.25 80 0.2467 -0.708 0.562 45 56
3 0.3 0.378 48.4 79 0.2314 -0.708 0.676 45 57
4 0.5 0.378 24.9 48 0.1960 -0.708 2.504 45 88
5 0.6 0.378 23.7 45 0.1990 -0.708 Inf 45 91
6 0.7 0.378 10.65 31 0.1298 -1.170 Inf 31 91
7 0.8 0.378 6.75 25 0.1020 -1.513 Inf 25 91
8 1.0 0.378 0.85 4 0.0803 -2.712 Inf 4 91
9 1.1 0.378 0.25 1 0.0944 -4.810 Inf 1 91
10 1.2 0.378 0 0 0 -Inf Inf 0 91
I than ran a third sam with the parameter use.dm=FALSE. This gave me
again
different results with much lower FDR values:
Number of variables having a fold change >= 1.2 or <= 0.8333 : 97
s0 = 0.1488 (The 5 % quantile of the s values.)
Delta p0 False Called FDR cutlow cutup j2 j1
1 0.1 0.009 53.5 81 0.00612 -0.541 0.545 48 65
2 0.2 0.009 50.65 80 0.00587 -0.541 0.616 48 66
3 0.3 0.009 26.85 48 0.00519 -0.541 Inf 48 98
4 0.4 0.009 26.85 48 0.00519 -0.541 Inf 48 98
5 0.5 0.009 26.85 48 0.00519 -0.541 Inf 48 98
6 0.6 0.009 0 0 0 -Inf Inf 0 98
7 0.7 0.009 0 0 0 -Inf Inf 0 98
8 0.8 0.009 0 0 0 -Inf Inf 0 98
9 0.9 0.009 0 0 0 -Inf Inf 0 98
10 1.0 0.009 0 0 0 -Inf Inf 0 98
As you can see my obvious problem. Which is the better (right???) way
of
running this analysis.
I think, to understand that, one needs to know how SAM works. I don't
understand why I get such different results in the FDR between the
three
possibilities.
As far as I understand it, the lower the FDR value, the better. But
how can
it be, that I have in the third run over 80 miRNAs with such a low
FDR,
while I have only few DE miRNAs in the first run with a much higher
FDR
value?
I would appreciate any help,
thanks
Assa
[[alternative HTML version deleted]]