Johan Lindberg
Hi Mark, thanks for the reply.
> 1. Does samr produce an unadjusted P-value? If so, are these
> unadjusted P-vals comparable in magnitude to those from limma? An xy
> plot using log="xy" might be a good way of looking at this. If they
> are comparable in magnitude, then it's definitely the FDR procedure
> which is producing the marked differences.
No, it doesn't. I guess there is no point in reporting an unadjusted
p-value, since the significance is tied directly to the fdr via the
permutations. I could calculate one, but they don't report how they
calculate the degrees of freedom for their test. I could just use the
standard settings for a t-test, but that would only reflect the
similarity that already exists between the t-scores.
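
For reference, a minimal sketch of how unadjusted p-values could be derived from permutations and compared with parametric ones on the log-log plot suggested above. Everything here is an assumption for illustration: the data are simulated, `t_score` is a hypothetical helper computing a plain two-sample t-score (not samr's moderated statistic), and the relabellings are simple random ones rather than samr's own permutation scheme.

```r
set.seed(1)
n_genes <- 200
n_per_group <- 5
group <- rep(c(0, 1), each = n_per_group)

# Simulated log-expression matrix (an assumption, standing in for real data);
# the first 20 genes are shifted in group 1.
x <- matrix(rnorm(n_genes * 2 * n_per_group), nrow = n_genes)
x[1:20, group == 1] <- x[1:20, group == 1] + 2

# Two-sample t-score per gene. With equal group sizes this equals the
# ordinary pooled-variance t-statistic.
t_score <- function(mat, g) {
  a <- mat[, g == 0, drop = FALSE]
  b <- mat[, g == 1, drop = FALSE]
  (rowMeans(b) - rowMeans(a)) /
    sqrt(apply(a, 1, var) / ncol(a) + apply(b, 1, var) / ncol(b))
}

t_obs <- t_score(x, group)

# Null scores from 100 random relabellings of the samples
# (rows are genes, columns are permutations).
t_null <- replicate(100, t_score(x, sample(group)))

# Permutation p-value: share of all null scores at least as extreme as the
# observed one; the +1 terms keep the estimate above zero.
p_perm <- sapply(abs(t_obs), function(t) {
  (sum(abs(t_null) >= t) + 1) / (length(t_null) + 1)
})

# Parametric p-value with the standard t-test degrees of freedom.
p_param <- 2 * pt(-abs(t_obs), df = 2 * n_per_group - 2)

# Log-log scatter of the two sets of p-values.
plot(p_param, p_perm, log = "xy")
```

Because both p-values are monotone in the absolute t-score, such a plot mostly shows how the two null distributions differ in the tails, not whether the gene rankings differ.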
> 2. Can you reproduce the permutation-based FDR method using the limma
> data, perhaps using the fit$coefficients object to obtain the data
> from the 2 groups? In this case, do your permuted limma FDRs agree
> more closely with the samr FDRs?
Yes, I guess I could do that, but this would give the same results as
samr, which is basically my question: which is the right way to do it?
Is the Benjamini fdr more conservative than permutation-based methods
in general, or does it depend on the particular dataset? I guess the
answer is that there is no right or wrong, just stick with the method
that you prefer, which for me, in this case, is samr, since it gives
me some genes to work with and, at least when checking some of them,
they make sense. But it would be nice to hear a level 70 statistician
(WoW analogy) comment on this issue.
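
To make the comparison concrete, here is a rough sketch that puts a Benjamini-Hochberg adjustment next to a permutation-based fdr estimate on the same t-scores. It is only an illustration under stated assumptions: the data are simulated, `t_score` is a hypothetical helper, and the permutation estimate is a simplified median-of-false-calls ratio in the spirit of what I described for samr, not samr's actual implementation.

```r
set.seed(2)
n_genes <- 500
n <- 6
group <- rep(c(0, 1), each = n)

# Simulated data (assumption): 50 of 500 genes shifted in group 1.
x <- matrix(rnorm(n_genes * 2 * n), nrow = n_genes)
x[1:50, group == 1] <- x[1:50, group == 1] + 2

# Plain two-sample t-score per gene (hypothetical helper).
t_score <- function(mat, g) {
  a <- mat[, g == 0, drop = FALSE]
  b <- mat[, g == 1, drop = FALSE]
  (rowMeans(b) - rowMeans(a)) /
    sqrt(apply(a, 1, var) / ncol(a) + apply(b, 1, var) / ncol(b))
}

t_obs <- abs(t_score(x, group))
t_null <- abs(replicate(200, t_score(x, sample(group))))

# Benjamini-Hochberg on parametric p-values.
p <- 2 * pt(-t_obs, df = 2 * n - 2)
fdr_bh <- p.adjust(p, method = "BH")

# Permutation estimate: for each observed score used as a cutoff, the median
# number of null scores passing it per permutation, divided by the number of
# observed scores passing it.
fdr_perm <- sapply(t_obs, function(cutoff) {
  false_calls <- median(colSums(t_null >= cutoff))
  min(1, false_calls / sum(t_obs >= cutoff))
})

# Number of genes called at fdr < 0.05 by each approach.
c(BH = sum(fdr_bh < 0.05), permutation = sum(fdr_perm < 0.05))
```

One thing the sketch makes visible: the median number of false calls can be exactly zero at stringent cutoffs, so the permutation estimate can reach zero, whereas a BH-adjusted p-value is never smaller than the raw p-value.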
best regards
// Johan
On 23 jul 2008, at 01.19, Mark Cowley wrote:
> Hi Johan,
> I haven't used samr before, so I am intrigued by your findings.
> It seems to me, given the close agreement in t-stats and rankings,
> that the major difference is in the FDR estimation: BH vs permutation.
> 1. Does samr produce an unadjusted P-value? If so, are these
> unadjusted P-vals comparable in magnitude to those from limma? An xy
> plot using log="xy" might be a good way of looking at this. If they
> are comparable in magnitude, then it's definitely the FDR procedure
> which is producing the marked differences.
> 2. Can you reproduce the permutation-based FDR method using the limma
> data, perhaps using the fit$coefficients object to obtain the data
> from the 2 groups? In this case, do your permuted limma FDRs agree
> more closely with the samr FDRs?
>
> cheers,
> Mark
>
> -----------------------------------------------------
> Mark Cowley, BSc (Bioinformatics)(Hons)
>
> Peter Wills Bioinformatics Centre
> Garvan Institute of Medical Research, Sydney, Australia
> -----------------------------------------------------
>
> On 23/07/2008, at 1:55 AM, Johan Lindberg wrote:
>
>>
>> Dear all, I noticed that the attachments are stripped off in the
>> mailing list, so I posted them at
>> http://picasaweb.google.co.uk/hurrayarray/FDRDiscrepancy
>>
>> best regards,
>>
>> // Johan
>>
>>
>> On 22 jul 2008, at 15.39, Johan Lindberg wrote:
>>
>>> Dear all.
>>>
>>> I have used Limma to analyse my data and I got no differentially
>>> expressed genes after correcting for multiple testing using fdr. Then a
>>> colleague of mine analysed the data using samr, the package by the
>>> guys from Stanford, http://www-stat.stanford.edu/~tibs/SAM/ and got
>>> ~600 genes with an fdr < 0.05.
>>>
>>> I immediately thought I had done something wrong in my Limma analysis
>>> and double checked a zillion times, but I couldn't find any errors.
>>> When I compared the ranking of the 1000 most differentially
>>> expressed genes from Limma and samr, they looked very similar, with few
>>> discrepancies. I attached a picture of the ranking. Then I compared
>>> the t-scores for the same genes and they were also almost the same. I
>>> also attached an image of that. The scores for Limma are a little bit
>>> more significant.
>>>
>>> It's when I do the adjustment for multiple testing that I get
>>> differences (I attached another picture). As I understand it, the fdr
>>> level for a certain delta-cutoff is approximated in samr by balanced
>>> permutations of the two groups. From these one can find the median
>>> number of genes called differentially expressed between the two
>>> groups across e.g. 100 permutations, which, relative to the number
>>> called in the real data, gives the fdr for that level. In Limma the
>>> general fdr definition by Benjamini & Hochberg is used, if I
>>> understand it right.
>>>
>>> I was really surprised that I got such different results using the
>>> same kind of correction for multiple testing, where one is based on
>>> permutation of the same data and the other is based on the old
>>> Benjamini definition. Which is more correct in this situation? I would
>>> really appreciate it if someone could give me some advice.
>>>
>>> Best regards,
>>>
>>> Johan
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> sessionInfo()
>>> R version 2.7.0 (2008-04-22)
>>> i386-apple-darwin8.10.1
>>>
>>> locale:
>>> sv_SE.UTF-8/sv_SE.UTF-8/C/C/sv_SE.UTF-8/sv_SE.UTF-8
>>>
>>> attached base packages:
>>> [1] splines tools stats graphics grDevices utils datasets
>>> [8] methods base
>>>
>>> other attached packages:
>>> [1] samr_1.25 impute_1.0-5 AnnBuilder_1.18.0
>>> [4] XML_1.95-2 siggenes_1.14.0 multtest_1.20.0
>>> [7] survival_2.34-1 KTH.hsOligo.db_1.0.0 hsOligo_2.0.1
>>> [10] kth_1.2.1 geneplotter_1.18.0 lattice_0.17-6
>>> [13] aroma_0.94 R.io_0.37 R.graphics_0.42
>>> [16] R.colors_0.5.3 R.basic_0.49 aroma.light_1.8.1
>>> [19] R.utils_1.0.2 R.oo_1.4.3 R.methodsS3_1.0.1
>>> [22] limma_2.14.5 annotate_1.18.0 xtable_1.5-2
>>> [25] AnnotationDbi_1.2.2 RSQLite_0.6-9 DBI_0.2-4
>>> [28] Biobase_2.0.1
>>>
>>> loaded via a namespace (and not attached):
>>> [1] KernSmooth_2.22-22 RColorBrewer_1.0-2 grid_2.7.0
>>>
>>>
>>> *********************************************
>>> Johan Lindberg
>>> Royal Institute of Technology
>>> AlbaNova University Center
>>> Stockholm Center for Physics, Astronomy and Biotechnology
>>> School of Molecular Biotechnology
>>> Department of Gene Technology
>>> Visiting address:
>>> Roslagstullsbacken 21, Floor 3
>>> 106 91 Stockholm, Sweden
>>> Delivering address:
>>> Roslagsvägen 30 B
>>> 104 06 Stockholm, Sweden
>>> Phone (office) +46 8 553 783 44
>>> Fax + 46 8 553 784 81
>>> http://www.ktharray.se/
>>> http://www.arrayadvice.se/
>>> *********************************************
>>>
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor@stat.math.ethz.ch
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>