Question: statistics for differential expression: adjusted p-values<0.05 BUT negative B-odds?
Dear BioConductor mailing list!

I am using R-2.7.2, the limma package and its interface limmaGUI. I have a rather small number of slides (6), with probes spotted in 4 replicates (776x4=3104 spots). I perform background correction (normexp, cutoff=10), followed by within-array (global loess) and between-array (scale) normalization. To obtain the statistics for differential expression I choose the "least squares" linear model fit and the calculation of the duplicate correlation; the adjust method for the p-values is "BH". I get the following toptable:

Name             P.Value      logFC        AveExpr      t            P.Value      adj.P.Val    B
hsa-miR-503      2.34E-06     1.945387328  8.136697759  5.884386792  2.34E-06     0.001343352  4.836590302
hsa-miR-921      3.46E-06     1.174433413  8.035845865  5.740377619  3.46E-06     0.001343352  4.471051142
hsa-miR-30c-2*   7.77E-06     3.117701794  8.27088038   5.445618379  7.77E-06     0.001550508  3.718681784
hsa-miR-198      7.99E-06     1.967637234  6.072614871  5.435302824  7.99E-06     0.001550508  3.692268736
miRPlus_42526    1.98E-05     3.697849577  9.172222106  5.105807836  1.98E-05     0.00307328   2.846724551
hsa-miR-665      2.55E-05     2.066123655  8.328358693  5.014730934  2.55E-05     0.003292112  2.612641297
hsa-miR-371-5p   6.28E-05     3.733502603  9.49054759   4.686906149  6.28E-05     0.006817575  1.770685081
hsa-miR-187*     7.03E-05     2.265561547  5.963832035  4.646081978  7.03E-05     0.006817575  1.666045526
hsa-miR-183*     9.64E-05     1.416875368  6.088668377  4.531017428  9.64E-05     0.008313829  1.371557327
hsa-miR-483-5p   0.000111163  1.789850517  6.054104314  4.479175715  0.000111163  0.008626216  1.239133367
hsa-miR-30b*     0.000123606  2.526268106  8.483475515  4.440467862  0.000123606  0.008719852  1.140379763
hsa-miR-620      0.00028076   1.84948372   8.448990163  4.139779212  0.00028076   0.018155808  0.377815095
miRPlus_17952    0.000337227  4.962342262  8.633804566  4.07218339   0.000337227  0.020129836  0.20777931
hsa-miR-675      0.000578682  1.991150452  5.45522505   3.871788436  0.000578682  0.028873358  -0.292416594
miRPlus_17869    0.000587344  1.280048919  8.085959229  3.866246126  0.000587344  0.028873358  -0.306158062
miRPlus_42793    0.000612763  1.315186076  6.393831483  3.850431131  0.000612763  0.028873358  -0.345339636
hsa-miR-193a-5p  0.000632535  1.418128062  7.751168126  3.838568191  0.000632535  0.028873358  -0.374700751
miRPlus_42487    0.00079838   5.437939056  10.51144667  3.751332648  0.00079838   0.033174042  -0.589807882
hsa-miR-637      0.000812251  1.586253029  5.38574605   3.744861076  0.000812251  0.033174042  -0.605707122

According to the book, p-values and B-statistics should rank genes in the same order. A possible threshold for adjusted p-values is <0.05; a B-value of 0 corresponds to a 50:50 chance that the gene is really differentially expressed, and a negative B-value indicates that differential expression is very unlikely.

What worries me is that in my statistics I have low adjusted p-values (0.03) together with negative B-values. How do I handle this discrepancy? Is this a hint that something is wrong with my normalization?

I performed validation by real-time PCR some time ago. At that time I considered only the p-value (not the adjusted p-value or B), using a threshold of p<0.001. Now I have checked these old results once again, to understand whether a positive B and an adjusted p-value <0.05 indicated a high probability of modulated expression in my case. This was true only for some miRs (adjusted p-value <0.05 and positive B), where modulation was confirmed by real-time PCR. In contrast, other miRs with adjusted p >0.05 and negative B did show modulation in real-time PCR. And yet another miR had a very good adjusted p-value and B (adj.p=0.014, B=1.71) but showed no modulation in real-time PCR.

Are the adjusted p-value and the B-statistic too stringent, or do I have to reconsider the normalization and the linear model fit? Do I expect too much from the statistics?

Thanks!
Christine

Dr. Christine Völlenkle, Ph.D.
Research Laboratories-Molecular Cardiology
I.R.C.C.S. Policlinico San Donato
Via R. Morandi, 30
20097 S. Donato (MI) Italy
Phone: +39 02 52774 683 (lab) +39 02 52774 533 (office)
Fax: +39 02 52774 666
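Because limma reports B as the log-odds of differential expression, a B-value can be converted back to a probability to see what a slightly negative value implies. The following is a minimal base-R sketch, not part of the original analysis; it uses three B-values copied from the table above, and the variable names are illustrative only.

```r
# B values copied from three rows of the toptable above
B <- c("hsa-miR-503" = 4.836590302,
       "hsa-miR-675" = -0.292416594,
       "hsa-miR-637" = -0.605707122)

# B is a log-odds, so the implied probability is exp(B) / (1 + exp(B)).
# plogis() from base R (stats package) computes exactly this.
prob_de <- plogis(B)

print(round(prob_de, 3))
```

Under the model's own assumptions, the negative B-values at the bottom of the table correspond to probabilities of roughly 0.35 to 0.43, that is, somewhat below 50:50 rather than extremely unlikely.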
> What makes me worry is that in my statistics I have low adj.p-value 0.03
> together with negative B-values.
> How do I have to handle this discrepancy? Is this a hint that something is
> wrong with my normalization?

In the limma user guide they also say:

"The B-statistic is automatically adjusted for multiple testing by assuming that 1% of the genes, or some other percentage specified by the user in the call to eBayes(), are expected to be differentially expressed."

and

"The B-statistic probabilities depend on the same assumptions but require in addition a prior guess for the proportion of differentially expressed genes."

So I think your problem is that the proportion of differentially expressed genes in your experiment is higher than the proportion eBayes assumes (proportion=0.01).

Try to specify it in eBayes like this:

    fit <- eBayes(fit, proportion = <proportion of d.e. genes>)

and see if you get an improvement, and look at ?eBayes for more information. I have never used limmaGUI though, so I don't know how to do it with this interface.

I hope this helps.

Best,
paolo

--
Paolo Innocenti
Department of Animal Ecology, EBC
Uppsala University
Norbyvägen 18D
75236 Uppsala, Sweden
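To get a feel for how much the prior proportion matters, the effect can be approximated in base R. The sketch below assumes that the prior enters B only as the additive term log(proportion / (1 - proportion)) and that everything else stays fixed. In practice eBayes() also re-estimates other hyperparameters when the proportion changes, so the real shift will differ somewhat. The 10% figure and the helper name `prior_log_odds` are hypothetical, chosen only for illustration.

```r
# Prior log-odds of differential expression for a given assumed proportion
# (hypothetical helper; equivalent to qlogis(proportion))
prior_log_odds <- function(proportion) log(proportion / (1 - proportion))

# Approximate shift in B when moving from the default 1% prior
# to an assumed (illustrative) 10% prior
shift <- prior_log_odds(0.10) - prior_log_odds(0.01)

# Lowest B in the table above (hsa-miR-637), obtained with proportion = 0.01
B_old <- -0.605707122

# Approximation only: assumes all other terms of B are unchanged
B_new_approx <- B_old + shift

print(round(c(shift = shift, B_new_approx = B_new_approx), 3))
```

With these assumptions the shift equals log(11), about 2.4, which would be enough to turn every negative B in the table positive. The actual values should be taken from a rerun of eBayes() with the chosen proportion.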
