Significant p-values disappear in limma
@michael-watson-iah-c-378
Last seen 9.6 years ago
Hi

Sorry to labour the point, but following on from my last mail, I have four arrays in a replicated dye swap experiment. After carrying out the analysis in limma, I find that 360 out of 4600 genes have an unadjusted p-value <= 0.05. However, when I adjust these using adjust="fdr", all of these disappear, and I have p-values of 0.5 and upwards. My B statistics seem much lower than in other analyses I have done, even though the t-statistics are still quite large, as are (some of) the M and A values.

I was just wondering if anyone had seen this before and could shed some light on what this might say about my data. When the top gene from topTable() has log2 ratios of 4.11, 5.51, 3.53 and 4.3, yet has an adjusted p-value of 0.2790644 and a B value of only 1.080982225, I figure something must be badly wrong somewhere...

Thanks in advance

Mick
limma limma • 1.2k views
It seems that with only two experiments (with accompanying dye-swaps), it is certainly possible that you don't have enough power to detect a difference. Can you do more experiments?

Sean

On Jan 5, 2005, at 6:58 AM, michael watson ((IAH-C)) wrote:

> Hi
>
> Sorry to labour the point, but following on from my last mail, I have
> four arrays in a replicated dye swap experiment. After carrying out the
> analysis in limma, I find that 360 out of 4600 genes have an unadjusted
> p-value <= 0.05. However, when I adjust these using adjust="fdr", all
> of these disappear, and I have p-values of 0.5 and upwards. My B
> statistics seem much lower than in other analyses I have done, even
> though the t-statistics are still quite large, as are (some of) the M
> and A values.
>
> I was just wondering if anyone had seen this before and could shed some
> light on what this might say about my data. When the top gene from
> topTable() has log2 ratios of 4.11, 5.51, 3.53 and 4.3, yet has an
> adjusted p-value of 0.2790644 and a B value of only 1.080982225, I
> figure something must be badly wrong somewhere...
>
> Thanks in advance
>
> Mick
Well, 5% x 4600 = 230, so even a rough guess puts the FDR at 230/360, which is pretty high.

Q-values are the FDR level at which the gene becomes significant. If NsigA is the number of significant genes at level A, FDR is approximately A*4600/NsigA, so you need a relatively large value of NsigA to have a small q-value.

A log2 ratio of 5 is not that huge in the scheme of things, especially with only 4 arrays. Why not have a look at the actual expression values on the arrays?

--Naomi

At 07:48 AM 1/5/2005 -0500, Sean Davis wrote:
>It seems that with only two experiments (with accompanying dye-swaps), it
>is certainly possible that you don't have enough power to detect a
>difference. Can you do more experiments?
>
>Sean

Naomi S. Altman                        814-865-3791 (voice)
Associate Professor
Bioinformatics Consulting Center
Dept. of Statistics                    814-863-7114 (fax)
Penn State University                  814-865-1348 (Statistics)
University Park, PA 16802-2111
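Naomi's back-of-the-envelope estimate (FDR is roughly A*4600/NsigA) can be written out in a few lines. This is only a sketch in Python rather than the R used in the thread, and the function name `rough_fdr` is made up for illustration. It ignores the proportion of truly null genes, so it is a crude upper-end guess, not what limma or p.adjust actually computes.

```python
def rough_fdr(alpha, n_tests, n_significant):
    """Crude FDR estimate: expected false positives divided by observed hits."""
    if n_significant <= 0:
        raise ValueError("need at least one gene called significant")
    expected_false_positives = alpha * n_tests  # if every gene were null
    return min(1.0, expected_false_positives / n_significant)


# Numbers from the thread: 360 of 4600 genes have unadjusted p <= 0.05.
estimate = rough_fdr(0.05, 4600, 360)
print(round(estimate, 2))  # roughly 0.64, i.e. 230/360
```

With only 360 hits against about 230 expected by chance, most of the list could be false positives, which is consistent with the large adjusted p-values reported above.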
@michael-watson-iah-c-378
Last seen 9.6 years ago
Hi Sean

Unfortunately this one is out of my control (as usual), but I have had much smaller p-values with 4 arrays before, and even with 3 arrays. Also note that in one of my four-array experiments, EVERY single p-value was 0.9999963 after adjusting for the fdr - that's over 4600 spots, all with the same p-value.

Finally, note that the SWIRL dataset has only 4 arrays and limma produces many, many p-values <= 0.05.

So, although I admit 4 arrays is far from ideal in terms of power, something is nagging me that that's not it, and it certainly wouldn't explain why over 4600 spots all have the same adjusted p-value - would it?

Cheers
Mick

-----Original Message-----
From: Sean Davis [mailto:sdavis2@mail.nih.gov]
Sent: 05 January 2005 12:49
To: michael watson (IAH-C)
Cc: bioconductor@stat.math.ethz.ch
Subject: Re: [BioC] Significant p-values disappear in limma

It seems that with only two experiments (with accompanying dye-swaps), it is certainly possible that you don't have enough power to detect a difference. Can you do more experiments?

Sean
Have you run p.adjust on the p-values from limma? I think limma uses p.adjust directly, so you can check (for fun) to see that you get the same results.

Power to detect differentially-expressed genes is NOT just a function of the number of arrays, but also of the experiment. (In other words, just because Gordon shows finding differentially expressed genes in the swirl experiment with four arrays doesn't mean that all experiments with only four arrays have enough power to detect small differences.) You could increase the number of experiments or you could just accept that, while not statistically certain of your gene list, it probably represents a good ordering and then proceed with your confirmatory experiments based on the ordering. Did you consider using SAM (in siggenes), just to see if that gets you more (not likely, but...)?

On Jan 5, 2005, at 8:07 AM, michael watson ((IAH-C)) wrote:

> Hi Sean
>
> Unfortunately this one is out of my control (as usual), but I have much
> smaller p-values with 4 arrays before, and even with 3 arrays. Also
> note that in one of my four-array experiments, EVERY single p-value was
> 0.9999963 after adjusting for the fdr - that's over 4600 spots, all
> with the same p-value.
>
> Finally, note that the SWIRL dataset has only 4 arrays and limma
> produces many, many p-values <= 0.05.
>
> So, although I admit 4 arrays is far from ideal in terms of power,
> something is nagging me that that's not it, and it certainly wouldn't
> explain why over 4600 spots all have the same adjusted p-value - would
> it?

You are positing a bug in limma? Like I mentioned, try running the p-values from limma through p.adjust. Alternatively, try using the qvalue package, just to see what you get. But, yes, I have seen the majority (I don't think all) of my genes have the same large p-value.

Sean
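To see why many genes can end up sharing one large adjusted p-value, it may help to write out the Benjamini-Hochberg step-up adjustment, which is what R's p.adjust applies for method "fdr" (an alias for "BH"). The following is a minimal Python sketch, not limma's own code; `bh_adjust` is a made-up name and the p-values are synthetic, chosen only to mimic a 4600-gene experiment.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])  # smallest p first
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


n_genes = 4600

# Synthetic "no signal" case: p-values spread evenly over (0, 1].
null_like = [(i + 1) / n_genes for i in range(n_genes)]
adj_null = bh_adjust(null_like)
print(min(adj_null), max(adj_null))  # both essentially 1

# One made-up small p-value (6e-5) on top of an otherwise flat background.
one_hit = [6e-5] + [(i + 1) / (n_genes - 1) for i in range(n_genes - 1)]
adj_hit = bh_adjust(one_hit)
print(adj_hit[0])  # about 0.276
```

In the flat case every gene collapses to the same adjusted value near 1, so identical adjusted p-values across all spots need not indicate a bug. The second case shows that a single raw p-value as small as 6e-5 among 4600 tests still adjusts to roughly 0.28; the thread does not give the raw p-value of the top gene, so that figure is purely illustrative.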
michael watson (IAH-C) wrote:
> Hi Sean
>
> Unfortunately this one is out of my control (as usual), but I have much
> smaller p-values with 4 arrays before, and even with 3 arrays. Also
> note that in one of my four-array experiments, EVERY single p-value was
> 0.9999963 after adjusting for the fdr - that's over 4600 spots, all with
> the same p-value.

Mick,

I see this sort of thing all the time, and what it means is that you don't have any evidence for differential expression between your two groups.

One of the things I do to check the quality of a given set of data is a principal components analysis. A plot of the first two PCs is usually a very good indication of how well your downstream analysis is going to turn out.

For instance, I just looked at 17 Affy chips from three different sample types. Only one group clustered together on a PCA plot, and this cluster was within a larger cluster of the other two groups (in other words, the different groups did not cluster separately). I knew from this that I would not be able to show any differential expression, and when I did the statistics my smallest adjusted p-value was something like 0.5.

I bet if you did a PCA with your data you would see something very similar.

Best,

Jim

--
James W. MacDonald
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
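The PCA check Jim describes can be sketched without any plotting: compute the first two principal component scores for each array and see whether arrays from the same group land near each other. The Python below is a rough illustration only. The function name `pca_sample_scores`, the power-iteration approach, and the tiny data matrix are all assumptions made for the example; in practice one would more likely use an existing PCA routine on the real expression matrix.

```python
import math


def pca_sample_scores(data, n_components=2, n_iter=200):
    """PC scores per array; rows of `data` are genes, columns are arrays."""
    n = len(data[0])
    # Center each gene across arrays so the PCs describe between-array variation.
    centered = [[v - sum(row) / n for v in row] for row in data]
    # Gram matrix between arrays (n x n); its eigenvectors give the scores.
    gram = [[sum(r[i] * r[j] for r in centered) for j in range(n)]
            for i in range(n)]
    scores = []
    for _component in range(n_components):
        start_norm = math.sqrt(sum((i + 1) ** 2 for i in range(n)))
        vec = [(i + 1) / start_norm for i in range(n)]  # deterministic start
        for _step in range(n_iter):  # power iteration for the top eigenvector
            new = [sum(gram[i][j] * vec[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in new))
            if norm == 0.0:
                break
            vec = [x / norm for x in new]
        eigenvalue = sum(vec[i] * gram[i][j] * vec[j]
                         for i in range(n) for j in range(n))
        scores.append([math.sqrt(max(eigenvalue, 0.0)) * v for v in vec])
        # Deflate so the next pass finds the next component.
        gram = [[gram[i][j] - eigenvalue * vec[i] * vec[j] for j in range(n)]
                for i in range(n)]
    return scores


# Made-up data: 3 genes x 4 arrays; arrays 0-1 and arrays 2-3 form two groups.
demo = [
    [1.0, 1.0, 5.0, 5.0],
    [2.0, 2.0, 6.0, 6.0],
    [0.0, 1.0, 0.0, 1.0],
]
pc1, pc2 = pca_sample_scores(demo)
for array_index in range(4):
    print(array_index, round(pc1[array_index], 2), round(pc2[array_index], 2))
```

In this toy matrix the first component separates arrays 0 and 1 from arrays 2 and 3, which is the pattern one hopes to see when the groups really differ. The sign of each component is arbitrary, and with only four arrays the picture is just four points, so it is a sanity check rather than a test.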