Significant p-values disappear in limma

michael watson IAH-C ★ 3.4k

@michael-watson-iah-c-378

Last seen 11.5 years ago

Hi

Sorry to labour the point, but following on from my last mail, I have four arrays in a replicated dye swap experiment. After carrying out the analysis in limma, I find that 360 out of 4600 genes have an unadjusted p-value <= 0.05. However, when I adjust these using adjust="fdr", all of these disappear, and I have p-values of 0.5 and upwards. My B statistics seem much lower than in other analyses I have done, even though the t-statistics are still quite large, as are (some of) the M and A values.

I was just wondering if anyone had seen this before and could shed some light on what this might say about my data. When the top gene from topTable() has log2 ratios of 4.11, 5.51, 3.53 and 4.3, yet has an adjusted p-value of 0.2790644 and a B value of only 1.080982225, I figure something must be badly wrong somewhere...

Thanks in advance

Mick


ADD COMMENT • link 21.1 years ago michael watson IAH-C ★ 3.4k

It seems that with only two experiments (with accompanying dye-swaps), it is certainly possible that you don't have enough power to detect a difference. Can you do more experiments?

Sean

On Jan 5, 2005, at 6:58 AM, michael watson ((IAH-C)) wrote:

> Hi
>
> Sorry to labour the point, but following on from my last mail, I have four arrays in a replicated dye swap experiment. After carrying out the analysis in limma, I find that 360 out of 4600 genes have an unadjusted p-value <= 0.05. However, when I adjust these using adjust="fdr", all of these disappear, and I have p-values of 0.5 and upwards. My B statistics seem much lower than in other analyses I have done, even though the t-statistics are still quite large, as are (some of) the M and A values.
>
> I was just wondering if anyone had seen this before and could shed some light on what this might say about my data. When the top gene from topTable() has log2 ratios of 4.11, 5.51, 3.53 and 4.3, yet has an adjusted p-value of 0.2790644 and a B value of only 1.080982225, I figure something must be badly wrong somewhere...
>
> Thanks in advance
>
> Mick

ADD REPLY • link 21.1 years ago Sean Davis 21k

Well, 5% x 4600 = 230, so even a rough guess puts the FDR at 230/360, which is pretty high.

Q-values are the FDR level at which the gene becomes significant. If NsigA is the number of significant genes at level A, the FDR is approximately A*4600/NsigA, so you need a relatively large value of NsigA to have a small q-value.

A log2 ratio of 5 is not that huge in the scheme of things, especially with only 4 arrays. Why not have a look at the actual expression values on the arrays?

--Naomi

At 07:48 AM 1/5/2005 -0500, Sean Davis wrote:
>It seems that with only two experiments (with accompanying dye-swaps), it is certainly possible that you don't have enough power to detect a difference. Can you do more experiments?
>
>Sean

Naomi S. Altman
Associate Professor, Dept. of Statistics
Bioinformatics Consulting Center
Penn State University
University Park, PA 16802-2111
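The back-of-envelope estimate in the reply above can be written out as a short calculation. This is only a Python sketch of that arithmetic; the function name `rough_fdr` is made up for illustration and is not part of limma or R.

```python
def rough_fdr(alpha, n_tests, n_significant):
    """Rough FDR at threshold alpha: expected false positives / observed hits."""
    if n_significant <= 0:
        raise ValueError("no genes pass the threshold")
    expected_false_positives = alpha * n_tests
    # An FDR estimate above 1 carries no extra information, so cap it.
    return min(1.0, expected_false_positives / n_significant)


# Numbers from the thread: 4600 genes, 360 with unadjusted p <= 0.05.
expected = 0.05 * 4600
fdr = rough_fdr(0.05, 4600, 360)
print(f"expected false positives at 0.05: {expected:.0f}")
print(f"rough FDR among the 360 genes:    {fdr:.2f}")
```

With about 230 false positives expected by chance alone, 360 hits are not many more than the null would produce, which is why the adjusted values are so large.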

ADD REPLY • link 21.1 years ago Naomi Altman ★ 6.0k


Hi Sean

Unfortunately this one is out of my control (as usual), but I have had much smaller p-values with 4 arrays before, and even with 3 arrays. Also note that in one of my four-array experiments, EVERY single p-value was 0.9999963 after adjusting for the FDR - that's over 4600 spots, all with the same p-value.

Finally, note that the SWIRL dataset has only 4 arrays and limma produces many, many p-values <= 0.05.

So, although I admit 4 arrays is far from ideal in terms of power, something is nagging me that that's not it, and it certainly wouldn't explain why over 4600 spots all have the same adjusted p-value - would it?

Cheers
Mick

-----Original Message-----
From: Sean Davis [mailto:sdavis2@mail.nih.gov]
Sent: 05 January 2005 12:49
To: michael watson (IAH-C)
Cc: bioconductor@stat.math.ethz.ch
Subject: Re: [BioC] Significant p-values disappear in limma

It seems that with only two experiments (with accompanying dye-swaps), it is certainly possible that you don't have enough power to detect a difference. Can you do more experiments?

Sean

ADD COMMENT • link 21.1 years ago michael watson IAH-C ★ 3.4k

Have you run p.adjust on the p-values from limma? I think limma uses p.adjust directly, so you can check (for fun) to see that you get the same results.

Power to detect differentially expressed genes is NOT just a function of the number of arrays, but also of the experiment. (In other words, just because Gordon shows finding differentially expressed genes in the swirl experiment with four arrays doesn't mean that all experiments with only four arrays have enough power to detect small differences.) You could increase the number of experiments or you could just accept that, while not statistically certain of your gene list, it probably represents a good ordering and then proceed with your confirmatory experiments based on the ordering. Did you consider using SAM (in siggenes), just to see if that gets you more (not likely, but...)?

On Jan 5, 2005, at 8:07 AM, michael watson ((IAH-C)) wrote:

> Hi Sean
>
> Unfortunately this one is out of my control (as usual), but I have much smaller p-values with 4 arrays before, and even with 3 arrays. Also note that in one of my four-array experiments, EVERY single p-value was 0.9999963 after adjusting for the fdr - that's over 4600 spots, all with the same p-value.
>
> Finally, note that the SWIRL dataset has only 4 arrays and limma produces many, many p-values <= 0.05.
>
> So, although I admit 4 arrays is far from ideal in terms of power, something is nagging me that that's not it, and it certainly wouldn't explain why over 4600 spots all have the same adjusted p-value - would it?

You are positing a bug in limma? Like I mentioned, try running the p-values from limma through p.adjust. Alternatively, try using the qvalue package, just to see what you get. But, yes, I have seen the majority (I don't think all) of my genes have the same large p-value.

Sean
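For anyone wondering how thousands of spots can share one adjusted p-value without any bug, here is a minimal Python sketch of the Benjamini-Hochberg step-up adjustment. It assumes adjust="fdr" means Benjamini-Hochberg, as it does in R's p.adjust; the function `bh_adjust` and the p-values below are synthetic illustrations, not limma's code or data from the thread.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])  # indices, smallest p first
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down; the running minimum keeps the
    # adjusted values monotone and is what produces long runs of ties.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Synthetic "no signal" case: p-values spread evenly over (0, 1).
n_genes = 4600
flat_p = [(i + 1) / (n_genes + 1) for i in range(n_genes)]
flat_adjusted = bh_adjust(flat_p)
print("flat case, min and max adjusted p:", min(flat_adjusted), max(flat_adjusted))

# Same list, but the 50 smallest p-values replaced by very small ones.
mixed_p = [1e-6] * 50 + flat_p[50:]
mixed_adjusted = bh_adjust(mixed_p)
print("genes with adjusted p <= 0.05:", sum(a <= 0.05 for a in mixed_adjusted))
```

In the flat case every p*n/rank is the same number, so every gene receives the same adjusted value close to 1. Only genes whose raw p-values are far smaller than their rank would suggest under the null escape that tie.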

ADD REPLY • link 21.1 years ago Sean Davis 21k

michael watson (IAH-C) wrote:
> Hi Sean
>
> Unfortunately this one is out of my control (as usual), but I have much smaller p-values with 4 arrays before, and even with 3 arrays. Also note that in one of my four-array experiments, EVERY single p-value was 0.9999963 after adjusting for the fdr - that's over 4600 spots, all with the same p-value.

Mick,

I see this sort of thing all the time, and what it means is that you don't have any evidence for differential expression between your two groups.

One of the things I do to check the quality of a given set of data is a principal components analysis. A plot of the first two PCs is usually a very good indication of how well your downstream analysis is going to turn out. For instance, I just looked at 17 Affy chips from three different sample types. Only one group clustered together on a PCA plot, and this cluster was within a larger cluster of the other two groups (in other words, the different groups did not cluster separately). I knew from this that I would not be able to show any differential expression, and when I did the statistics my smallest adjusted p-value was something like 0.5.

I bet if you did a PCA with your data you will see something very similar.

Best,

Jim

--
James W. MacDonald
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
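The PCA check described above can be sketched in a few lines. The version below is a self-contained Python illustration on synthetic data (six hypothetical arrays in two groups, 200 genes, 20 of them shifted in group B); it uses plain power iteration only to avoid dependencies, and every name in it is an assumption for the example. In practice an established routine such as R's prcomp would be the natural choice.

```python
import math
import random


def center_rows(matrix):
    """Subtract each gene's mean across arrays (rows are genes)."""
    centered = []
    for row in matrix:
        mean = sum(row) / len(row)
        centered.append([x - mean for x in row])
    return centered


def gram_matrix(centered):
    """Array-by-array inner products of the centered gene profiles."""
    n = len(centered[0])
    return [[sum(row[i] * row[j] for row in centered) for j in range(n)]
            for i in range(n)]


def top_eigenpair(sym, iterations=500):
    """Leading eigenvalue and eigenvector of a symmetric matrix (power iteration)."""
    n = len(sym)
    vec = [1.0 + 0.1 * i for i in range(n)]  # fixed, non-constant start vector
    for _ in range(iterations):
        nxt = [sum(sym[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            break
        vec = [x / norm for x in nxt]
    value = sum(vec[i] * sum(sym[i][j] * vec[j] for j in range(n)) for i in range(n))
    return value, vec


def pca_scores(matrix, n_components=2):
    """Coordinates of each array on the leading principal components."""
    gram = gram_matrix(center_rows(matrix))
    n = len(gram)
    components = []
    for _ in range(n_components):
        value, vec = top_eigenpair(gram)
        components.append([math.sqrt(max(value, 0.0)) * v for v in vec])
        # Deflate so the next pass finds the following component.
        gram = [[gram[i][j] - value * vec[i] * vec[j] for j in range(n)]
                for i in range(n)]
    return list(zip(*components))  # one (PC1, PC2) tuple per array


# Synthetic data only: 3 arrays per group, the first 20 genes shifted in group B.
rng = random.Random(0)
data = []
for gene in range(200):
    shift = 3.0 if gene < 20 else 0.0
    group_a = [rng.gauss(0.0, 0.5) for _ in range(3)]
    group_b = [rng.gauss(shift, 0.5) for _ in range(3)]
    data.append(group_a + group_b)

scores = pca_scores(data)
for label, (pc1, pc2) in zip(["A1", "A2", "A3", "B1", "B2", "B3"], scores):
    print(f"{label}: PC1={pc1:7.2f}  PC2={pc2:7.2f}")
```

When the groups really differ, the arrays from each group land on opposite sides of PC1. If the printed coordinates show the groups intermingled instead, that matches the situation described above, where no gene survives adjustment.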

ADD REPLY • link 21.1 years ago James W. MacDonald 68k
