siggenes fc threshold
John Lande ▴ 280
@john-lande-2357
Last seen 9.7 years ago
An embedded and charset-unspecified text was scrubbed... Name: not available Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20071211/b93afae0/attachment.pl
@holger-schwender-344
Hi John,

I am not sure, but this might be due to the fact that in siggenes the fold change is used to filter out genes prior to the actual SAM analysis. Thus, only the permuted values of the test statistics for the remaining genes, i.e. genes with a fold change larger than R.fold (or smaller than 1/R.fold), are used to estimate the null distribution and to compute d.bar, i.e. the values of the test statistic expected under the null, instead of using the permuted values of all genes. This might lead to these strange results.

Best,
Holger

-------- Original Message --------
> Date: Tue, 11 Dec 2007 19:20:25 +0100
> From: "John Lande" <john.lande77 at gmail.com>
> To: bioconductor at stat.math.ethz.ch
> Subject: [BioC] siggenes fc threshold
>
> Dear Bioconductors,
>
> I want to use siggenes and SAM to find differentially regulated genes, but
> I have problems with the siggenes function, and possibly did not understand
> something properly.
> Here is an example that reproduces the problem:
>
> library(siggenes)
> data(golub)
> sam.out1 <- sam(golub, golub.cl, rand = 123, gene.names = golub.gnames[,3],
>     med=TRUE, lambda=.5, method=d.stat, B=5, R.fold=1, delta=seq(0.01, 3, 0.5))
> sam.out2 <- sam(golub, golub.cl, rand = 123, gene.names = golub.gnames[,3],
>     med=TRUE, lambda=.5, method=d.stat, B=5, R.fold=2, delta=seq(0.01, 3, 0.5))
>
> I use the parameter R.fold to set the minimum FC I want for my list of
> significant genes.
> The problem is this: when I print the results
>
> > sam.out1
> SAM Analysis for the Two-Class Unpaired Case Assuming Unequal Variances
>
>   Delta    p0 False Called     FDR
> 1  0.01 0.519  2950   3007 0.50933
> 2  0.51 0.519   478   1638 0.15151
> 3  1.01 0.519    38    839 0.02351
> 4  1.51 0.519     1    380 0.00137
> 5  2.01 0.519     0    159       0
> 6  2.51 0.519     0     74       0
>
> > sam.out2
> SAM Analysis for the Two-Class Unpaired Case Assuming Unequal Variances
>
>   Delta p0 False Called FDR
> 1  0.01  0    17    166   0
> 2  0.51  0    17    166   0
> 3  1.01  0    12    164   0
> 4  1.51  0     3    163   0
> 5  2.01  0     1    161   0
> 6  2.51  0     0    155   0
>
> You can see that, at a delta of 2.51, the SAM run with the higher FC
> (R.fold=2) calls more genes significant than the run with R.fold=1.
> To me this does not make much sense.
> By the way, I also tried SAM in Excel and I do not have the same
> problem there. Furthermore, the dynamic range of delta is much lower.
> Do you have any idea?
>
> What am I doing wrong?
>
> Best regards
An embedded and charset-unspecified text was scrubbed... Name: not available Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20071212/90064400/attachment.pl
Hi John,

John Lande wrote:
> I see your points, but the strange thing is that actually
>
> > sam.out1
>   Delta    p0 False Called FDR
> 5  2.01 0.519     0    159   0
> > sam.out2
>   Delta p0 False Called FDR
> 5  2.01  0     1    161   0
>
> Do you see? Same delta, but a higher number of significant
> genes for the higher FC. This makes no sense to me! We are not speaking
> about FDR, but about the crude results from the test!

Yes, but you have to think about what Holger has told you, and about the way the statistics are computed. The denominator of the t-statistic you are computing is the standard error of the numerator (s_i) plus a small constant (s_0). This small constant s_0 is computed using all the s_i values (to find out more about this see either the original Tusher paper or Holger's thesis).

As Holger noted, when you add a fold change criterion, the genes are filtered _before_ you do any of these computations. Thus, you will have fewer genes when you use the larger fold change criterion. Since the s_0 value is computed using the s_i values from the available genes, the denominator of your statistic is probably different in the two cases (because the s_0 value is likely to be different). So it is not surprising that the number of genes found significant will change as well.

Best,
Jim

--
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
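Jim's point can be illustrated with a small simulation. The following is a minimal sketch in base R on made-up data, not the siggenes implementation: the statistic is d_i = r_i / (s_i + s_0), and the helper `fudge()` is a hypothetical stand-in that simply takes the median of the s_i values, whereas SAM selects s_0 by a more involved criterion. The data are assumed to be on a log2 scale, so a 2-fold change corresponds to |r_i| >= 1.

```r
# Toy illustration on simulated data (base R only; NOT the siggenes code).
set.seed(123)

n_genes <- 2000
n_per_class <- 10
cl <- rep(c(0, 1), each = n_per_class)

# Gene-specific noise levels, so that s_i varies across genes.
# The sd vector is recycled down the rows of the column-filled matrix.
gene_sd <- runif(n_genes, min = 0.2, max = 1.5)
x <- matrix(rnorm(n_genes * length(cl), sd = gene_sd), nrow = n_genes)

# The first 200 genes are truly shifted by 2 log2 units in class 1.
x[1:200, cl == 1] <- x[1:200, cl == 1] + 2

# Numerator r_i (difference in class means) and its standard error s_i
# (unequal-variance form).
x0 <- x[, cl == 0]
x1 <- x[, cl == 1]
r <- rowMeans(x1) - rowMeans(x0)
s <- sqrt(apply(x0, 1, var) / n_per_class + apply(x1, 1, var) / n_per_class)

# ASSUMPTION: s_0 is taken to be the median of the s_i values.
# This is a simplification chosen for the sketch.
fudge <- function(s_values) median(s_values)

# Case 1: no fold-change filter, so s_0 is based on all genes.
s0_all <- fudge(s)
d_all <- r / (s + s0_all)

# Case 2: keep only genes with at least a 2-fold change BEFORE computing
# s_0, mirroring the order of operations described in the thread.
keep <- abs(r) >= 1
s0_kept <- fudge(s[keep])
d_kept <- r[keep] / (s[keep] + s0_kept)

cat("genes kept:", sum(keep), "of", n_genes, "\n")
cat("s_0 from all genes: ", round(s0_all, 3), "\n")
cat("s_0 from kept genes:", round(s0_kept, 3), "\n")

# Same genes, same data, but different d values because s_0 changed.
print(summary(d_kept - d_all[keep]))
```

In this toy setup the filtered gene set has a different distribution of s_i values than the full set, so s_0 and therefore every d_i changes even though the underlying data do not. That is enough to make the number of genes called at a fixed delta non-comparable between the two runs.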