Question: multtest problem...
0
gravatar for Dipl.-Ing. Johannes Rainer
14.4 years ago by
hi, i used the mt.teststat function from the package multtest and wonder why the p values that the function calculates differ from those i get using the wilcox.test or t.test function. i tried it with the golub data set (data(golub)) and wilcox.test(x=golub[,golub.cl==0],y=golub[,golub.cl==1]) returns a p-value = 0.8987 while teststat<-mt.teststat(golub,golub.cl,test="wilcoxon") teststat[1] = 1.75419 i thought that the mt.teststat funciton uses simple t tests or wilcox test to calculate p values... perhaps i am wrong with the class labels? from the help i understood, that if i have two groups in my samples i submit as classlabels for examples c(0,0,0,1,1,1) and this means that the first 3 columns of my data correspond to group 0 and the other 3 to group 1. is this correct? thanks, jo
multtest • 824 views
ADD COMMENTlink modified 14.4 years ago by kpollard110 • written 14.4 years ago by Dipl.-Ing. Johannes Rainer430
Answer: multtest problem...
0
gravatar for Sean Davis
14.4 years ago by
Sean Davis21k
United States
Sean Davis21k wrote:
On Apr 29, 2005, at 5:58 AM, Dipl.-Ing. Johannes Rainer wrote: > hi, > > i used the mt.teststat function from the package multtest and wonder > why the p values that the function calculates differ from those i get > using the wilcox.test or t.test function. > i tried it with the golub data set (data(golub)) and > wilcox.test(x=golub[,golub.cl==0],y=golub[,golub.cl==1]) > > returns a p-value = 0.8987 > > while teststat<-mt.teststat(golub,golub.cl,test="wilcoxon") > > teststat[1] = 1.75419 > > i thought that the mt.teststat funciton uses simple t tests or wilcox > test to calculate p values... > Looking at the help for mt.teststat (by typing help(mt.teststat)), you can see part way down: Value: For 'mt.teststat', a vector of test statistics for each row (gene). The test statistic is NOT a p-value. You have to convert it to a p-value using some other function (usually something like pnorm, pt, etc). Finally, you will need to apply mt.rawp2adjp to get corrected p-values. I would suggest reading the mt.teststat vignette for more information and worked examples, including how to calculate p-values from mt.teststat output. Hope this helps, Sean
ADD COMMENTlink written 14.4 years ago by Sean Davis21k
:) thanks for the help Quoting Sean Davis <sdavis2@mail.nih.gov>: > > On Apr 29, 2005, at 5:58 AM, Dipl.-Ing. Johannes Rainer wrote: > >> hi, >> >> i used the mt.teststat function from the package multtest and wonder >> why the p values that the function calculates differ from those i >> get using the wilcox.test or t.test function. >> i tried it with the golub data set (data(golub)) and >> wilcox.test(x=golub[,golub.cl==0],y=golub[,golub.cl==1]) >> >> returns a p-value = 0.8987 >> >> while teststat<-mt.teststat(golub,golub.cl,test="wilcoxon") >> >> teststat[1] = 1.75419 >> >> i thought that the mt.teststat funciton uses simple t tests or >> wilcox test to calculate p values... >> > > Looking at the help for mt.teststat (by typing help(mt.teststat)), > you can see part way down: > > Value: > > For 'mt.teststat', a vector of test statistics for each row > (gene). > > The test statistic is NOT a p-value. You have to convert it to a > p-value using some other function (usually something like pnorm, pt, > etc). Finally, you will need to apply mt.rawp2adjp to get corrected > p-values. I would suggest reading the mt.teststat vignette for more > information and worked examples, including how to calculate p-values > from mt.teststat output. > > Hope this helps, > Sean > >
ADD REPLYlink written 14.4 years ago by Dipl.-Ing. Johannes Rainer430
Answer: multtest problem...
0
gravatar for Wolfgang Huber
14.4 years ago by
EMBL European Molecular Biology Laboratory
Wolfgang Huber13k wrote:
Hi Jo, your mail is not very clear about the distinction between test statistics and p-value, and perhaps this is where your confusion comes from? There is no reason why the two numbers you state in your mail should be equal, they are different things. Have a look as well at the function rowttests in the develop version of "genefilter", for a fast and userfriendly approach to doing t-tests for each row of an exprSet. Best Wolfgang Dipl.-Ing. Johannes Rainer wrote: > hi, > > i used the mt.teststat function from the package multtest and wonder why > the p values that the function calculates differ from those i get using > the wilcox.test or t.test function. > i tried it with the golub data set (data(golub)) and > wilcox.test(x=golub[,golub.cl==0],y=golub[,golub.cl==1]) > > returns a p-value = 0.8987 > > while teststat<-mt.teststat(golub,golub.cl,test="wilcoxon") > > teststat[1] = 1.75419 > > i thought that the mt.teststat funciton uses simple t tests or wilcox > test to calculate p values... > perhaps i am wrong with the class labels? > > from the help i understood, that if i have two groups in my samples i > submit as classlabels for examples c(0,0,0,1,1,1) and this means that > the first 3 columns of my data correspond to group 0 and the other 3 to > group 1. is this correct? > > thanks, jo > > _______________________________________________ > Bioconductor mailing list > Bioconductor@stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/bioconductor -- Best regards Wolfgang ------------------------------------- Wolfgang Huber European Bioinformatics Institute European Molecular Biology Laboratory Cambridge CB10 1SD England Phone: +44 1223 494642 Fax: +44 1223 494486 Http: www.ebi.ac.uk/huber
ADD COMMENTlink written 14.4 years ago by Wolfgang Huber13k
Hi. I'm having problem with R and gcrma, i.e. using gcrma (12 arrays, hgu133a) R keep exiting with segmentation fault. Any hint? Info: Debian platform i386-pc-linux-gnu arch i386 os linux-gnu system i386, linux-gnu status major 2 minor 1.0 year 2005 month 04 day 18 language R Bioconductor devel
ADD REPLYlink written 14.4 years ago by stecalza@tiscali.it290
Hi, Could you give us more information about the version of gcrma you used? You can use sessionInfo() to get the information. Moreover, if the 12 arrays are available online, could you also provide the link for us to download the arrays and test it? I try to run 4 hgu133a arrays in gcrma, and there is no error occured. Here is what I did: #################################################################### > library(gcrma) Loading required package: affy Loading required package: Biobase Loading required package: tools Welcome to Bioconductor Vignettes contain introductory material. To view, simply type: openVignette() For details on reading vignettes, see the openVignette help page. Loading required package: reposTools Loading required package: matchprobes > data <- ReadAffy() > eset <- gcrma(data) > data AffyBatch object size of arrays=712x712 features (15851 kb) cdf=HG-U133A (22283 affyids) number of samples=4 number of genes=22283 annotation=hgu133a > eset Expression Set (exprSet) with 22283 genes 4 samples phenoData object with 1 variables and 4 cases varLabels sample: arbitrary numbering > sessionInfo() R version 2.1.0, 2005-04-20, x86_64-unknown-linux-gnu attached base packages: [1] "splines" "tools" "methods" "stats" "graphics" "grDevices" [7] "utils" "datasets" "base" other attached packages: hgu133aprobe hgu133acdf gcrma matchprobes affy reposTools "1.0" "1.4.3" "1.1.3" "1.0.12" "1.5.8-1" "1.5.2" Biobase "1.5.0" #################################################################### HTH, Ting-Yuan On Fri, 29 Apr 2005, Stefano Calza wrote: > Hi. > > I'm having problem with R and gcrma, i.e. using gcrma (12 arrays, hgu133a) R keep exiting with segmentation fault. > > Any hint? > > Info: > > Debian > platform i386-pc-linux-gnu > arch i386 > os linux-gnu > system i386, linux-gnu > status > major 2 > minor 1.0 > year 2005 > month 04 > day 18 > language R > > Bioconductor devel > > _______________________________________________ > Bioconductor mailing list > Bioconductor@stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/bioconductor >
ADD REPLYlink written 14.4 years ago by Ting-Yuan Liu FHCRC780
Dear bioconductor mailing list, I have some pairs of genes and I would like to measure their "distances" on the GO molecular function graph. So I wrote this lines of code: distances<-numeric(100) for (i in 1:100){ goanal<-simLL(as.character(hmapr$V1[i]),as.character(hmapr$V2[i]),"MF ") print(goanal$sim) distances[i]<-goanal$sim rm(goanal) } where hmapr$V1 and hmapr$V2 are the vectors in which I have the locus link identifiers of my genes. When I run the script, in a few time I obtain "Error: invalid subscript type". I noticed that there are problems with the pair of llID where the script stops. Is there a way to bypass the problem, to jump to the next pair of genes after putting NaN in the distances vector? Thanks a lot, Maria Maria Persico MINT database, Cesareni Group Universita' di Tor Vergata, via della Ricerca Scientifica 00133 Roma, Italy Tel: +39 0672594315 FAX: +39 0672594766 e-mail: maria@cbm.bio.uniroma2.it
ADD REPLYlink written 14.4 years ago by Maria Persico100
Hi, As Seth said you can use try (or some of the other features of the exception handling system). Would it be possible to supply me with an example where it fails - it might be better to fix the code so that it handles these cases more gracefully, thanks, Robert On Apr 30, 2005, at 4:14 AM, Maria Persico wrote: > Dear bioconductor mailing list, > > I have some pairs of genes and I would like to measure their > "distances" > on the GO molecular function graph. > > So I wrote this lines of code: > > distances<-numeric(100) > for (i in 1:100){ > > goanal<- > simLL(as.character(hmapr$V1[i]),as.character(hmapr$V2[i]),"MF") > print(goanal$sim) > distances[i]<-goanal$sim > rm(goanal) > } > > where hmapr$V1 and hmapr$V2 are the vectors in which I have the locus > link > identifiers of my genes. > > When I run the script, in a few time I obtain > "Error: invalid subscript type". > > I noticed that there are problems with the pair of llID where the > script > stops. > > Is there a way to bypass the problem, to jump to the next pair > of genes after putting NaN in the distances vector? > > Thanks a lot, > > Maria > > > > > > Maria Persico > MINT database, Cesareni Group > Universita' di Tor Vergata, via della Ricerca Scientifica > 00133 Roma, Italy > Tel: +39 0672594315 > FAX: +39 0672594766 > e-mail: maria@cbm.bio.uniroma2.it > > _______________________________________________ > Bioconductor mailing list > Bioconductor@stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/bioconductor > > +--------------------------------------------------------------------- -- ----------------+ | Robert Gentleman phone: (206) 667-7700 | | Head, Program in Computational Biology fax: (206) 667-1319 | | Division of Public Health Sciences office: M2-B865 | | Fred Hutchinson Cancer Research Center | | email: rgentlem@fhcrc.org | +--------------------------------------------------------------------- -- ----------------+
ADD REPLYlink written 14.4 years ago by rgentleman5.5k
Dear Seth, dear Robert, thanks for your answers. I have still problems because I couldn't reinitialize goanal(sorry, I'm a beginner): indeed.... > goanal [1] "Error: invalid subscript type\n" attr(,"class") [1] "try-error" > str(goanal) Class 'try-error' chr "Error: invalid subscript type " this is the code that doesn't work: > distances<-numeric(100) > for (i in 1:100){ + goanal <- try(simLL(as.character(hmapr$V1[i]),as.character(hmapr$V2[i]),"MF")) + if (inherits("gonal", "try-error")) { + goanal <- as.list() + #goanal[["sim"]] <- NA + #print(goanal$sim) + distances[i]<-NA#goanal$sim + + } + else{ + goanal<-simLL(as.character(hmapr$V1[i]),as.character(hmapr$V2[i])," MF") + print(goanal$sim) + distances[i]<-goanal$sim + + } + + } Some example where the function fails: (I'm working with human genes) The last pair for which I have a distance is simLL("2935","8894") The pair where simLL fails: > simLL("10554","10486") Error: invalid subscript type another examples of failures: > simLL("81608","55339") Error: invalid subscript type > simLL("50999","10972") Error: invalid subscript type > simLL("50999","10972") Error: invalid subscript type Thanks, Maria Maria Persico MINT database, Cesareni Group Universita' di Tor Vergata, via della Ricerca Scientifica 00133 Roma, Italy Tel: +39 0672594315 FAX: +39 0672594766 e-mail: maria@cbm.bio.uniroma2.it On Sat, 30 Apr 2005, Robert Gentleman wrote: > Hi, > As Seth said you can use try (or some of the other features of the > exception handling system). > Would it be possible to supply me with an example where it fails - it > might be better to fix the code so that it handles these cases more > gracefully, > > thanks, > Robert > > On Apr 30, 2005, at 4:14 AM, Maria Persico wrote: > > > Dear bioconductor mailing list, > > > > I have some pairs of genes and I would like to measure their > > "distances" > > on the GO molecular function graph. > > > > So I wrote this lines of code: > > > > distances<-numeric(100) > > for (i in 1:100){ > > > > goanal<- > > simLL(as.character(hmapr$V1[i]),as.character(hmapr$V2[i]),"MF") > > print(goanal$sim) > > distances[i]<-goanal$sim > > rm(goanal) > > } > > > > where hmapr$V1 and hmapr$V2 are the vectors in which I have the locus > > link > > identifiers of my genes. > > > > When I run the script, in a few time I obtain > > "Error: invalid subscript type". > > > > I noticed that there are problems with the pair of llID where the > > script > > stops. > > > > Is there a way to bypass the problem, to jump to the next pair > > of genes after putting NaN in the distances vector? > > > > Thanks a lot, > > > > Maria > > > > > > > > > > > > Maria Persico > > MINT database, Cesareni Group > > Universita' di Tor Vergata, via della Ricerca Scientifica > > 00133 Roma, Italy > > Tel: +39 0672594315 > > FAX: +39 0672594766 > > e-mail: maria@cbm.bio.uniroma2.it > > > > _______________________________________________ > > Bioconductor mailing list > > Bioconductor@stat.math.ethz.ch > > https://stat.ethz.ch/mailman/listinfo/bioconductor > > > > > +------------------------------------------------------------------- ---- > ----------------+ > | Robert Gentleman phone: (206) 667-7700 > | > | Head, Program in Computational Biology fax: (206) 667-1319 | > | Division of Public Health Sciences office: M2-B865 > | > | Fred Hutchinson Cancer Research Center > | > | email: rgentlem@fhcrc.org > | > +------------------------------------------------------------------- ---- > ----------------+ > >
ADD REPLYlink written 14.4 years ago by Maria Persico100
Answer: multtest problem...
0
gravatar for kpollard
14.4 years ago by
kpollard110
United States
kpollard110 wrote:
Jo, The statistic is the same between multtest and t.test (for t-tests) or wilcox.test (for wilcoxen tests), but the p-value is computed differently. The functions in t.test and wilcox.test use tabled distributions to get a p-value for each test. These assume a certain distribution for the test statistics and also treat each test independently. The multtest package provides methods that use the bootstrap (MTP function) or permutations to estimate the joint null distribution of the test statistics and then compute p-values from these. These are empirical null distributions which differ from those used in t.test and wilcox.test in two important ways: (i) they do not assume a particular parametric form for the test statistics distribution, and (ii) they account for correlations between test statistics by using the *joint* null distribution. For these reasons, the p-values computed can be very different from those using tabled, marginal distributions. Hope this helps, Katie > ------------------------------ > > Message: 18 > Date: Fri, 29 Apr 2005 11:58:26 +0200 > From: "Dipl.-Ing. Johannes Rainer" <johannes.rainer@tugraz.at> > Subject: [BioC] multtest problem... > To: "bioconductor@stat.math.ethz.ch" <bioconductor@stat.math.ethz.ch> > Message-ID: <20050429115826.v1sqjl58kk80c88g@webmail.tugraz.at> > Content-Type: text/plain; charset=ISO-8859-1; format="flowed" > > hi, > > i used the mt.teststat function from the package multtest and wonder > why the p values that the function calculates differ from those i get > using the wilcox.test or t.test function. > i tried it with the golub data set (data(golub)) and > wilcox.test(x=golub[,golub.cl==0],y=golub[,golub.cl==1]) > > returns a p-value = 0.8987 > > while teststat<-mt.teststat(golub,golub.cl,test="wilcoxon") > > teststat[1] = 1.75419 > > i thought that the mt.teststat funciton uses simple t tests or wilcox > test to calculate p values... > perhaps i am wrong with the class labels? > > from the help i understood, that if i have two groups in my samples i > submit as classlabels for examples c(0,0,0,1,1,1) and this means that > the first 3 columns of my data correspond to group 0 and the other 3 to > group 1. is this correct? > > thanks, jo
ADD COMMENTlink written 14.4 years ago by kpollard110
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 315 users visited in the last hour