warning message in KS test
@james-anderson-1641
Last seen 9.6 years ago
An embedded and charset-unspecified text was scrubbed... Name: not available Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20071018/ 4dee2314/attachment.pl
Francois Pepin ★ 1.3k
@francois-pepin-1012
Last seen 9.6 years ago
Hi James,

Please note that there is nothing Bioconductor-specific in this question. You might want to use the R-help list instead next time. You are more likely to get a useful answer there, and it reduces the clutter on this list.

That said, a warning is not an error; you should still get a result. The warning simply means that the KS test cannot give an exact p-value when your data contain identical values (ties). You still get an approximate p-value, which I have found to be usable in practice.

Francois

On Thu, 2007-10-18 at 15:37 -0700, James Anderson wrote:
> Hi,
>
> I am trying to use ks test to do gene selection for microarray data.
> Suppose A is my matrix with 5000 genes and 60 samples, the first 30 samples are in class 0, the other 30 samples are in class 1.
>
> I use the following command to calculate the pvalue for each gene.
>
> index_class0 = which(ClassLabel == 0)
> index_class1 = which(ClassLabel == 1)
> PvalsAll <- apply(A, 1, function(x) ks.test(x[index_class0], x[index_class1],
>                   exact = NULL)$p.value)
>
> I got some warning message like:
>
> In ks.test(x[index_class0], x[index_class1], exact = NULL) :
>   cannot compute correct p-values with ties
>
> but with the same data, I can get some values using matlab statistics toolbox, does anyone have similar experience? How to fix this problem?
>
> Many thanks,
>
> James
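A minimal sketch of the point above, that a warning does not stop `ks.test()` from returning a p-value. The vectors `x` and `y` are hypothetical toy data, rounded so that ties occur; they are not the poster's data. Whether the warning appears at all may depend on the R version, so the sketch records warnings instead of assuming one.

```r
# Hypothetical toy data: two groups of 30 values, rounded so that ties occur.
set.seed(1)
x <- round(rnorm(30, mean = 0), 1)
y <- round(rnorm(30, mean = 1), 1)

# Record any warnings instead of printing them, and keep the test result.
warnings_seen <- character(0)
res <- withCallingHandlers(
  ks.test(x, y, exact = NULL),
  warning = function(w) {
    warnings_seen <<- c(warnings_seen, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

res$p.value     # a numeric p-value is returned with or without a warning
warnings_seen   # may be empty on R versions that handle ties differently
```

If `warnings_seen` is non-empty, the p-value was still computed; the message only describes how it was computed.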
An embedded and charset-unspecified text was scrubbed... Name: not available Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20071018/ 442c706f/attachment.pl
Hi James,

I did not say that your question shouldn't be asked. It is valid, and I would certainly want to know if it happened to me. I simply meant that the R-help list could be a more appropriate place to ask.

I doubt the warning is linked to your getting NAs (which you had not mentioned in your previous e-mail). That warning is well documented in the help page for ks.test.

I don't know why you are getting NAs as your answers. I cannot reproduce it with the most recent R version, even if I include NA or NULL values. If you post the exact values (or a small reproducible example) for which this happens, we are more likely to be able to help. Also, please include the output from sessionInfo().

Francois

On Thu, 2007-10-18 at 18:52 -0700, James Anderson wrote:
> Francois,
>
> I would not post on Bioconductor list if I can still get some answer with the warning, I found a bunch of p values are shown as NA, but I can get values out using matlab, so I am wondering what is the EXACT reason why it shows as NA.
> I am sorry for being off-topic on Bioconductor, it would be great if you can tell me the R help email list. I just don't know this, right now Bioconductor list is the only list I am seeking help, thanks.
>
> James
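One way to narrow this down, sketched under assumptions: the matrix `A` and the vector `ClassLabel` below are hypothetical stand-ins (50 genes by 60 samples) for the poster's data, which was never shared. The idea is to compute the per-gene p-values, list the rows that come back as NA, and attach the session details that were requested.

```r
# Hypothetical stand-in for the poster's data: 50 genes x 60 samples.
set.seed(42)
A <- matrix(round(rnorm(50 * 60), 1), nrow = 50)
ClassLabel <- rep(c(0, 1), each = 30)

index_class0 <- which(ClassLabel == 0)
index_class1 <- which(ClassLabel == 1)

# One p-value per gene (row). Warnings about ties are silenced here only
# to keep the output readable.
PvalsAll <- suppressWarnings(apply(A, 1, function(x) {
  ks.test(x[index_class0], x[index_class1])$p.value
}))

# Rows whose p-value is NA are the ones worth posting to the list.
na_rows <- which(is.na(PvalsAll))
length(na_rows)

# Version and package details to include alongside the example.
sessionInfo()
```

Checking `na_rows` on the full p-value vector, before any sorting or subsetting, helps separate NAs produced by the test itself from NAs introduced in a later selection step.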
An embedded and charset-unspecified text was scrubbed... Name: not available Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20071019/ 5d0e1be2/attachment.pl
Can you find just ONE instance (i.e. one gene) that shows this behaviour and email us the dput() output for that vector, please? It is very hard to understand what you have shown us without an example.

Thank you.

Regards,
Adai

James Anderson wrote:
> Francois,
>
> What I wanted to do is to select top 20 genes from a microarray data using kstest, I have tried t.test and wilcox.test, they both work perfectly fine. When using kstest, it output the first 6 genes without problem, but the other 14 genes are output with NA. I am using the most recent version of R (2.6.0 10/03/07), below is the output, those number are the gene indices and their corresponding p values.
>
> Thanks,
>
> james
>
> 493 1772 249 1671 245 780 NA NA NA NA NA NA NA NA NA NA NA NA NA NA
>
> 1.053130e-06 2.615535e-06 4.426209e-06 5.57936e-06 1.280431e-05 1.874259e-05 NA NA NA NA NA NA NA NA NA NA NA NA NA NA
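For readers unfamiliar with `dput()`, a short sketch of what is being asked for. The vector `gene_values` is hypothetical; in practice it would be one row of the expression matrix for a gene that yields NA.

```r
# Hypothetical expression values for one gene across 60 samples.
set.seed(7)
gene_values <- round(rnorm(60), 2)

# dput() prints R code that recreates the object, so it can be pasted
# into an e-mail and run by someone else.
dput(gene_values)

# Round trip: the printed text can be parsed back into an equal vector.
txt <- capture.output(dput(gene_values))
restored <- eval(parse(text = txt))
all.equal(restored, gene_values)
```

Posting the `dput()` text rather than a printed table avoids ambiguity about formatting, missing values, and data types.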
An embedded and charset-unspecified text was scrubbed... Name: not available Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20071019/ 1f4a508b/attachment.pl