Hi, I am trying to use SAMseq() to analyze my RNA-seq experiment (20000 genes x 550 samples) with a survival endpoint. It quickly gives the following error:
> library(samr)
Loading required package: impute
Loading required package: matrixStats
Attaching package: ‘matrixStats’
The following objects are masked from ‘package:Biobase’:
anyMissing, rowMedians
Warning messages:
1: package ‘samr’ was built under R version 3.3.3
2: package ‘matrixStats’ was built under R version 3.3.3
> samfit<-SAMseq(data, PFI.time,censoring.status=PFI.status, resp.type="Survival")
Estimating sequencing depths...
Error in quantile.default(prop, c(0.25, 0.75)) :
missing values and NaN's not allowed if 'na.rm' is FALSE
In addition: Warning message:
In sum(x) : integer overflow - use sum(as.numeric(.))
Error during wrapup: cannot open the connection
I checked: my data matrix and y variables have no missing values. Does anyone have suggestions as to what's going on?
Thank you!
John
Here is the output of biocValid() and sessionInfo():
> biocValid()
[1] TRUE
> sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices datasets utils methods base
other attached packages:
[1] samr_2.0 matrixStats_0.52.2 impute_1.48.0 BiocInstaller_1.24.0 rcom_3.1-3 rscproxy_2.1-1
loaded via a namespace (and not attached):
[1] tools_3.3.2
Hello array chip!
It appears that your post has been cross-posted to another site: Asked elsewhere https://stat.ethz.ch/pipermail/r-help/2017-November/450287.html
This is typically not recommended as it runs the risk of annoying people in both communities.
Sorry, I realized that the samr package belongs to the R community rather than Bioconductor, so I posted the question there as well. Apologies if this annoys anyone. I would really appreciate a response from anyone who knows what's going on with the error.
Someone suggested that this question is better answered on Bioconductor than in the R community, so here is more information about what I have tried:
I did not do any pre-processing (normalization/transformation) and passed raw counts to SAMseq(). I tried reducing the size of the input data. SAMseq() runs fine with the first 1457 genes but fails when the first 1458 genes are used. However, I found nothing wrong with the raw counts of the 1458th gene. When I run SAMseq() on the next 1268 genes separately, it also works fine, so the 1458th gene itself is not the problem. SAMseq() failed again when I ran the next 1269 genes separately.
At this point, I think something internal to SAMseq() prevents it from working on large input matrices. I just noticed that its last release was in 2011.
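One thing that may be worth checking is the "integer overflow - use sum(as.numeric(.))" warning in the log above. If the count matrix is stored as integers (an assumption, since the post does not show the storage mode of `data`), summing the whole matrix can exceed R's integer limit and return NA. That would fit a failure that depends on how many genes are included rather than on any single gene. Below is a minimal sketch of that behavior using a made-up toy matrix called `counts`. It does not call SAMseq() and is not a confirmed diagnosis.

```r
# Toy integer count matrix; the values are invented for illustration only.
counts <- matrix(c(2000000000L, 100000000L, 100000000L, 5L), nrow = 2)

is.integer(counts)        # TRUE: the matrix uses integer storage
.Machine$integer.max      # 2147483647, the largest representable integer

# Summing integers past that limit yields NA plus the overflow warning
# (the warning is suppressed here so the sketch runs quietly).
total_int <- suppressWarnings(sum(counts))
is.na(total_int)          # TRUE

# Summing as doubles gives the real total.
total_dbl <- sum(as.numeric(counts))
total_dbl                 # 2200000005

# Possible workaround (an assumption, not verified against samr):
# convert the storage mode to double before passing the matrix on.
storage.mode(counts) <- "double"
sum(counts)               # 2200000005, no overflow
```

On the real data, comparing `sum(as.numeric(data))` with `.Machine$integer.max` would show whether the total count crosses the limit. If it does, setting `storage.mode(data) <- "double"` before calling SAMseq() might be worth a try.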