Hi,
I have been using cameraPR, the pre-ranked version of camera, with a custom ranking statistic for my genes.
If I use the function as is, no genes are matched to any gene set:
cameraPR(statistic=the_stat,index=Hs.c5.gene_name)[1:3,]
                                              NGenes Direction PValue FDR
GO_REGULATION_OF_DOPAMINE_METABOLIC_PROCESS        0        Up    NaN NaN
GO_LACTATE_TRANSPORT                               0        Up    NaN NaN
GO_POSITIVE_REGULATION_OF_VIRAL_TRANSCRIPTION      0        Up    NaN NaN
I think this might be because the gene names of the statistic are stripped away inside the function by as.numeric(). A simple fix: if I make the following change to cameraPR.default, it appears to work:
# Check statistic
if(anyNA(statistic)) stop("NA values for statistic now allowed")
Stat <- as.numeric(statistic)
G <- length(Stat)
# ID <- names(Stat)      # original line: Stat has already lost its names
ID <- names(statistic)   # use statistic instead, which still has names
if(G<3) stop("Two few genes in dataset: need at least 3")
Output after change:
cameraPR.default.edit(statistic=the_stat,index=Hs.c5.gene_name)[1:3,]
                               NGenes Direction       PValue          FDR
GO_STEROL_BIOSYNTHETIC_PROCESS     32        Up 2.850148e-18 1.754266e-14
GO_MICROGLIAL_CELL_ACTIVATION       4        Up 3.279112e-13 1.009147e-09
GO_CYTOSOLIC_RIBOSOME              96        Up 8.743939e-12 1.793965e-08
So I think it is a bug, but if it is a case of user error please let me know! For now I've redefined the function locally as a workaround.
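To illustrate what I suspect is going on, here is a minimal base R sketch. The gene names and values are made up, and it only mimics the name handling I described; it is not the actual cameraPR code.

```r
# Toy named statistic (made-up values, hypothetical gene names)
toy_stat <- c(GENE_A = 1.5, GENE_B = 0, GENE_C = 9.4)

# as.numeric() returns a bare numeric vector: the names attribute is dropped
stripped <- as.numeric(toy_stat)
names(toy_stat)   # the three gene names
names(stripped)   # NULL

# Matching a character gene set against the stripped names finds nothing,
# which would correspond to NGenes = 0 for every set
toy_set <- c("GENE_A", "GENE_C")
n_before <- sum(names(stripped) %in% toy_set)   # names taken after as.numeric()
n_after  <- sum(names(toy_stat) %in% toy_set)   # names taken from the original vector
c(before = n_before, after = n_after)
```

If names are looked up on the stripped vector, no gene in the set can be found, which would be consistent with the NaN p-values above.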
Also, please advise if there is somewhere else I should be reporting this. Where is the primary repo or bug tracker (SVN? GitHub? here?)
Thanks,
Sarah.
-----------------------------------------------------------------------------------------------------------
# Input format is like:
#(NB: Ignore the 0s, I'm still figuring out the statistic)
the_stat[1:10]
   IGFBP3      RERG     STMN4    IGSF21 LINC00890     CDH11      GNG3    INSIG1       JUN      SGK1
 0.000000  0.000000  9.443428  9.031877  0.000000  0.000000  8.835153  8.420033  8.168065  0.000000
head(Hs.c5.gene_name)
$GO_REGULATION_OF_DOPAMINE_METABOLIC_PROCESS
 [1] "ABAT"   "CHRNB2" "COMT"   "DRD1"   "DRD4"   "GPR37"  "HPRT1"  "HTR1A"  "MAOB"   "NR4A2"  "PARK2"  "PARK7"
[13] "PDE1B"  "PNKD"   "SLC6A3" "SNCA"   "TACR3"

$GO_LACTATE_TRANSPORT
 [1] "EMB"      "SLC16A1"  "SLC16A11" "SLC16A12" "SLC16A13" "SLC16A3"  "SLC16A4"  "SLC16A5"  "SLC16A6"  "SLC16A7"
[11] "SLC16A8"  "SLC5A12"
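With inputs shaped like these, another workaround I could try instead of editing the function is to convert each gene set from names to integer positions in the statistic vector before calling cameraPR. A rough base R sketch with made-up toy data (toy_stat, toy_sets and toy_index are hypothetical names, not my real objects):

```r
# Toy inputs in the same shape as the_stat and Hs.c5.gene_name (made-up values)
toy_stat <- c(IGFBP3 = 0, STMN4 = 9.44, GNG3 = 8.84, INSIG1 = 8.42, JUN = 8.17)
toy_sets <- list(
  SET_ONE = c("STMN4", "JUN", "NOT_MEASURED"),
  SET_TWO = c("IGFBP3", "INSIG1")
)

# Map each set of gene names to positions in the statistic vector;
# names that are absent from the statistic are silently dropped
toy_index <- lapply(toy_sets, function(genes) which(names(toy_stat) %in% genes))
toy_index
```

I believe limma's ids2indices(Hs.c5.gene_name, names(the_stat)) does the same mapping, and I would expect cameraPR to accept the resulting integer index list, but I have not confirmed that on my data.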
> sessionInfo()
R version 3.4.1 (2017-06-30)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.2 LTS

Matrix products: default
BLAS: /mnt/software/apps/R/3.4.1/lib/R/lib/libRblas.so
LAPACK: /mnt/software/apps/R/3.4.1/lib/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_AU.UTF-8       LC_NUMERIC=C               LC_TIME=en_AU.UTF-8        LC_COLLATE=en_AU.UTF-8
 [5] LC_MONETARY=en_AU.UTF-8    LC_MESSAGES=en_AU.UTF-8    LC_PAPER=en_AU.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_AU.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
 [1] gplots_3.0.1              reshape2_1.4.2            EnsDb.Hsapiens.v79_2.1.0  BiocInstaller_1.26.1
 [5] EnsDb.Mmusculus.v79_2.1.0 ensembldb_2.0.4           AnnotationFilter_1.0.0    GenomicFeatures_1.28.4
 [9] AnnotationDbi_1.38.1      Biobase_2.36.2            GenomicRanges_1.28.3      GenomeInfoDb_1.12.2
[13] IRanges_2.10.2            S4Vectors_0.14.3          BiocGenerics_0.22.0       edgeR_3.18.1
[17] limma_3.32.5

loaded via a namespace (and not attached):
 [1] SummarizedExperiment_1.6.3    gtools_3.5.0                  locfit_1.5-9.1
 [4] lattice_0.20-35               htmltools_0.3.6               rtracklayer_1.36.4
 [7] yaml_2.1.14                   interactiveDisplayBase_1.14.0 blob_1.1.0
[10] XML_3.98-1.9                  rlang_0.1.1                   DBI_0.7
[13] BiocParallel_1.10.1           bit64_0.9-7                   plyr_1.8.4
[16] matrixStats_0.52.2            GenomeInfoDbData_0.99.0       stringr_1.2.0
[19] zlibbioc_1.22.0               ProtGenerics_1.8.0            Biostrings_2.44.2
[22] caTools_1.17.1                memoise_1.1.0                 biomaRt_2.32.1
[25] httpuv_1.3.5                  curl_2.7                      Rcpp_0.12.11
[28] KernSmooth_2.23-15            xtable_1.8-2                  gdata_2.18.0
[31] DelayedArray_0.2.7            XVector_0.16.0                mime_0.5
[34] bit_1.1-12                    Rsamtools_1.28.0              AnnotationHub_2.8.2
[37] digest_0.6.12                 stringi_1.1.5                 shiny_1.0.3
[40] grid_3.4.1                    tools_3.4.1                   bitops_1.0-6
[43] magrittr_1.5                  RCurl_1.95-4.8                lazyeval_0.2.0
[46] tibble_1.3.3                  RSQLite_2.0                   pkgconfig_2.0.1
[49] Matrix_1.2-10                 httr_1.2.1                    R6_2.2.2
[52] GenomicAlignments_1.12.1      compiler_3.4.1