I have normalized data that I am trying to filter with HTSFilter. Here is what I tried to run:
filter <- HTSFilter(x = nc, conds = conds, normalization = "none")
nc is a 14543 x 16 data frame with genes as row names and library names as column names. Each column of the data frame is a numeric vector.
conds is a character vector of length 16 that is identical to the column names of the data frame. Some names are duplicated, which indicates replicate libraries.
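A quick way to sanity-check inputs of this shape before filtering is sketched below in base R. The toy values, the condition labels "A", "B", "C", and the names reps and singletons are assumptions made up for illustration; they are not the real data.

```r
# Toy stand-in for the real inputs (hypothetical values and labels).
set.seed(1)
conds <- c("A", "A", "B", "B", "C")
nc <- as.data.frame(matrix(runif(20 * length(conds), 0, 100), nrow = 20))
names(nc) <- conds  # duplicated column names mark replicate libraries
rownames(nc) <- paste0("gene", seq_len(nrow(nc)))

# One condition label per column, and every column numeric.
stopifnot(length(conds) == ncol(nc))
stopifnot(all(vapply(nc, is.numeric, logical(1))))

# Number of replicate libraries per condition.
reps <- table(conds)
print(reps)

# Conditions represented by a single library.
singletons <- names(reps)[reps == 1]
print(singletons)
```

Tabulating conds this way shows at a glance how many libraries each condition contributes.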
The error I get upon running is:
Error in 1:(nbindiv) : argument of length 0
I looked at the Sultan data in the vignette and couldn't really see how my inputs to the function differed other than the fact my values are continuous and the Sultan data is discrete. The function runs fine on the Sultan data.
Looking through the code on GitHub, it looks like the nbindiv variable somehow gets assigned numeric(0) somewhere, but I couldn't figure out why this is happening.
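For reference, this is the message base R gives whenever one side of the : operator has length zero. The sketch below shows one common way that can happen, namely a single-column subset collapsing to a plain vector. It uses base R only; the matrix m and the local nbindiv are stand-ins, and nothing here is taken from HTSFilter's own code, so treat it as one plausible route rather than a diagnosis.

```r
# Base R only; nbindiv is a local stand-in, not HTSFilter's internal variable.
m <- matrix(1:6, nrow = 3, dimnames = list(NULL, c("A", "A", "B")))

two_cols <- m[, colnames(m) == "A"]  # two columns selected: still a matrix
one_col  <- m[, colnames(m) == "B"]  # one column selected: dropped to a vector

print(ncol(two_cols))  # 2
print(ncol(one_col))   # NULL, because a plain vector has no dim attribute

# Building a sequence up to a zero-length value raises the error.
nbindiv <- ncol(one_col)
msg <- tryCatch(1:(nbindiv), error = function(e) conditionMessage(e))
print(msg)

# drop = FALSE keeps the matrix shape, so ncol() returns 1 instead of NULL.
print(ncol(m[, colnames(m) == "B", drop = FALSE]))
```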
Session info:
R version 3.3.1 (2016-06-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.1 LTS

locale:
 [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8
 [4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets methods
[9] base

other attached packages:
 [1] DESeq2_1.12.4 SummarizedExperiment_1.2.3 GenomicRanges_1.24.3
 [4] GenomeInfoDb_1.8.7 magrittr_1.5 HTSFilter_1.12.0
 [7] RDAVIDWebService_1.10.0 GOstats_2.38.1 Category_2.38.0
[10] Matrix_1.2-7.1 AnnotationDbi_1.34.4 IRanges_2.6.1
[13] S4Vectors_0.10.3 Biobase_2.32.0 graph_1.50.0
[16] BiocGenerics_0.18.0 ggplot2_2.1.0

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.7 XVector_0.12.1 RColorBrewer_1.1-2
 [4] plyr_1.8.4 zlibbioc_1.18.0 bitops_1.0-6
 [7] tools_3.3.1 rpart_4.1-10 annotate_1.50.1
[10] RSQLite_1.0.0 gtable_0.2.0 lattice_0.20-34
[13] DBI_0.5-1 gridExtra_2.2.1 rJava_0.9-8
[16] DESeq_1.24.0 cluster_2.0.5 genefilter_1.54.2
[19] locfit_1.5-9.1 nnet_7.3-12 grid_3.3.1
[22] data.table_1.9.6 GSEABase_1.34.1 BiocParallel_1.6.6
[25] XML_3.98-1.4 RBGL_1.48.1 survival_2.39-5
[28] foreign_0.8-67 latticeExtra_0.6-28 Formula_1.2-1
[31] limma_3.28.21 GO.db_3.3.0 geneplotter_1.50.0
[34] edgeR_3.14.0 Hmisc_3.17-4 scales_0.4.0
[37] splines_3.3.1 AnnotationForge_1.14.2 xtable_1.8-2
[40] colorspace_1.2-6 acepack_1.3-3.3 RCurl_1.95-4.8
[43] munsell_0.4.3 chron_2.3-47
Hi, and thanks for your question. I think I'll need some additional information, as I'm not currently able to replicate your error. In particular, could you send me a minimal reproducible example that throws the same error you're seeing, so that I can work from that? Also, what version of HTSFilter are you currently using? The nbindiv argument is just defined as the number of columns in the (log normalized) data within the (unexported) .perConditionSimilaryIndex function, so it is indeed a weird error message.
I am using HTSFilter 1.12.0 (see edit below). I will update my post with the session info. Here is a modified version of the data that still throws the error on my system: http://pastebin.com/p2ZXAZVY
Using that data, I get the error when I run the following:
filter <- HTSFilter(x = dat, conds = names(dat), normalization = "none")
Edit: I just upgraded Bioconductor and I am now running HTSFilter 1.14.0. The error still occurs.
I figured out the issue. One of my conditions has only one replicate, i.e. a single library. When I remove that column from the data, the function runs. So I guess the question is: if I really need that library in my expression matrix, what can I do?
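One possible way to keep the unreplicated library around is to set it aside, filter only the replicated conditions, and then rejoin it on the retained genes. The sketch below shows just the bookkeeping in base R. The toy matrix, the labels, and the placeholder kept_genes are all hypothetical, and whether a threshold learned without that library is appropriate for it is a separate question this sketch does not answer.

```r
# Toy data (hypothetical): condition "C" has a single library.
# Stored as a matrix here for simplicity.
set.seed(1)
conds <- c("A", "A", "B", "B", "C")
dat <- matrix(runif(10 * length(conds), 0, 100), nrow = 10,
              dimnames = list(paste0("gene", 1:10), conds))

# Flag columns whose condition has at least two replicate libraries.
reps <- table(conds)
replicated <- conds %in% names(reps)[reps >= 2]

# drop = FALSE keeps the matrix shape even when one column is selected.
dat_rep    <- dat[, replicated, drop = FALSE]
conds_rep  <- conds[replicated]
dat_single <- dat[, !replicated, drop = FALSE]

# The call from the thread would then be run on the replicated subset only:
# filter <- HTSFilter(x = dat_rep, conds = conds_rep, normalization = "none")

# Hypothetical follow-up: given the gene names retained by the filter,
# subset the held-out library to the same rows and rejoin it.
kept_genes <- rownames(dat_rep)[1:5]  # placeholder, not a real filter result
combined <- cbind(dat_rep[kept_genes, , drop = FALSE],
                  dat_single[kept_genes, , drop = FALSE])
print(dim(combined))
```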