Question: Error when trying to run HTSFilter
3.0 years ago, snamjoshi8730 wrote:

I have normalized data that I am trying to filter with HTSFilter. Here is what I tried to run:

filter <- HTSFilter(x = nc, conds = conds, normalization = "none")

nc is a 14543 x 16 data frame with genes as row names and library names as column names. Each column of the data frame is a numeric vector.

conds is a character vector of length 16 that is identical to the column names of the data frame. Some names are duplicated, which indicates replicate libraries.

The error I get upon running is:

Error in 1:(nbindiv) : argument of length 0

I looked at the Sultan data in the vignette and couldn't really see how my inputs to the function differed, other than the fact that my values are continuous and the Sultan data are discrete. The function runs fine on the Sultan data.

Looking through the code on GitHub, it looks like the nbindiv variable somehow gets assigned numeric(0), but I couldn't quite figure out why this is happening.

Session info:

R version 3.3.1 (2016-06-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.1 LTS

locale:
[1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8
[4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods
[9] base

other attached packages:
[1] DESeq2_1.12.4              SummarizedExperiment_1.2.3 GenomicRanges_1.24.3
[4] GenomeInfoDb_1.8.7         magrittr_1.5               HTSFilter_1.12.0
[7] RDAVIDWebService_1.10.0    GOstats_2.38.1             Category_2.38.0
[10] Matrix_1.2-7.1             AnnotationDbi_1.34.4       IRanges_2.6.1
[13] S4Vectors_0.10.3           Biobase_2.32.0             graph_1.50.0
[16] BiocGenerics_0.18.0        ggplot2_2.1.0

loaded via a namespace (and not attached):
[1] Rcpp_0.12.7            XVector_0.12.1         RColorBrewer_1.1-2
[4] plyr_1.8.4             zlibbioc_1.18.0        bitops_1.0-6
[7] tools_3.3.1            rpart_4.1-10           annotate_1.50.1
[10] RSQLite_1.0.0          gtable_0.2.0           lattice_0.20-34
[13] DBI_0.5-1              gridExtra_2.2.1        rJava_0.9-8
[16] DESeq_1.24.0           cluster_2.0.5          genefilter_1.54.2
[19] locfit_1.5-9.1         nnet_7.3-12            grid_3.3.1
[22] data.table_1.9.6       GSEABase_1.34.1        BiocParallel_1.6.6
[25] XML_3.98-1.4           RBGL_1.48.1            survival_2.39-5
[28] foreign_0.8-67         latticeExtra_0.6-28    Formula_1.2-1
[31] limma_3.28.21          GO.db_3.3.0            geneplotter_1.50.0
[34] edgeR_3.14.0           Hmisc_3.17-4           scales_0.4.0
[37] splines_3.3.1          AnnotationForge_1.14.2 xtable_1.8-2
[40] colorspace_1.2-6       acepack_1.3-3.3        RCurl_1.95-4.8
[43] munsell_0.4.3          chron_2.3-47         


Hi and thanks for your question. I think I'll need some additional information, as I'm not currently able to replicate your error. In particular, could you send me a minimal reproducible example that throws the same error that you're seeing so that I can work off that? Also, what version of HTSFilter are you currently using? The nbindiv argument is just defined as the number of columns in the (log normalized) data within the (unexported) .perConditionSimilaryIndex function, so it is indeed a weird error message.
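
For reference, one general way a column count can come back empty in base R is when a single-column subset of a data frame silently drops to a plain vector. The sketch below only illustrates that base R behaviour; it is not a reading of the HTSFilter source, and the names toy, two_cols and one_col are made up for the example.

```r
# Toy data frame with duplicated column names; all values are made up.
toy <- data.frame(A = c(1.5, 2.0, 0.3),
                  A = c(1.7, 2.2, 0.1),
                  B = c(0.9, 3.1, 0.4),
                  check.names = FALSE)
conds <- names(toy)                       # "A" "A" "B"

two_cols <- toy[, conds == "A"]           # two columns: still a data frame
one_col  <- toy[, conds == "B"]           # one column: dropped to a numeric vector

print(ncol(two_cols))                     # a column count
print(ncol(one_col))                      # NULL, because vectors have no columns

# Using that NULL as the upper bound of a sequence raises an error.
msg <- tryCatch(1:ncol(one_col), error = function(e) conditionMessage(e))
print(msg)

# drop = FALSE keeps the data frame structure even for a single column.
kept <- toy[, conds == "B", drop = FALSE]
print(ncol(kept))
```

If something similar happens inside a per-condition loop, a condition with a single library would be enough to trigger an "argument of length 0" error, but that is an assumption about the cause rather than a confirmed diagnosis.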

I am using HTSFilter 1.12.0 (see edit below). I will update my post with the session info. Here is a modified version of the data that still throws the error on my system: http://pastebin.com/p2ZXAZVY

Using that data, I get the error when I run the following: filter <- HTSFilter(x = dat, conds = names(dat), normalization = "none")

Edit: I just upgraded Bioconductor and I am now running HTSFilter 1.14.0. The error still occurs.

I figured out the issue. One of my conditions has only a single library (no replicates). When I remove this column from the data, HTSFilter is able to proceed. So I guess the question is: if I really need that library in my expression matrix, what can I do?
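
A quick way to check for this situation before calling HTSFilter is to count the libraries per condition. This is a minimal sketch on made-up data; dat, the condition labels and the names dat_rep and conds_rep are hypothetical stand-ins for the real expression matrix and its labels.

```r
# Toy stand-in for the expression data: 5 genes x 8 libraries, made-up values.
set.seed(1)
dat <- as.data.frame(matrix(rexp(40), nrow = 5))
names(dat) <- c("ctl", "ctl", "trtA", "trtA", "trtA", "trtB", "trtB", "solo")
conds <- names(dat)

# Number of libraries per condition
reps <- table(conds)
print(reps)

# Conditions represented by a single library
singletons <- names(reps)[reps < 2]
print(singletons)

# Keep only the libraries whose condition is replicated.
# drop = FALSE guards against the result collapsing to a vector.
keep      <- !(conds %in% singletons)
dat_rep   <- dat[, keep, drop = FALSE]
conds_rep <- conds[keep]
```

Note that subsetting a data frame with duplicated column names makes the names unique (for example ctl and ctl.1), so conds_rep, not names(dat_rep), is the vector to pass as conds afterwards.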

Answer: Error when trying to run HTSFilter
3.0 years ago, andrea.rau60 (INRA / Jouy en Josas, France) wrote:
If you have one condition that has a single replicate, I would think there are two options you could try:

1) For the purposes of filtering via HTSFilter, you could group that singleton sample with the replicates from the closest/most similar condition (i.e., only in the filtering step, not in the actual differential analysis).

2) You could simply use an alternative filter. For example, you could remove genes whose mean normalized count across all samples is below a specified value (easily done with a call to the HTSBasicFilter function in the HTSFilter package), or use the independent filters proposed in the edgeR pipeline (a minimum CPM value in at least a given number of samples; see page 11 of their vignette) or in the DESeq2 pipeline (see the independentFiltering argument of their results function).
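
Both options can be sketched in base R. The code below is illustrative only: the data are simulated, the choice of correlation on log2(x + 1) values as the similarity measure and the cutoff of 10 are assumptions made for the example, and the mean filter is written by hand rather than through HTSBasicFilter.

```r
set.seed(42)

# Simulated normalized expression values: 200 genes, 7 libraries.
base_A <- rexp(200, rate = 1 / 50)
base_B <- rexp(200, rate = 1 / 50)
noise  <- function(mu) mu * exp(rnorm(length(mu), sd = 0.1))
x <- cbind(noise(base_A), noise(base_A), noise(base_A),
           noise(base_B), noise(base_B), noise(base_B),
           noise(base_B))   # last library is simulated to resemble condition B
conds <- c("A", "A", "A", "B", "B", "B", "solo")

# Option 1: for the filtering step only, relabel the singleton library
# with the condition whose mean profile it correlates with best.
lx     <- log2(x + 1)
solo   <- which(conds == "solo")
others <- setdiff(unique(conds), "solo")
cond_means <- sapply(others, function(cn) {
  rowMeans(lx[, conds == cn, drop = FALSE])
})
sim     <- cor(lx[, solo], cond_means)   # one correlation per condition
nearest <- others[which.max(sim)]
conds_filter       <- conds              # labels used for filtering only
conds_filter[solo] <- nearest
print(conds_filter)

# Option 2: a plain mean-based filter that needs no replicate information.
cutoff     <- 10                         # arbitrary illustrative threshold
keep_genes <- rowMeans(x) >= cutoff
x_filtered <- x[keep_genes, , drop = FALSE]
print(dim(x_filtered))
```

With option 1, conds_filter would be supplied as conds in the HTSFilter call, while the original conds vector is kept for the differential analysis itself. With option 2, the threshold has to be chosen by the user, which is the main trade-off compared with the data-driven threshold that HTSFilter computes.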