Error when trying to run HTSFilter
snamjoshi87 ▴ 40
@snamjoshi87-11184
Last seen 7.1 years ago

I have normalized data that I am trying to filter with HTSFilter. Here is what I tried to run:

filter <- HTSFilter(x = nc, conds = conds, normalization = "none")

nc is a 14543 x 16 data frame with genes as row names and library names as column names. Each column is a numeric vector.

conds is a character vector of length 16 that is identical to the column names of the data frame. Some names are duplicated, which indicates replicate libraries.

The error I get upon running is:

Error in 1:(nbindiv) : argument of length 0

I looked at the Sultan data in the vignette and couldn't see how my inputs to the function differ, other than the fact that my values are continuous and the Sultan data are discrete. The function runs fine on the Sultan data.

Looking through the code on GitHub, it looks like the nbindiv variable gets assigned numeric(0) somewhere, but I couldn't figure out why this is happening.

Session info:

R version 3.3.1 (2016-06-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.1 LTS

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8      
 [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8  
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C             
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C      

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods 
[9] base    

other attached packages:
 [1] DESeq2_1.12.4              SummarizedExperiment_1.2.3 GenomicRanges_1.24.3     
 [4] GenomeInfoDb_1.8.7         magrittr_1.5               HTSFilter_1.12.0         
 [7] RDAVIDWebService_1.10.0    GOstats_2.38.1             Category_2.38.0          
[10] Matrix_1.2-7.1             AnnotationDbi_1.34.4       IRanges_2.6.1            
[13] S4Vectors_0.10.3           Biobase_2.32.0             graph_1.50.0             
[16] BiocGenerics_0.18.0        ggplot2_2.1.0            

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.7            XVector_0.12.1         RColorBrewer_1.1-2   
 [4] plyr_1.8.4             zlibbioc_1.18.0        bitops_1.0-6         
 [7] tools_3.3.1            rpart_4.1-10           annotate_1.50.1      
[10] RSQLite_1.0.0          gtable_0.2.0           lattice_0.20-34      
[13] DBI_0.5-1              gridExtra_2.2.1        rJava_0.9-8          
[16] DESeq_1.24.0           cluster_2.0.5          genefilter_1.54.2    
[19] locfit_1.5-9.1         nnet_7.3-12            grid_3.3.1           
[22] data.table_1.9.6       GSEABase_1.34.1        BiocParallel_1.6.6   
[25] XML_3.98-1.4           RBGL_1.48.1            survival_2.39-5      
[28] foreign_0.8-67         latticeExtra_0.6-28    Formula_1.2-1        
[31] limma_3.28.21          GO.db_3.3.0            geneplotter_1.50.0   
[34] edgeR_3.14.0           Hmisc_3.17-4           scales_0.4.0         
[37] splines_3.3.1          AnnotationForge_1.14.2 xtable_1.8-2         
[40] colorspace_1.2-6       acepack_1.3-3.3        RCurl_1.95-4.8       
[43] munsell_0.4.3          chron_2.3-47         


Hi and thanks for your question. I think I'll need some additional information, as I'm not currently able to replicate your error. In particular, could you send me a minimal reproducible example that throws the same error that you're seeing so that I can work off that? Also, what version of HTSFilter are you currently using? The nbindiv argument is just defined as the number of columns in the (log normalized) data within the (unexported) .perConditionSimilaryIndex function, so it is indeed a weird error message.


I am using HTSFilter 1.12.0 (see edit below). I will update my post with the session info. Here is a modified version of the data that still throws the error on my system: http://pastebin.com/p2ZXAZVY

Using that data, I get the error when I run the following: filter <- HTSFilter(x = dat, conds = names(dat), normalization = "none")

Edit: I just upgraded Bioconductor and I am now running HTSFilter 1.14.0. The error still occurs.


I figured out the issue. One of my conditions has only a single library, with no replicates. When I remove this column from the data, HTSFilter is able to proceed. So I guess the question is: if I really need that library in my expression matrix, what can I do?
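
For anyone hitting the same message, here is a minimal base-R sketch of how to spot a condition with a single replicate. It also shows one plausible mechanism for the error, which has not been checked against the HTSFilter source: selecting a single column from a matrix or data frame drops it to a plain vector, and ncol() of a vector is NULL. The conds vector and the matrix m are made-up toy data.

```r
# Toy condition labels (hypothetical; substitute your own conds vector)
conds <- c("A", "A", "B", "B", "C")

# Count replicates per condition and list any condition with fewer than two
reps <- table(conds)
singletons <- names(reps)[reps < 2]
singletons

# Plausible mechanism (an assumption, not verified against the package code):
# selecting a single column drops the dimensions, so ncol() returns NULL
m <- matrix(runif(15), nrow = 3, dimnames = list(NULL, conds))
one_col <- m[, conds == "C"]
nbindiv <- ncol(one_col)

# Capture the error message instead of stopping the script
msg <- tryCatch(1:nbindiv, error = function(e) conditionMessage(e))
msg
```

Running table(conds) on the real condition vector before calling HTSFilter is a quick way to confirm that every condition has at least two libraries.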

andrea.rau ▴ 80
@andrearau-7032
Last seen 23 months ago
INRAE / Jouy en Josas, France

If you have one condition that has a single replicate, I would think there are two options you could try:

1) For the purposes of filtering via HTSFilter, you could group that singleton sample with the replicates from the closest (most similar) condition. This relabeling applies only to the filtering step, not to the actual differential analysis.

2) You could use an alternative filter. For example, you could remove genes whose mean normalized count across all samples falls below a specified value, which can easily be done with a call to the HTSBasicFilter function in the HTSFilter package. You could also use the independent filters proposed in the edgeR pipeline (a minimum CPM value in at least a given number of samples, see page 11 of their vignette) or in the DESeq2 pipeline (see the independentFiltering argument of their results function).
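
The two options above might be sketched roughly as follows in base R. Everything here is illustrative: the toy matrix x, the conds labels, and the threshold min_mean are made up, and "closest condition" is taken to mean the highest Pearson correlation on the log2 scale, which is just one reasonable choice. The HTSFilter call is left as a comment and simply mirrors the call from the question.

```r
set.seed(1)

# Toy normalized expression matrix: 200 genes x 5 samples (hypothetical data)
conds <- c("A", "A", "B", "B", "C")   # condition "C" has a single replicate
x <- matrix(rexp(200 * 5, rate = 0.05), ncol = 5,
            dimnames = list(paste0("gene", 1:200), paste0("s", 1:5)))

# Option 1: for filtering only, relabel each singleton with its closest condition.
# "Closest" here means the highest Pearson correlation between the singleton's
# log2(x + 1) profile and each replicated condition's mean profile.
lx <- log2(x + 1)
reps <- table(conds)
replicated <- names(reps)[reps >= 2]
profiles <- sapply(replicated, function(cn)
  rowMeans(lx[, conds == cn, drop = FALSE]))

filter_conds <- conds
for (j in which(conds %in% names(reps)[reps < 2])) {
  filter_conds[j] <- replicated[which.max(cor(lx[, j], profiles))]
}

# filter_conds is meant for the filtering step only, e.g.
# filter <- HTSFilter(x = x, conds = filter_conds, normalization = "none")
# The original conds should still be used for the differential analysis.

# Option 2: a simple mean-based filter across all samples
min_mean <- 10                        # hypothetical threshold
keep <- rowMeans(x) >= min_mean
x_filtered <- x[keep, , drop = FALSE]
```

Note the drop = FALSE arguments, which keep single-column or single-row selections as matrices rather than letting them collapse to vectors.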
