error message when I use the bsseq read.bismark command to read a genome wide cytosine report generated by bismark.
3
1
Entering edit mode
@richardallenfriedmanbrooklyn-24118
Last seen 8 months ago
United States

Dear List,

I am trying to read a genome wide cytosine report generated by bismark into bsseq with the read.bismark command and am getting an error message. I am able to read a coverage report without difficulty. Here are the relevant portions of my Rout file.


> library(bsseq)
.
.
.
> sessionInfo()
R version 4.3.1 (2023-06-16)
Platform: x86_64-apple-darwin20 (64-bit)
Running under: macOS Monterey 12.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: America/New_York
tzcode source: internal

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] bsseq_1.36.0                SummarizedExperiment_1.30.2
 [3] Biobase_2.60.0              MatrixGenerics_1.12.3      
 [5] matrixStats_1.0.0           GenomicRanges_1.52.0       
 [7] GenomeInfoDb_1.36.1         IRanges_2.34.1             
 [9] S4Vectors_0.38.1            BiocGenerics_0.46.0        

loaded via a namespace (and not attached):
 [1] bitops_1.0-7              gtools_3.9.4             
 [3] lattice_0.21-8            sparseMatrixStats_1.12.2 
 [5] grid_4.3.1                R.oo_1.25.0              
 [7] Matrix_1.6-0              R.utils_2.12.2           
 [9] restfulr_0.0.15           limma_3.56.2             
[11] BSgenome_1.68.0           scales_1.2.1             
[13] permute_0.9-7             HDF5Array_1.28.1         
[15] XML_3.99-0.14             Biostrings_2.68.1        
[17] codetools_0.2-19          abind_1.4-5              
[19] cli_3.6.1                 rlang_1.1.1              
[21] crayon_1.5.2              XVector_0.40.0           
[23] R.methodsS3_1.8.2         munsell_0.5.0            
[25] DelayedArray_0.26.7       yaml_2.3.7               
[27] S4Arrays_1.0.5            tools_4.3.1              
[29] parallel_4.3.1            BiocParallel_1.34.2      
[31] colorspace_2.1-0          Rhdf5lib_1.22.0          
[33] locfit_1.5-9.8            GenomeInfoDbData_1.2.10  
[35] Rsamtools_2.16.0          R6_2.5.1                 
[37] BiocIO_1.10.0             lifecycle_1.0.3          
[39] rhdf5_2.44.0              rtracklayer_1.60.0       
[41] zlibbioc_1.46.0           data.table_1.14.8        
[43] Rcpp_1.0.11               GenomicAlignments_1.36.0 
[45] rhdf5filters_1.12.1       rjson_0.2.21             
[47] DelayedMatrixStats_1.22.1 compiler_4.3.1           
[49] RCurl_1.98-1.12          
> 
> infile="/Users/raf4/Documents/learning_statistics/learning_methylation/bsseq/4_readbismark/data/test_data_cytorep.txt.CX_report.sort.txt"
> length(infile)
[1] 1
> infile
[1] "/Users/raf4/Documents/learning_statistics/learning_methylation/bsseq/4_readbismark/data/test_data_cytorep.txt.CX_report.sort.txt"
> 
> tempdir="/Users/raf4/Documents/learning_statistics/learning_methylation/bsseq/4_readbismark/workingdir4/"
> 
> bsseq= read.bismark(file=infile,colData=NULL,rmZeroCov=FALSE,strandCollapse=TRUE,BACKEND="HDF5Array",
+ dir=tempdir,verbose=TRUE)
.
.
.
[read.bismark] Parsing files and constructing valid loci ...
Done in 729.3 secs
[read.bismark] Parsing files and constructing 'M' and 'Cov' matrices ...
Error in !bpok : invalid argument type
Calls: read.bismark -> .constructCounts
Execution halted

I would appreciate any suggests for overcoming this problem.

Thanks and best wishes,

Rich

Richard Friedman, Columbia University Cancer Center

bsseq • 1.1k views
ADD COMMENT
3
Entering edit mode
@james-w-macdonald-5106
Last seen 19 hours ago
United States

Looks like a bug to me. If you ask for an HDF5Array backend, it ends up running this bit of code

if (!all(bpok(counts))) {
        stop(".constructCounts() encountered errors for these files:\n  ",
             paste(files[!bpok], collapse = "\n  "))
    }

Where the counts item is a list generated using bplapply, and the bpok function is used check each one for any errors in reading the files. That part is fine. But then the return message results in an error, because !bpok isn't a thing. You can get that error using any function.

> library(BiocParallel)
> !bpok
Error in !bpok : invalid argument type

## but we can get that same error with any random function
> !ls
Error in !ls : invalid argument type

I imagine that bit of code should be something like

if (!all(bpok(counts))) {
        stop(".constructCounts() encountered errors for these files:\n  ",
             paste(files[!bpok(counts)], collapse = "\n  "))
    }

If Kasper Daniel Hansen doesn't respond to this post in the next few days, you should put in a bug report at his lab GitHub

1
Entering edit mode
@richardallenfriedmanbrooklyn-24118
Last seen 8 months ago
United States

Jim,

Thank you as always. If I have already posted on the github site, but I will link to your reply, which I am sure will help.

Best wishes, Rich

0
Entering edit mode

I'll follow up on GitHub when I have time, hopefully this week

ADD REPLY
0
Entering edit mode

Peter,

Thank you very much.

Best wishes,

Rich

1
Entering edit mode
Peter Hickey ▴ 740
@petehaitch
Last seen 8 days ago
WEHI, Melbourne, Australia

Just to close this off, the underlying issue was insufficient RAM to load in the particular example dataset coupled with an unhelpful error message or random crash depending on the system the code was run on. Details in https://github.com/hansenlab/bsseq/issues/125.

ADD COMMENT

Login before adding your answer.

Traffic: 567 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6