Entering edit mode
Dear all,
I have an issue with the csaw
package and the filterWindowsLocal
function to call enriched signal on an ATAC-seq sample.
To make it short, I'm correcting the background for copy number effect, and then run the function as follow ;
## Read the input data
atac.counts <- windowCounts(in.bam, width=150, ext=1, param=param.atac)
neighbor <- suppressWarnings(resize(rowRanges(atac.counts), width=500, fix="center"))
wider <- regionCounts(in.bam, regions=neighbor, param=param.atac)
## correctCNV() is my own function with only multiply the assay() by a scaling factor
atac.cor <- correctCNV(atac.counts)
wider.cor <- correctCNV(wider)
## call peaks
filter.stat <- filterWindowsLocal(atac.cor, wider.cor)
Erreur : Negative counts not allowed
The point is that I have no negative values in these objects and all values are integer
;
> summary(assay(wider.cor))
V1
Min. : 3.00
1st Qu.: 34.00
Median : 47.00
Mean : 67.02
3rd Qu.: 69.00
Max. :8471.00
> summary(assay(atac.cor))
V1
Min. : 2.00
1st Qu.: 11.00
Median : 14.00
Mean : 18.86
3rd Qu.: 19.00
Max. :5015.00
Going further, I was able to reproduce the error with a subset of values ...
## works well
filter.stat <- filterWindowsLocal(atac.cor[30:35], wider[30:35])
## Error
> filter.stat <- filterWindowsLocal(atac.cor[35:40], wider[35:40])
Erreur : Negative counts not allowed
> str(assay(atac.cor[30:35]))
int [1:6, 1] 17 15 15 15 15 19
> str(assay(wider.cor[30:35]))
int [1:6, 1] 92 67 57 53 54 50
> str(assay(atac.cor[35:40]))
int [1:6, 1] 19 18 32 57 64 38
> str(assay(wider.cor[35:40]))
int [1:6, 1] 50 42 81 78 78 76
And I do not see here any difference which may explain the error ?
Any help/comment is welcome. Thank you very much
Nicolas
> sessionInfo()
R version 4.1.0 (2021-05-18)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)
Matrix products: default
BLAS: /bioinfo/local/build/Centos/R/R-4.1.0/lib64/R/lib/libRblas.so
LAPACK: /bioinfo/local/build/Centos/R/R-4.1.0/lib64/R/lib/libRlapack.so
locale:
[1] LC_CTYPE=fr_FR.UTF-8 LC_NUMERIC=C
[3] LC_TIME=fr_FR.UTF-8 LC_COLLATE=fr_FR.UTF-8
[5] LC_MONETARY=fr_FR.UTF-8 LC_MESSAGES=fr_FR.UTF-8
[7] LC_PAPER=fr_FR.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=fr_FR.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats4 stats graphics grDevices datasets utils
[8] methods base
other attached packages:
[1] DNAcopy_1.66.0 optparse_1.6.6
[3] WGSmapp_1.4.0 BSgenome.Hsapiens.UCSC.hg38_1.4.3
[5] BSgenome_1.60.0 Biostrings_2.60.2
[7] XVector_0.32.0 rtracklayer_1.52.1
[9] gridExtra_2.3 ggplot2_3.3.5
[11] edgeR_3.34.1 limma_3.48.3
[13] csaw_1.26.0 SummarizedExperiment_1.22.0
[15] Biobase_2.52.0 MatrixGenerics_1.4.3
[17] matrixStats_0.61.0 GenomicRanges_1.44.0
[19] GenomeInfoDb_1.28.4 IRanges_2.26.0
[21] S4Vectors_0.30.0 BiocGenerics_0.38.0
[23] renv_0.13.2
loaded via a namespace (and not attached):
[1] Rcpp_1.0.7 locfit_1.5-9.4 lattice_0.20-45
[4] Rsamtools_2.8.0 utf8_1.2.2 R6_2.5.1
[7] pillar_1.6.2 zlibbioc_1.38.0 rlang_0.4.11
[10] Matrix_1.3-4 BiocParallel_1.26.2 RCurl_1.98-1.5
[13] munsell_0.5.0 DelayedArray_0.18.0 compiler_4.1.0
[16] pkgconfig_2.0.3 tidyselect_1.1.1 tibble_3.1.4
[19] GenomeInfoDbData_1.2.6 metapod_1.0.0 XML_3.99-0.8
[22] fansi_0.5.0 crayon_1.4.1 dplyr_1.0.7
[25] withr_2.4.2 GenomicAlignments_1.28.0 bitops_1.0-7
[28] grid_4.1.0 gtable_0.3.0 lifecycle_1.0.0
[31] magrittr_2.0.1 scales_1.1.1 getopt_1.20.3
[34] ellipsis_0.3.2 generics_0.1.0 vctrs_0.3.8
[37] rjson_0.2.20 restfulr_0.0.13 tools_4.1.0
[40] glue_1.4.2 purrr_0.3.4 yaml_2.2.1
[43] colorspace_2.0-2 BiocManager_1.30.16 BiocIO_1.2.0
Sorry, I made a mistake in the previous post. Here is the correction ;
And now, we do see that the error is due to the fact that at a given position (39 here), the signal > background.
And as the
filterWindowsLocal
is running abackground - data
operation, it explains why I had negative values.To sum up,
filterWindowsLocal
only works if the background is > than the signal ... which makes sense as the background must be computed from larger regions. Sorry for the spam :)N