Dear BioC users,
I am currently using
csaw to analyse my ChIP-seq data. For a given histone modification, I have 3 conditions, each with two biological replicates, and a matched input control library. Before going through the differential binding analysis, I would like to require a 3-fold or greater increase in abundance over the matched control to retain a window. According to the vignette, it is pretty clear how to process when you deal with a common input for all ChIPped libraries :
Let's say that the vector bam.files contains the names of 4 .bam files related to 4 ChIPped libraries, and that the name of the .bam related to the input control for all of these libraries is input.bam.
in.demo <- windowCounts(c(bam.files, "input.bam"), ext=frag.len, param=param) chip <- in.demo[,1:4] control <- in.demo[,5] in.binned <- windowCounts(c(bam.files, "input.bam"), bin=TRUE, width=10000, param=param) chip.binned <- in.binned[,1:4] control.binned <- in.binned[,5] filter.stat <- filterWindows(chip, control, type="control", prior.count=5) keep <- filter.stat$filter > log2(3)
My problem is that there is no specific advice to deal with as many inputs as ChIPped libraries.
Is someone has had a similar question and could propose a solution to this issue ?
So far I have tried the following :
in.demo <- windowCounts(c(chipped.files, input.files), ext=frag.len, param=param) chip <- in.demo[,1:4] control <- in.demo[,5:8] in.binned <- windowCounts(c(chipped.files, input.files), bin=TRUE, width=10000, param=param) chip.binned <- in.binned[,1:4] control.binned <- in.binned[,5:8] filter.stat <- filterWindows(chip, control, type="control", prior.count=5)
Which returns :
Error in getWidths(background) : need to specify read lengths in 'data$rlen'
Thanks a lot for your advice !
R version 3.2.1 (2015-06-18)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.10.5 (Yosemite)
attached base packages:
 stats4 parallel stats graphics grDevices utils datasets methods base
other attached packages:
 csaw_1.2.1 Rsamtools_1.20.5 SeqGL_1.1.2 WGCNA_1.47
 RSQLite_1.0.0 DBI_0.3.1 fastcluster_1.1.16 dynamicTreeCut_1.62
 spams_2.5 lattice_0.20-33 ChIPKernels_1.1 Matrix_1.2-2
 kernlab_0.9-22 sfsmisc_1.0-28 gtools_3.5.0 BSgenome_1.36.3
 rtracklayer_1.28.10 GenomicRanges_1.20.8 GenomeInfoDb_1.4.3 Biostrings_2.36.4
 XVector_0.8.0 IRanges_2.2.9 S4Vectors_0.6.6 BiocGenerics_0.14.0
loaded via a namespace (and not attached):
 Rcpp_0.12.1 GO.db_3.1.2 digest_0.6.8 foreach_1.4.3
 plyr_1.8.3 futile.options_1.0.0 acepack_1.3-3.3 ggplot2_1.0.1
 zlibbioc_1.14.0 GenomicFeatures_1.20.6 rpart_4.1-10 preprocessCore_1.30.0
 proto_0.3-10 splines_3.2.1 BiocParallel_1.2.22 stringr_1.0.0
 foreign_0.8-66 biomaRt_2.24.1 RCurl_1.95-4.7 munsell_0.4.2
 nnet_7.3-11 gridExtra_2.0.0 edgeR_3.10.5 Hmisc_3.17-0
 codetools_0.2-14 matrixStats_0.14.2 XML_3.98-1.3 reshape_0.8.5
 GenomicAlignments_1.4.2 MASS_7.3-44 bitops_1.0-6 grid_3.2.1
 gtable_0.1.2 magrittr_1.5 scales_0.3.0 stringi_0.5-5
 impute_1.42.0 reshape2_1.4.1 doParallel_1.0.10 limma_3.24.15
 latticeExtra_0.6-26 futile.logger_1.4.1 Formula_1.2-1 lambda.r_1.1.7
 RColorBrewer_1.1-2 iterators_1.0.8 tools_3.2.1 Biobase_2.28.0
 survival_2.38-3 AnnotationDbi_1.30.1 colorspace_1.2-6 cluster_2.0.3