I'm carrying out part of a ChIP-Seq differential analysis for the first time.
I have 6 replicates and 3 conditions; in each condition the first 3 replicates have been run separately from the other 3.
I have a couple of questions:
1) In the analysis, the consensus profile is calculated like so:
peakset <- dba(sampleSheet="/homes/fgastaldello/sumodiff/sampleSheet.csv")
peakset <- dba.peakset(peakset, consensus=DBA_CONDITION, minOverlap=3)
Is it correct to assume that peaks are pooled together by condition and only those that satisfy the minOverlap parameter are included in the profile?
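To make question 1 concrete, here is a minimal base-R sketch of what I assume happens within a single condition. It is not DiffBind's actual implementation: the peak coordinates, the single-chromosome simplification, and the names `peaks`, `min_overlap` and `consensus_ids` are all made up for illustration.

```r
# Hypothetical peaks for ONE condition (single chromosome assumed):
# one row per called peak, tagged with the replicate it came from.
peaks <- data.frame(
  replicate = c(1, 2, 3, 1, 2, 3, 4),
  start     = c(100, 120, 110, 500, 520, 900, 905),
  end       = c(200, 220, 210, 600, 610, 950, 960)
)

# Pool the peaks and merge overlapping ones into candidate intervals.
peaks <- peaks[order(peaks$start), ]
running_end <- cummax(peaks$end)
# A new interval starts when a peak begins after everything seen so far has ended.
new_interval <- c(TRUE, peaks$start[-1] > running_end[-nrow(peaks)])
peaks$interval <- cumsum(new_interval)

# Count how many distinct replicates support each merged interval.
support <- tapply(peaks$replicate, peaks$interval,
                  function(r) length(unique(r)))

# Assumed meaning of minOverlap: keep intervals seen in at least this many replicates.
min_overlap <- 3
consensus_ids <- names(support)[support >= min_overlap]
print(support)
print(consensus_ids)
```

If my assumption is right, only the first merged interval (supported by replicates 1, 2 and 3) would end up in the consensus for this condition, while the other two (supported by two replicates each) would be dropped.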
2) After generating the counts using the standard score for the dba.count function, a mask is created to separate conditions and the first 3 replicates from the other 3. A contrast is generated and the differential analysis is performed like so:
batch.mask <- dba.mask(peakset,DBA_REPLICATE, c(1,2,3))
peakset <- dba.contrast(peakset, categories=DBA_CONDITION, block=batch.mask)
peaks.de <- dba.analyze(peakset,bParallel = FALSE,filter=?,filterFun=??)
I would like to remove those intervals/peaks whose counts are lower than filter. From what I understood from the documentation and other posts, filterFun is applied to the counts of each interval, and any interval whose result is less than filter is discarded.
But is the filter applied across all samples at once, or are the intervals/peaks evaluated separately by block, or by condition as specified in the contrast?
filterFun can be customized, but is there any example that I can use as a reference?
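To show what I have in mind, here is a minimal base-R sketch of the filtering rule as I currently understand it. It does not call DiffBind: the count matrix, the threshold, and the helpers `keep_interval` and `third_largest` are hypothetical, and whether DiffBind applies the function over all samples or per block/condition is exactly what I am unsure about.

```r
# Hypothetical read counts: rows are intervals, columns are the 6 replicates.
counts <- matrix(
  c(12,  9, 15, 11, 14, 10,   # well covered in every replicate
     0,  1, 40,  0,  2,  1,   # one high replicate only
     3,  2,  4,  1,  0,  2),  # low everywhere
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("interval", 1:3), paste0("rep", 1:6))
)

filter_threshold <- 5

# Assumed rule: an interval is kept when filterFun(its counts) >= filter.
keep_interval <- function(counts, filter, filterFun) {
  apply(counts, 1, filterFun) >= filter
}

# Using max: one sample reaching the threshold is enough to keep the interval.
keep_max <- keep_interval(counts, filter_threshold, max)

# Hypothetical custom function: the third-largest count, so at least
# three samples must reach the threshold for the interval to be kept.
third_largest <- function(x) sort(x, decreasing = TRUE)[3]
keep_third <- keep_interval(counts, filter_threshold, third_largest)

print(keep_max)
print(keep_third)
```

With these made-up numbers, max keeps interval1 and interval2, while the stricter third_largest keeps only interval1. Is something like third_largest a valid value for filterFun?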
> sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux

locale:
LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8 LC_MONETARY=en_GB.UTF-8
LC_MESSAGES=en_GB.UTF-8 LC_PAPER=en_GB.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C

attached base packages:
grid parallel stats4 stats graphics grDevices utils datasets methods base

other attached packages:
DBChIP_1.18.0 data.table_1.10.4 cowplot_0.9.1 ggplot2_2.2.1 kableExtra_0.6.0
DESeq_1.26.0 lattice_0.20-34 locfit_1.5-9.1 edgeR_3.16.5 limma_3.30.13
knitr_1.15.1 BiocInstaller_1.24.0 plyr_1.8.4 GenomicFeatures_1.26.3 AnnotationDbi_1.36.2
biomaRt_2.30.0 Gviz_1.18.2 DiffBind_2.2.9 SummarizedExperiment_1.4.0 Biobase_2.34.0
GenomicRanges_1.26.4 GenomeInfoDb_1.10.3 IRanges_2.8.2 S4Vectors_0.12.2 BiocGenerics_0.20.0

loaded via a namespace (and not attached):
amap_0.8-14 colorspace_1.3-2 rjson_0.2.15 hwriter_1.3.2
rprojroot_1.2 biovizBase_1.22.0 htmlTable_1.9 XVector_0.14.1
base64enc_0.1-3 dichromat_2.0-0 interactiveDisplayBase_1.12.0 xml2_1.1.1
splines_3.3.1 fail_1.3 geneplotter_1.52.0 Formula_1.2-1
Rsamtools_1.26.1 annotate_1.52.1 cluster_2.0.6 GO.db_3.4.0
pheatmap_1.0.8 graph_1.52.0 shiny_1.0.0 readr_1.1.1
httr_1.2.1 GOstats_2.40.0 backports_1.0.5 assertthat_0.1
Matrix_1.2-8 lazyeval_0.2.0 acepack_1.4.1 htmltools_0.3.5
tools_3.3.1 gtable_0.2.0 Category_2.40.0 systemPipeR_1.8.1
dplyr_0.5.0 ShortRead_1.32.1 Rcpp_0.12.10 Biostrings_2.42.1
gdata_2.17.0 rtracklayer_1.34.2 stringr_1.2.0 rvest_0.3.2
mime_0.5 ensembldb_1.6.2 gtools_3.5.0 XML_3.98-1.5
AnnotationHub_2.6.5 zlibbioc_1.20.0 scales_0.4.1 BSgenome_1.42.0
VariantAnnotation_1.20.3 hms_0.3 RBGL_1.50.0 RColorBrewer_1.1-2
BBmisc_1.11 yaml_2.1.14 memoise_1.0.0 gridExtra_2.2.1
rpart_4.1-10 latticeExtra_0.6-28 stringi_1.1.2 RSQLite_1.1-2
genefilter_1.56.0 checkmate_1.8.2 caTools_1.17.1 BiocParallel_1.8.1
BatchJobs_1.6 matrixStats_0.51.0 bitops_1.0-6 evaluate_0.10
GenomicAlignments_1.10.1 htmlwidgets_0.8 GSEABase_1.36.0 AnnotationForge_1.16.1
magrittr_1.5 sendmailR_1.2-1 R6_2.2.0 gplots_3.0.1
Hmisc_4.0-2 DBI_0.6 foreign_0.8-67 survival_2.41-2
RCurl_1.95-4.8 nnet_7.3-12 tibble_1.2 KernSmooth_2.23-15
rmarkdown_1.6 digest_0.6.12 xtable_1.8-2 httpuv_1.3.3
brew_1.0-6 munsell_0.4.3 viridisLite_0.1.3