Hi, I have manually specified certain regions upstream of TSSs, and I am trying to normalize the read counts in those regions using csaw. I am currently using the code below to get the read counts from the BAM file, but whenever I run regionCounts with these parameters I get only zero counts in my assay matrix, and I am wondering why this is the case. Another approach I have used to get normalized TMM values for the upstream regions is to count reads with the featureCounts program and then normalize them using DESeq TMM normalization. In that case, however, the total count would be only the reads falling inside the upstream regions, not the number of reads in the whole library. I was wondering whether that would be the right approach to use rather than trying to stick with csaw.
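To make the library-size concern concrete, here is a minimal base-R sketch. All numbers are made up for illustration (they are not from my data), and the `rpm` helper and the whole-library totals are hypothetical. It uses simple reads-per-million scaling rather than TMM, only to show how the choice of denominator changes the result.

```r
# Toy counts for three upstream regions in two samples (made-up numbers).
# Sample 2 has exactly twice the signal of sample 1 in every region.
counts <- matrix(c(120, 30, 50,
                   240, 60, 100),
                 ncol = 2,
                 dimnames = list(c("regionA", "regionB", "regionC"),
                                 c("sample1", "sample2")))

# Option 1: library size = reads counted inside the regions only.
lib_in_regions <- colSums(counts)

# Option 2: library size = all mapped reads per BAM file
# (hypothetical totals, assumed equal for both samples).
lib_whole <- c(sample1 = 2e6, sample2 = 2e6)

# Reads per million: divide each column by its library size, scale by 1e6.
rpm <- function(counts, lib_size) {
  sweep(counts, 2, lib_size, "/") * 1e6
}

rpm_in_regions <- rpm(counts, lib_in_regions)
rpm_whole <- rpm(counts, lib_whole)

print(rpm_in_regions)
print(rpm_whole)
```

In this toy case, scaling by the in-region totals makes the two samples look identical, while scaling by the whole-library totals keeps the two-fold difference. This is the behaviour I am unsure about when only the counts inside the upstream regions are used as the total.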
Below is the code that gives zero counts with csaw.
library(csaw)
bam.files <- c("wgEncodeLicrHistoneLiverH3k27acMAdult8wksC57bl6StdAlnRep1.bam")
TSS1000_mouse <- read.table("TSSprofile_mouse1000.txt",sep="\t",header=F)
colnames(TSS1000_mouse) <- c("chr","start","end","strand","ensembl_id")
TSS1000_mouse_granges <- makeGRangesFromDataFrame(TSS1000_mouse)
param <- readParam(minq=50)
frag.len <- 36
reg.counts <- regionCounts(bam.files,TSS1000_mouse_granges,ext=frag.len,param=param)
head(assay(reg.counts))
assay(reg.counts)
tail(assay(reg.counts))
I have checked that assay(reg.counts) contains no non-zero values. Am I doing something wrong here, or are there parameter values I need to change? For the same dataset, featureCounts with default parameters gives me counts that I can further normalize using DESeq. (Kindly let me know whether this would be a correct approach so that I can go ahead with it.)
PS: I am using ENCODE ChIP-seq data and making comparisons with other species in the TSS regions. I have tried approaches like RPM to normalize the counts upstream of the TSS (window sizes of 200 bp, 500 bp and 1 kb), and now I am trying to see whether TMM would yield different results.