Question: calculating TSS enrichment from ATACseq data
0
6 months ago by
tangming2005150
United States
tangming2005150 wrote:

How to calculate the TSS enrichment score according to the ENCODE standard https://www.encodeproject.org/data-standards/terms/#enrichment?

I was reading the source code https://github.com/jianhong/ATACseqQC/blob/master/R/TSSEscore.R#L80 but think it is not doing it correctly.

library(TxDb.Hsapiens.UCSC.hg19.knownGene)

txs <- transcripts(TxDb.Hsapiens.UCSC.hg19.knownGene)
sel.center <- promoters(txs, upstream =  1000 downstream = 1000)
sel.left.flank <- flank(sel.center, width=endFlank, both=FALSE)
sel.right.flank <- flank(sel.center, width=endFlank, start=FALSE, both = FALSE)

cvg<-cov(gr)

sel.center<- as(sel.center, "IntegerRangesList")
vws.center <- Views(cvg, sel.center)

vws.center$chr9 Views on a 140500037-length Rle subject views: start end width [1] 10987 12986 2000 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...] [2] 71698 73697 2000 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...] [3] 213865 215864 2000 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...] [4] 213865 215864 2000 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...] [5] 272048 274047 2000 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...] [6] 272048 274047 2000 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...] [7] 272048 274047 2000 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...] [8] 272048 274047 2000 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...] [9] 364591 366590 2000 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...] ... ... ... ... ... [2420] 140458452 140460451 2000 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...] [2421] 140472388 140474387 2000 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...] [2422] 140458607 140460606 2000 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...] [2423] 140472388 140474387 2000 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...] [2424] 140472388 140474387 2000 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...] [2425] 140472374 140474373 2000 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...] [2426] 140483938 140485937 2000 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...] [2427] 140512309 140514308 2000 [ ] [2428] 140786023 140788022 2000 [ ] > vws.center$chr9[[2428]]
Error in getListElement(x, i, ...) : view is out of limits
> vws.center$chr9[[2426]] integer-Rle of length 2000 with 3 runs Lengths: 969 78 953 Values : 0 1 0 > vws.center$chr9[[2427]]
Error in getListElement(x, i, ...) : view is out of limits
> vws.center$chr9[[2425]] integer-Rle of length 2000 with 7 runs Lengths: 782 186 11 13 7 173 828 Values : 0 1 0 1 2 1 0  I want to get the matrix out and better to keep the tx id for each row (how should I do it?) but this gives me an error: sapply(1:length(vws.center$chr9), function(i) as.numeric(vws.center\$chr9[[i]]))
Error in getListElement(x, i, ...) : view is out of limits


essentially, I want a matrix of #tss x 2000, 2kb around the tss and each value is the number of read counts overlap with that base. (I also want to flip the rows of the matrix when the strand is -)

Thanks very much.

genomicranges views tss atacseq • 149 views
modified 6 months ago • written 6 months ago by tangming2005150

or if there is any package there for this purpose, I want to take a look at the source code and learn.