Question: calculating TSS enrichment from ATACseq data
0
gravatar for tangming2005
6 months ago by
tangming2005150
United States
tangming2005150 wrote:

How to calculate the TSS enrichment score according to the ENCODE standard https://www.encodeproject.org/data-standards/terms/#enrichment?

I was reading the source code https://github.com/jianhong/ATACseqQC/blob/master/R/TSSEscore.R#L80 but think it is not doing it correctly.

library(TxDb.Hsapiens.UCSC.hg19.knownGene)

txs <- transcripts(TxDb.Hsapiens.UCSC.hg19.knownGene)
sel.center <- promoters(txs, upstream =  1000 downstream = 1000)
sel.left.flank <- flank(sel.center, width=endFlank, both=FALSE)
sel.right.flank <- flank(sel.center, width=endFlank, start=FALSE, both = FALSE)

cvg<-cov(gr)

sel.center<- as(sel.center, "IntegerRangesList")
vws.center <- Views(cvg, sel.center)

vws.center$chr9
Views on a 140500037-length Rle subject

views:
           start       end width
   [1]     10987     12986  2000 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...]
   [2]     71698     73697  2000 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...]
   [3]    213865    215864  2000 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...]
   [4]    213865    215864  2000 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...]
   [5]    272048    274047  2000 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...]
   [6]    272048    274047  2000 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...]
   [7]    272048    274047  2000 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...]
   [8]    272048    274047  2000 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...]
   [9]    364591    366590  2000 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...]
   ...       ...       ...   ... ...
[2420] 140458452 140460451  2000 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...]
[2421] 140472388 140474387  2000 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...]
[2422] 140458607 140460606  2000 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...]
[2423] 140472388 140474387  2000 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...]
[2424] 140472388 140474387  2000 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...]
[2425] 140472374 140474373  2000 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...]
[2426] 140483938 140485937  2000 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...]
[2427] 140512309 140514308  2000 [ ]
[2428] 140786023 140788022  2000 [ ]

> vws.center$chr9[[2428]]
Error in getListElement(x, i, ...) : view is out of limits
> vws.center$chr9[[2426]]
integer-Rle of length 2000 with 3 runs
  Lengths: 969  78 953
  Values :   0   1   0
> vws.center$chr9[[2427]]
Error in getListElement(x, i, ...) : view is out of limits
> vws.center$chr9[[2425]]
integer-Rle of length 2000 with 7 runs
  Lengths: 782 186  11  13   7 173 828
  Values :   0   1   0   1   2   1   0

I want to get the matrix out and better to keep the tx id for each row (how should I do it?) but this gives me an error:

sapply(1:length(vws.center$chr9), function(i) as.numeric(vws.center$chr9[[i]]))
Error in getListElement(x, i, ...) : view is out of limits

essentially, I want a matrix of #tss x 2000, 2kb around the tss and each value is the number of read counts overlap with that base. (I also want to flip the rows of the matrix when the strand is -)

BTW, I have read https://support.bioconductor.org/p/56390/ very old thread and https://stat.ethz.ch/pipermail/bioc-devel/2016-June/009446.html

Thanks very much.

genomicranges views tss atacseq • 149 views
ADD COMMENTlink modified 6 months ago • written 6 months ago by tangming2005150

or if there is any package there for this purpose, I want to take a look at the source code and learn.

ADD REPLYlink written 6 months ago by tangming2005150

genomation has some functions https://bioconductor.org/packages/devel/bioc/vignettes/genomation/inst/doc/GenomationManual.html I am reading the source code. Thanks.

ADD REPLYlink written 6 months ago by tangming2005150
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 443 users visited in the last hour