TFBSTools pvalues function hangs on certain inputs
@franceskrussell-7529
Canada

I am using the TFBSTools package to predict transcription factor binding sites in a set of sequences. I am trying to use the pvalues function on SiteSet objects so that I can filter out insignificant hits. This works for some SiteSet objects, but on others the function just hangs. I can't see any pattern in which objects cause it to hang and which don't. Here is a script that reproduces the problem (the function hangs on item 57 in the list):

library(TFBSTools)
library(JASPAR2014)
library(Biostrings)

# Get all PWMs from JASPAR database
opts <- list()
opts[["species"]] <- 9606 # human
opts[["all_versions"]] <- TRUE
opts[["matrixtype"]] <- "PWM"
pwmatrices <- getMatrixSet(JASPAR2014, opts)

seq <- "ATGGCATACAGTCGGTAAAGCGTAGTGCTTGAGAGCATGGATTATGGAGACTATTTAAATACTGGCTCCATAACTTAATAGCTTGGGACCCACGGTGTTACTTAGCTTCTATCTGCTTTAATTTACTCATCTGTAAACTTGGGATAAGATACTTCCTCATAAGGTTGGTGTGAAGACCAAGTGAATTAACTATCGTTTAAAGCACTTACAAAAGTGCCGGGCACCACCGAGATATGCATCCGTTAGCTTTTATTATTATTAGACTCAAAACACTGTAGTAGTTCTAATGAGAGGGGTAAGAATCAAAAATCCAGGCACCTGCATAGAGCCAGAGAGGCACACATAGAAGCAACGTAAGAGTGGAAGCGGAATGAAAACATGCTAAAGCCAGGTACAAGCCACAAGCGAGGGTCCACAGGAAGAAATTGTTAATTCTGAAGAGAGTGAATGCACGAAGTTACAGGAAAAATAACATCTGAACAGAGTTTAAGAATGAGCAGGACTTCAACAAGTGGCTAGTAAGACATAAGGAACCTACAAAAGATCTTAGCAAAGGCGCAAAGATTACCATCGTATTGCTCGTTTCTTCCTACTTTGCAGAAGTAACCTCTGGCGAACAGAGGTGGTTGCAGAGCATGCTTATCAAGCAAAATACCACGAAGCAGTAAGGAACGACAGAGATAACAGTAACAATAATAATTCACCCCAAGGTACTCAACTGGAAAAAGGAAATACAGAGGAGAGGTGTCGTTAAGAAAGCCAGGACGCACATCACGGCCCCGTCGCTGCACTACTCTCGTCTAGGGGTCAACAGTGGAGTCGAGACTCGAAGCTTCCACGCGGCGGAACAGCGTCCCTCTCAGGCGGCGAACGGGCTAGGGAAGCGCCCGGAGGAGACCTAGCGTGAGAACTACAACTCCCGCGGAGCCCGAGGGCGAGCTGCCTGCGTAACTTCCGCTTCCGCCACCTGCCCCTCTCACCCTCTTCACTCGAACCCTAC"
sequencename <- "M6PR"
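# Scan both strands of the sequence with every PWM, keeping hits that score at least 80%
sitesetList <- searchSeq(pwmatrices, seq, seqname=sequencename, min.score="80%", strand="*")
pvalues(sitesetList[[1]])
pvalues(sitesetList[[2]])
pvalues(sitesetList[[57]]) # this call appears to hang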

Here is the output of sessionInfo():

R version 3.1.2 (2014-10-31)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

locale:
[1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] TFBSTools_1.4.0

loaded via a namespace (and not attached):
 [1] base64enc_0.1-2            BatchJobs_1.5              BBmisc_1.9                
 [4] BiocGenerics_0.12.1        BiocParallel_1.0.3         Biostrings_2.34.1         
 [7] bitops_1.0-6               brew_1.0-6                 BSgenome_1.34.1           
[10] caTools_1.17.1             checkmate_1.5.1            CNEr_1.2.0                
[13] codetools_0.2-11           DBI_0.3.1                  digest_0.6.8              
[16] DirichletMultinomial_1.8.0 fail_1.2                   foreach_1.4.2             
[19] GenomeInfoDb_1.2.4         GenomicAlignments_1.2.2    GenomicRanges_1.18.4      
[22] grid_3.1.2                 gtools_3.4.1               IRanges_2.0.1             
[25] iterators_1.0.7            parallel_3.1.2             Rcpp_0.11.5               
[28] RCurl_1.95-4.5             Rsamtools_1.18.3           RSQLite_1.0.0             
[31] rtracklayer_1.26.2         S4Vectors_0.4.0            sendmailR_1.2-1           
[34] seqLogo_1.32.1             stats4_3.1.2               stringr_0.6.2             
[37] TFMPvalue_0.0.5            tools_3.1.2                XML_3.98-1.1              
[40] XVector_0.6.0              zlibbioc_1.12.0           

Ge Tan ▴ 20
@ge-tan-7918
Switzerland

Hi, sorry for the late reply. It doesn't hang, but it is indeed much slower for sitesetList[[57]]. It eventually returns 0.0003068776.

By default, pvalues uses the CRAN package TFMPvalue to calculate the p-value. I need to look into why it is so slow in certain cases.

If you wish, pvalues(sitesetList[[57]], type="sampling") will always finish in a reasonable time.
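
For example, here is a minimal sketch of that workaround. It assumes the sitesetList object built by the searchSeq() call in the script from the question is still in the workspace. The variable names slow_siteset, timing and p_sampling are just illustrative, and system.time() is only there to show how long the call takes.

```r
library(TFBSTools)

# Assumption: `sitesetList` already exists, created by the searchSeq()
# call in the script from the question.
slow_siteset <- sitesetList[[57]]

# Sampling-based p-values instead of the default TFMPvalue calculation.
# system.time() is base R and measures how long the call takes.
timing <- system.time(
  p_sampling <- pvalues(slow_siteset, type = "sampling")
)

print(timing[["elapsed"]])  # elapsed wall-clock time in seconds
print(p_sampling)           # p-value(s) for the hits in this SiteSet
```

Since the sampling approach produces an estimate, I would not expect it to match the value from the default method exactly.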

Ge
